Introduction to Data Analysis
This chapter introduces some basic techniques to analyze data, including supervised approaches (classification using Bayes, prediction using regression) and unsupervised approaches (clustering, kNN algorithms, and association rules). Dealing with text and
Antonio Badia
SQL for Data Science: Data Cleaning, Wrangling and Analytics with Relational Databases
Data-Centric Systems and Applications

Series Editors
Michael J. Carey, University of California, Irvine, CA, USA
Stefano Ceri, Politecnico di Milano, Milano, Italy

Editorial Board Members
Anastasia Ailamaki, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Shivnath Babu, Duke University, Durham, NC, USA
Philip A. Bernstein, Microsoft Corporation, Redmond, WA, USA
Johann-Christoph Freytag, Humboldt Universität zu Berlin, Berlin, Germany
Alon Halevy, Facebook, Menlo Park, CA, USA
Jiawei Han, University of Illinois, Urbana, IL, USA
Donald Kossmann, Microsoft Research Laboratory, Redmond, WA, USA
Gerhard Weikum, Max-Planck-Institut für Informatik, Saarbrücken, Germany
Kyu-Young Whang, Korea Advanced Institute of Science & Technology, Daejeon, Korea (Republic of)
Jeffrey Xu Yu, Chinese University of Hong Kong, Shatin, Hong Kong
Intelligent data management is the backbone of all information processing and has hence been one of the core topics in computer science from its very start. This series is intended to offer an international platform for the timely publication of all topics relevant to the development of data-centric systems and applications. All books show a strong practical or application relevance as well as a thorough scientific basis. They are therefore of particular interest to both researchers and professionals wishing to acquire detailed knowledge about concepts of which they need to make intelligent use when designing advanced solutions for their own problems.

Special emphasis is laid upon:
• Scientifically solid and detailed explanations of practically relevant concepts and techniques (what does it do)
• Detailed explanations of the practical relevance and importance of concepts and techniques (why do we need it)
• Detailed explanation of gaps between theory and practice (why it does not work)

According to this focus of the series, submissions of advanced textbooks or books for advanced professional use are encouraged; these should preferably be authored books or monographs, but coherently edited, multi-author books are also envisaged (e.g. for emerging topics). On the other hand, overly technical topics (like physical data access, data compression etc.), latest research results that still need validation through the research community, or mostly product-related information for practitioners (“how to use Oracle 9i efficiently”) are not encouraged.
More information about this series at http://www.springer.com/series/5258
Antonio Badia, Computer Engineering & Computer Science, University of Louisville, Louisville, KY, USA
ISSN 2197-9723
ISSN 2197-974X (electronic)
Data-Centric Systems and Applications
ISBN 978-3-030-57591-5
ISBN 978-3-030-57592-2 (eBook)
https://doi.org/10.1007/978-3-030-57592-2
© Springer Nature Switzerland AG 2020
This work is subject to copyright. All rights a