Bibliography Recommendations: Data Science
Gregory Piatetsky-Shapiro (Analytics, Data Mining, Data Science Expert, KDnuggets President) in More Free Data Mining, Data Science Books and Resources
The list below is based on the list compiled by Pedro Martins, but we added the book authors and year, sorted alphabetically by title, fixed spelling, and removed the links that did not work.
- An Introduction to Data Science by Jeffrey Stanton, Robert De Graaf, 2013.
An introductory-level resource developed by Syracuse University.
- An Introduction to Statistical Learning: with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, 2013.
An overview of statistical learning based on large datasets. Exploratory techniques for the data are discussed using the R programming language.
- A Programmer’s Guide to Data Mining by Ron Zacharski, 2012.
A guide to data mining concepts from a programming point of view. It provides several hands-on problems to practice and test the subjects taught in this online book.
- Bayesian Reasoning and Machine Learning by David Barber, 2012.
Covers Bayesian reasoning, focusing on applying it to machine learning algorithms and processes. It is a hands-on resource, well suited to absorbing the material in the book.
- Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners by Jared Dean, 2014.
This resource explores the reality of big data and its benefits from a marketing point of view. It also explains how to store this kind of data and describes algorithms to process it, based on data mining and machine learning.
- Data Mining and Analysis: Fundamental Concepts and Algorithms by Mohammed J. Zaki, Wagner Meira, Jr., Cambridge University Press, May 2014.
Thorough coverage of exploratory data mining algorithms and machine learning processes. These explanations are complemented by some statistical analysis.
- Data Mining and Business Analytics with R by Johannes Ledolter, 2013.
Another R-based book describing the processes and implementations used to explore, transform, and store information. It also focuses on the concept of business analytics.
- Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management by Michael J.A. Berry, Gordon S. Linoff, 2004.
A data mining book oriented specifically toward marketing and business management, with strong case studies that show how to apply these techniques in the real world.
- Data Mining with Rattle and R: The Art of Excavating Data for Knowledge Discovery by Graham Williams, 2011.
The objective of this book is to provide extensive information on data manipulation. It focuses on the Rattle toolkit and the R language to demonstrate the implementation of these techniques.
- Gaussian Processes for Machine Learning by Carl Edward Rasmussen and Christopher K. I. Williams, 2006.
This is a theoretical book approaching learning algorithms based on probabilistic Gaussian processes. It’s about supervised learning problems, describing models and solutions related to machine learning.
Read the full post on KDnuggets: http://www.kdnuggets.com/2015/03/free-data-mining-data-science-books-resources.html
Very interesting compilation published here, with a strong machine learning flavor (maybe machine learning book authors — usually academics — are more prone to making their books available for free). Many are O’Reilly books freely available. Here we display those most relevant to data science. I haven’t checked all the sources, but they seem legit. If you find some issue, let us know in the comment section below. Note that at DSC, we also have our free books:
- Data Science by Analyticbridge (internal to DSC, one of the first books about data science)
- Data Science 2.0 (internal to DSC, check the red-starred articles)
- 27 free data mining books
There are several sections in the listing in question:
- Data Science Overviews (4 books)
- Data Scientists Interviews (2 books)
- How To Build Data Science Teams (3 books)
- Data Analysis (1 book)
- Distributed Computing Tools (2 books)
- Data Mining and Machine Learning (29 books)
- Statistics and Statistical Learning (5 books)
- Data Visualization (2 books)
- Big Data (3 books)
Here we mention #1, #5 and #6:
Data Science Overviews
- An Introduction to Data Science (Jeffrey Stanton, 2013)
- School of Data Handbook (2015)
- Data Jujitsu: The Art of Turning Data into Product (DJ Patil, 2012)
- Art of Data Science (Roger D. Peng & Elizabeth Matsui, 2015)
Distributed Computing Tools
- Hadoop: The Definitive Guide (Tom White, 2011)
- Data-Intensive Text Processing with MapReduce (Jimmy Lin & Chris Dyer, 2010)
Data Mining and Machine Learning
- Introduction to Machine Learning (Amnon Shashua, 2008)
- Machine Learning (Abdelhamid Mellouk & Abdennacer Chebira)
- Machine Learning — The Complete Guide (Wikipedia)
- Social Media Mining: An Introduction (Reza Zafarani, Mohammad Ali Abbasi, & Huan Liu, 2014)
- Data Mining: Practical Machine Learning Tools and Techniques (Ian H. Witten & Eibe Frank, 2005)
- Mining of Massive Datasets (Jure Leskovec, Anand Rajaraman, & Jeff Ullman, 2014)
- A Programmer’s Guide to Data Mining (Ron Zacharski, 2015)
- Data Mining with Rattle and R (Graham Williams, 2011)
- Data Mining and Analysis: Fundamental Concepts and Algorithms (Mohammed J. Zaki & Wagner Meira Jr., 2014)
- Mining the Social Web: Data Mining Facebook, Twitter, LinkedIn, Goo… (Matthew A. Russell, 2014)
- Probabilistic Programming & Bayesian Methods for Hackers (Cam Davidson-Pilon, 2015)
- Data Mining Techniques For Marketing, Sales, and Customer Relations… (Michael J.A. Berry & Gordon S. Linoff, 2004)
- Inductive Logic Programming: Techniques and Applications (Nada Lavrac & Saso Dzeroski, 1994)
- Pattern Recognition and Machine Learning (Christopher M. Bishop, 2006)
- Machine Learning, Neural and Statistical Classification (D. Michie, D.J. Spiegelhalter, & C.C. Taylor, 1999)
- Information Theory, Inference, and Learning Algorithms (David J.C. MacKay, 2005)
- Data Mining and Business Analytics with R (Johannes Ledolter, 2013)
- Bayesian Reasoning and Machine Learning (David Barber, 2014)
- Gaussian Processes for Machine Learning (C. E. Rasmussen & C. K. I. Williams, 2006)
- Reinforcement Learning: An Introduction (Richard S. Sutton & Andrew G. Barto, 2012)
- Algorithms for Reinforcement Learning (Csaba Szepesvari, 2009)
- Big Data, Data Mining, and Machine Learning (Jared Dean, 2014)
- Modeling With Data (Ben Klemens, 2008)
- KB — Neural Data Mining with Python Sources (Roberto Bello, 2013)
- Deep Learning (Yoshua Bengio, Ian J. Goodfellow, & Aaron Courville, 2015)
- Neural Networks and Deep Learning (Michael Nielsen, 2015)
- Data Mining Algorithms In R (Wikibooks, 2014)
- Data Mining and Analysis: Fundamental Concepts and Algorithms (Mohammed J. Zaki & Wagner Meira Jr., 2014)
- Theory and Applications for Advanced Text Mining (Shigeaki Sakurai, 2012)