“If I have seen further than others, it is by standing upon the shoulders of giants.” - Isaac Newton

My Go-To List of Machine Learning & Data Science Resources

I am often asked what resources I recommend for those looking to get into machine learning, whether they want to be a data scientist or an ML engineer. In this delve I’ll cover the go-to resources I continue to rely on whenever I need to refresh my own knowledge or dig deeper into a specific subject.

Books

I have found that books are an excellent way for me to absorb knowledge. I still enjoy having a physical bookshelf I can refer to, books I can add sticky notes to, and physical copies I can lend out to others. If you enjoy learning from books, this is my go-to list, in no particular order:

  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron - This is easily the best reference book I have come across for learning the basic (and advanced!) concepts of machine learning, equivalent to several semesters of graduate-level coursework. The first half of the book covers “classical” machine learning methods such as linear models, SVMs, and trees, while the second half is fully devoted to neural networks. I have run several book clubs with this book and have had great success using it as reference material whenever I need a refresher on a particular concept. My only minor criticism is that the second half uses TensorFlow rather than PyTorch (which I have seen the industry trending toward over the last few years) for implementing neural networks. However, all of the concepts transfer easily between frameworks.

  • Practical DataOps: Delivering Agile Data Science at Scale by Harvinder Atwal - If you are interested in learning more about the process of using machine learning as part of a product, and the software delivery management best practices involved, look no further than this book! Whether you are an individual contributor or the director of several teams, there is something here for every practitioner in the ML space. Whenever I join a new organization, this is the first book I recommend to the leadership team. I have not found anything else that comes as close to perfectly describing the challenges involved in putting machine learning into products and how to overcome them.

  • Feature Engineering for Machine Learning by Alice Zheng & Amanda Casari - A more specialized book, best read once you have a grasp of machine learning fundamentals, and a great reference for specific feature engineering techniques. Whatever type of data I am working with, be it numerical, categorical, text, or images, this book covers the appropriate feature engineering techniques.

  • The Lean Startup by Eric Ries - Classic book on how to correctly ideate through an innovation cycle, find out what works, and pivot when needed. It is not specifically an ML book, but its concepts of trying new ideas, measuring their impact, and adjusting are just as applicable when testing out new ML models.

Blogs & Newsletters

Blogs and newsletters are some of the best ways I’ve found to stay up to date on the latest developments, research, and industry trends. Here are some of my favorites to follow:

  • DataDelver - It’s my hope that this blog becomes an invaluable resource for the world of MLOps, as informative as some of the other blogs on this list!

  • ruder.io - Sebastian Ruder’s excellent blog/newsletter with a focus on Natural Language Processing (NLP) research. My go-to source for the latest NLP research developments.

  • Pycoders Weekly - Not specifically focused on machine learning, but an excellent weekly newsletter on all things Python. I’ve found quite a few nuggets of knowledge and useful packages by following this newsletter!

  • Zillow - The Zillow AI blog is an excellent resource for examining how machine learning is applied at a large organization, particularly in the realm of recommendation and personalization. This post in particular is one of my favorites!

  • Airbnb - The Airbnb blog is another excellent resource for gaining an industry perspective on using machine learning. I always enjoyed this post on computer vision!

Courses

I have not relied on many online courses in my career, instead preferring books or blogs, but here are the ones I have personally gone through and gotten value from:

  • Building Recommender Systems with Machine Learning and AI - Excellent foundational course on building recommendation systems. Covers the traditional approaches for building recommendation systems such as content-based filtering, collaborative filtering, and matrix factorization. Also covers deep learning approaches. Importantly, it spends a great deal of time on how to properly evaluate recommendation systems. Lots of great content!

Reddit

Finally, here are some of the subreddits I follow to stay up to date:

  • r/MachineLearning - Generalist subreddit on all things machine learning, with a particular focus on research.

  • r/datascience - Generalist subreddit on all things data science, with less emphasis on research and a more beginner-friendly tone.

  • r/LanguageTechnology - Subreddit focused on natural language processing. If you need to process text, this one’s for you!

  • r/StableDiffusion - If you want to get started using GenAI models to create your own images, start here! You can also see what other people have been able to generate!

  • r/dataisbeautiful - Great resource for learning techniques for one of the trickiest parts of machine learning: visualizing data in a way people can understand.

Conclusion

That’s my list of resources. I hope some of them are useful to you all on your own delves! Thank you to all of the people who contribute to them; I know they’ve certainly made my own journey much easier! With this delve I close out 2023, and I look forward to many more delves in 2024!

Delve Data

  • There are lots of great resources out there for learning about machine learning, data science, and MLOps, many of which are free.
  • I hope this blog becomes such a resource for you!
  • Stay tuned for more delves in 2024!