machinelearning.to

Data Collection: Questions to Ask

What is “good” data?

  • Labels are defined consistently (the definition of each label y is unambiguous)
  • Covers the important cases (good coverage of the inputs x)
  • Includes timely feedback from production data (the distribution reflects data drift and concept drift)
  • Is sized appropriately
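
The first three criteria above can be turned into rough automated checks. The sketch below is one possible way to do that, not a standard method: the function names, the `case_of` helper, and the (input, label) tuple format are all assumptions made for illustration. The label-shift score only compares label frequencies, so it is a crude warning signal and does not by itself distinguish data drift from concept drift.

```python
from collections import Counter, defaultdict


def conflicting_labels(examples):
    """Return inputs x that appear with more than one label y.

    `examples` is assumed to be an iterable of (x, y) pairs with hashable x.
    """
    labels_by_input = defaultdict(set)
    for x, y in examples:
        labels_by_input[x].add(y)
    return {x: sorted(ys) for x, ys in labels_by_input.items() if len(ys) > 1}


def missing_cases(examples, required_cases, case_of):
    """Return the required cases that have no example at all.

    `case_of` is a hypothetical helper that maps an input x to its case name.
    """
    seen = {case_of(x) for x, _ in examples}
    return sorted(set(required_cases) - seen)


def label_shift(train_examples, prod_examples):
    """Total variation distance between two label distributions.

    0.0 means identical label frequencies, 1.0 means no labels in common.
    """
    def distribution(examples):
        counts = Counter(y for _, y in examples)
        total = sum(counts.values())
        if total == 0:
            return {}
        return {label: n / total for label, n in counts.items()}

    p = distribution(train_examples)
    q = distribution(prod_examples)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


if __name__ == "__main__":
    # Made-up toy data, purely for illustration.
    train = [
        ("scratch on screen", "defect"),
        ("scratch on screen", "ok"),  # same input, different label
        ("dent on case", "defect"),
        ("clean phone", "ok"),
    ]
    prod = [
        ("clean phone", "ok"),
        ("clean phone", "ok"),
        ("dent on case", "defect"),
        ("clean phone", "ok"),
    ]
    print(conflicting_labels(train))
    print(missing_cases(train, ["scratch", "dent", "crack"], lambda x: x.split()[0]))
    print(label_shift(train, prod))
```

Whether a dataset is sized appropriately depends on the problem and the model, so it is left out of this sketch.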

What kind of problem are we trying to solve?

What data sources already exist?

What privacy concerns are there?

Is the data public?

Where should we store the data?

©2025 machinelearning.to