Concept

MACHINE LEARNING

Machine learning explained: what it is, how it differs from traditional programming, and why it's the engine behind modern AI.

Machine learning inverts the traditional software development paradigm. Instead of a programmer specifying rules and a computer following them, the programmer specifies a learning objective and the computer discovers the rules from data.
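To make the inversion concrete, here is a minimal sketch rather than a recommended method. The temperature data, the function names, and the "minimise misclassifications" objective are all assumptions chosen for illustration. The first function is a rule a programmer wrote by hand; the second is given only labeled examples and an objective, and finds the rule itself.

```python
def handwritten_rule(temp_c: float) -> bool:
    """Traditional programming: the programmer chooses the threshold."""
    return temp_c > 30.0


def learn_threshold(examples):
    """Pick the threshold that misclassifies the fewest labeled examples.

    `examples` is a list of (value, label) pairs; this is a toy stand-in
    for a learning algorithm, not a production classifier.
    """
    values = sorted(x for x, _ in examples)
    # Candidate thresholds: midpoints between consecutive observed values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum((x > t) != label for x, label in examples)

    return min(candidates, key=errors)


# Hypothetical labeled data: (temperature in Celsius, "is it hot?").
examples = [
    (18.0, False), (22.0, False), (24.0, False),
    (27.0, True), (31.0, True), (35.0, True),
]

threshold = learn_threshold(examples)


def learned_rule(temp_c: float) -> bool:
    """The rule discovered from data instead of written by hand."""
    return temp_c > threshold
```

On this data the hand-written rule and the learned rule disagree about 27 degrees: the programmer's guess of 30 was never checked against examples, while the learned threshold is whatever the data supports.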

The three main learning paradigms are supervised learning (learning from labeled examples), unsupervised learning (finding structure in unlabeled data), and reinforcement learning (learning through trial, error, and reward). Deep learning, which uses neural networks with many layers, has become the dominant technique because it scales well with data and compute.

Scaling behavior is one of the central empirical findings of the past decade: across a wide range of tasks, ML model performance improves predictably as compute, data, and model size increase. The scaling hypothesis is the bet that this trend will continue. The finding, arguably more than any single algorithmic breakthrough, explains why AI capabilities have advanced so rapidly, and why the companies with the most resources have the largest capability advantages.
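One common way to make "improves predictably" concrete is to fit a power law of the form loss = a * compute^(-b), which is a straight line on log-log axes. The sketch below assumes that functional form and uses synthetic points generated from made-up constants (a = 5, b = 0.1), not measurements from any real model. It only shows the mechanics of fitting and extrapolating.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b) as a line in log-log space."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    slope = (
        sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
        / sum((u - mean_x) ** 2 for u in lx)
    )
    intercept = mean_y - slope * mean_x
    # log y = intercept + slope * log x, so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope


# Synthetic, illustrative data: generated from the assumed law itself.
compute = [1e18, 1e19, 1e20, 1e21]
loss = [5.0 * c ** -0.1 for c in compute]

a, b = fit_power_law(compute, loss)

# Extrapolate one order of magnitude beyond the fitted range.
predicted_loss = a * (1e22) ** -b
```

Real measurements are noisy and the fit holds only over the range where the trend has been observed, so an extrapolation like `predicted_loss` is a forecast under the assumed form, not a guarantee.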

For practitioners: Understanding the basics of ML matters more and more, even for non-ML engineers. The APIs you integrate, the tools you use, and the systems you build on are increasingly ML-driven. Knowing what models can and can't do, what they're sensitive to, and where they fail is operational knowledge.

The distribution shift problem: ML models are only as reliable as the match between the data they were trained on and the data they encounter in deployment. When the input distribution shifts — different user behavior, different seasonal patterns, different market conditions — model performance degrades in ways that can be invisible until something breaks. This is why ML systems in production require monitoring infrastructure that classical software doesn't: you can't just ship a model and assume it will behave the same way six months later. The operational discipline of tracking data drift, monitoring prediction quality, and triggering retraining is as important as the initial model development, and it's where many ML deployments fail in practice.
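Tracking data drift can start with something as simple as comparing the distribution of one input feature at training time against the same feature in production. The sketch below uses the Population Stability Index (PSI), one of several possible drift statistics. The bin count, the 0.2 alert threshold (a common rule of thumb, treated here as an assumption), and the Gaussian samples are illustrative choices, not values taken from any particular system.

```python
import math
import random


def psi(reference, current, bins=10):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(sample), 1e-6) for c in counts]

    p = proportions(reference)
    q = proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


DRIFT_THRESHOLD = 0.2  # assumed rule of thumb; tune per feature in practice

rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
production_same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
production_shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]  # mean has moved

psi_same = psi(training, production_same)
psi_shifted = psi(training, production_shifted)

needs_retraining_check = psi_shifted > DRIFT_THRESHOLD
```

A check like this only looks at inputs. It says nothing about whether predictions are still correct, which is why monitoring prediction quality against delayed ground truth is usually tracked alongside it.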

Why labels are the real constraint: In supervised learning, the training bottleneck is often not compute but labeled data. Getting humans to annotate examples correctly, consistently, and at scale is expensive, slow, and error-prone. This is why techniques that reduce label dependency (semi-supervised learning, self-supervised learning, few-shot prompting) have become so valuable, and why foundation models trained on unlabeled internet data represent such a significant leverage point: the one-time cost of pretraining is spread across many downstream tasks, each of which then needs far fewer labeled examples.
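Self-supervised learning avoids human annotation by deriving the labels from the data itself. A minimal sketch of the idea, assuming whitespace tokenization and a fixed context window (both simplifications; real systems use learned tokenizers and much longer contexts): every position in a piece of raw text yields a training pair of "preceding words" and "next word" at no labeling cost.

```python
def next_token_pairs(text, context_size=3):
    """Turn raw text into (context, target) training pairs with no human labels."""
    tokens = text.split()  # naive whitespace tokenization, for illustration only
    return [
        (tuple(tokens[i:i + context_size]), tokens[i + context_size])
        for i in range(len(tokens) - context_size)
    ]


pairs = next_token_pairs("the model learns to predict the next word")

for context, target in pairs:
    print(" ".join(context), "->", target)
```

An eight-word sentence produces five supervised examples here. The same procedure applied to a large text corpus produces a very large training set, which is the leverage the paragraph above describes.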