ChatGPT’s Laws of Machine Learning

August 10, 2023

Every few years, I encounter Akin’s Laws of Spacecraft Design: a collection of pithy bits of pragmatic wisdom, hard-earned through experience, success, and failure. It’s a gift from the past, lighting the path for future explorers.

I’ve always found these sorts of insights and mental models inspiring, and distilling them is an engineering goal worth aspiring to. Insights like these are what push entire fields forward.

I feel like the AI field is in its infancy, and completely amorphous. We're sort of building the plane while we're flying it, and just trying to make sense along the way.

As we travel forth, it’s important to remember where we are in the journey. It is your responsibility to share your insights and light the path for future travelers.

Write about your successes.

Write about your failures.

Share everything.

🤷‍♂️

Okay, enough philosophizing.

I asked ChatGPT to translate Akin’s Laws to the field of machine learning, and I thought the results were pretty neat!

Here’s a link to the conversation: https://chat.openai.com/share/a83358c1-e819-4296-92fd-eb80aae17957

Here are the laws, henceforth known as ChatGPT’s Laws of Machine Learning:

Data is King, but Quality Rules the Kingdom. Garbage in, garbage out.

Model Complexity is an Iterative Process. The necessary number of iterations is one more than what you've currently done.

No Model is Perfect. Strive for one that works well enough with real-world data.

Understand Your Bias. All models have biases; understanding them is key to responsible development.

(Occam's Razor) Simpler is Often Better. Don't use a complex model where a simple one will do.

Overfitting is the Silent Killer of Generalization. Regularize, cross-validate, and don't get too attached to your training data.

Trust but Verify. Trust your tools, libraries, and frameworks, but verify with testing and validation.

(No Free Lunch Theorem) There's No Universal Algorithm. Different problems require different solutions.

Explainability Isn't Optional. If you can't explain it, you can't trust it.

Ethics Aren't an Afterthought. Consider the societal impact from the start.

(Curse of Dimensionality) More Features Aren't Always Better. High dimensionality can lead to sparse data and poor performance.

Real-World Deployment is a Different Beast. What works in a controlled environment may not in the wild.

Don't Chase Metrics Blindly. Optimizing for a metric without understanding the context can lead to poor decisions.

(Sutton's Bitter Lesson) General Methods Tend to Outperform Specific Solutions Over Time. Focus on foundational ideas.

Failure Isn't Final; It's Informative. Learn from mistakes and iterate.

Collaboration Between Domain Experts and Data Scientists is Key. Don't design in a vacuum.

Interpretable Models are Preferable to Black Boxes. Transparency builds trust.

Scale Carefully. What works on a small dataset might not scale to big data.

A Model in Production is Worth a Thousand in a Jupyter Notebook. Deployment is where real value is created.

Automating Bias Automates Injustice. Be aware of underlying biases in your data.

The Best Model for the Job Might Already Exist. Look for existing solutions before reinventing the wheel.

(Moore's Law of ML) More Data Often Beats a Clever Algorithm. But understand its limitations and quality.

ML Models Don't Love Your Product; They Just Minimize a Loss Function. Always align the loss with the business objective.

There's No Such Thing as Unsupervised Learning. Even without labels, you guide the learning process.

If It's Not Reproducible, It's Not Science. Ensure that others can follow your steps.
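To make the overfitting law a bit more concrete, here is a minimal sketch in plain Python. It is not from the ChatGPT conversation; the toy data (a noisy line), the helper names (`fit_line`, `fit_memorizer`, `cross_val_mse`), and the choice of five interleaved folds are all assumptions made for illustration. It compares a simple least-squares line against a "memorizer" (1-nearest-neighbour) that reproduces its training data perfectly, and scores both on held-out folds.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns a predict function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_memorizer(xs, ys):
    """1-nearest-neighbour: return the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(predict, xs, ys):
    """Mean squared error of a predict function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_mse(fit, xs, ys, k=5):
    """Average held-out MSE over k interleaved folds (hypothetical helper)."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mse(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / k

# Toy data (an assumption for this sketch): y = 2x plus Gaussian noise.
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]

for name, fit in [("line", fit_line), ("memorizer", fit_memorizer)]:
    train_err = mse(fit(xs, ys), xs, ys)
    cv_err = cross_val_mse(fit, xs, ys)
    print(f"{name:10s} train MSE = {train_err:.2f}   CV MSE = {cv_err:.2f}")
```

The memorizer scores a perfect zero on the data it was trained on, yet does worse than the plain line once it is scored on points it has not seen. That gap between training error and held-out error is the thing the law is warning about.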
