Don’t rush to machine learning

A simpler approach—good data, SQL queries, if/then statements—often gets the job done.

It turns out the best way to do machine learning (ML) is sometimes to not do any machine learning at all. In fact, according to Amazon Applied Scientist Eugene Yan, “The first rule of machine learning [is to] start without machine learning.”


Yes, it’s cool to trot out ML models painstakingly crafted over months of arduous effort. It’s also not necessarily the most effective approach, at least not when simpler, more accessible methods are available.

It may be an oversimplification to say, as data scientist Noah Lorang did years ago, that “data scientists mostly just do arithmetic.” But he’s not far off. He and Yan are right that, however tempting it is to complicate the process of putting data to work, it’s usually better to start small.