Even in domains where deep learning excels, simpler approaches are worth examining.
Accuracy is usually the main objective, and much effort goes into raising it. But in practice tradeoffs have to be made, and other considerations play a role in model selection. Speed (to train and to score) matters if the model is to be used in production. Interpretability is critical if a model has to be explained for transparency reasons ("black boxes" are always an option, but they are opaque by definition). Simplicity matters for practical reasons: if a model has too many knobs to tune, and optimizations have to be done manually, it may be too involved to build and maintain in production.
While deep learning has emerged as a technique capable of producing state-of-the-art results across several domains and types of data, it is far from the optimal choice in every situation. Simple techniques can sometimes produce comparable results while scoring better along the other dimensions listed above (interpretability and speed).
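To make the interpretability point concrete, here is a minimal sketch of the kind of simple baseline in question: a bag-of-words linear classifier trained with plain gradient descent. The toy sentiment data, vocabulary, and hyperparameters are all made up for illustration; the point is that the learned weights read directly as a word-level sentiment lexicon, with no extra explanation machinery needed.

```python
import math

# Toy sentiment data (invented for this sketch): label 1 = positive, 0 = negative.
docs = [
    ("good great fun", 1),
    ("great plot good acting", 1),
    ("bad boring dull", 0),
    ("dull plot bad acting", 0),
]

# Build a bag-of-words vocabulary from the training text.
vocab = sorted({w for text, _ in docs for w in text.split()})
idx = {w: i for i, w in enumerate(vocab)}

def featurize(text):
    """Turn a string into a word-count vector over the vocabulary."""
    x = [0.0] * len(vocab)
    for w in text.split():
        if w in idx:
            x[idx[w]] += 1.0
    return x

# Logistic regression via plain gradient descent -- no framework needed.
w = [0.0] * len(vocab)
b = 0.0
lr = 0.5
for _ in range(200):
    for text, y in docs:
        x = featurize(text)
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
        g = p - y                         # gradient of log-loss w.r.t. z
        b -= lr * g
        w = [wi - lr * g * xi for wi, xi in zip(w, x)]

# Each weight is directly interpretable as that word's learned sentiment.
for word in sorted(vocab, key=lambda v: -w[idx[v]]):
    print(f"{word:8s} {w[idx[word]]:+.2f}")
```

Training takes a fraction of a second, and inspecting the model is just reading off the weights: words like "good" end up with positive weights and words like "bad" with negative ones.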
A few weeks ago I tweeted a few examples where simpler approaches outperformed deep learning (or, in the case of the last tweet, bag-of-words + CNNs outperforming RNNs). These examples struck a chord, which prompted me to collect them into a post. It also got me thinking that I should solicit similar examples from my readers: so please leave your favorite example in the comments below, and I will update this post with the best suggestions.
[Embedded tweets by Ben Lorica (@bigdata), April 23–25, 2016]