Using Agile development techniques for data science projects

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: John Akred on building data platforms and enterprise data strategies.


In this episode of the O’Reilly Data Show, I spoke with John Akred, cofounder and CTO of Silicon Valley Data Science. Akred and his colleagues teach two of the more popular Strata + Hadoop World tutorials—“Developing a Modern Enterprise Data Strategy” and “Architecting a Data Platform.” We talked about his career in data science and consulting, and his penchant for bringing emerging technologies and tools into large enterprises.

Here are some highlights from our conversation:

3 ideas to add to your data science toolkit

[A version of this post appears on the O’Reilly Radar.]

Techniques to address overfitting, hyperparameter tuning, and model interpretability.

I’m always on the lookout for ideas that can improve how I tackle data analysis projects. I particularly favor approaches that translate to tools I can use repeatedly. Most of the time, I find these tools on my own, by trial and error, or by consulting other practitioners. I also have an affinity for academics and academic research, and I often tweet about research papers that intrigue me. Often, academic research results don’t immediately translate to what I do, but I recently came across ideas from several research projects that are worth sharing with a wider audience.

The ideas I’ve collected in this post address problems that come up frequently. In my mind, these ideas also reinforce the notion of data science as comprising data pipelines, not just machine learning algorithms. They also have implications for engineers trying to build artificial intelligence (AI) applications.

Commercial speech recognition systems in the age of big data and deep learning

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Yishay Carmiel on applications of deep learning in text and speech.