Issue #14: Chatbots, the Hiring Pipeline, and the reliability of ML



This edition has 874 words, which will take you about 5 minutes to read.

“Try to be a rainbow in someone’s cloud.” – Maya Angelou

Data Exchange podcast

  • How to build state-of-the-art chatbots  Lauren Kunze and Pandorabots have been at the forefront of many important developments in the conversational applications space. They help enterprises build and deploy bots, and they also create leading edge chatbots like Mitsuku.
  • Improving the hiring pipeline for software engineers  I recently spoke about hiring with my friends Karthik Ramasamy (Senior Director of Engineering at Splunk) and Arun Kejariwal (an experienced engineering leader). They have both hired technical talent across many companies, including startups, enterprise software companies, and social media giants.  The pandemic has caused a global economic slowdown and massive layoffs across many industry sectors. But companies are still hiring, and they are still competing fiercely for technical talent.
  • Gradient Flow videos  We continue to add content to our YouTube channel. We now have video versions of recent podcast episodes, alongside our popular weekly 2-minute snapshot. Make sure to subscribe.

[Image: Meiji Jingu Garden in Tokyo, by Ben Lorica]

Machine Learning tools and infrastructure

  • Beyond Accuracy: Behavioral Testing of NLP models with CheckList   This just won the Best Paper Award at ACL 2020 (the annual conference of the Association for Computational Linguistics).  NLP models are usually evaluated on a single aggregate statistic: accuracy on a held-out validation set. The authors of this paper drew inspiration from behavioral testing in software engineering and propose CheckList: “a model-agnostic and task-agnostic testing methodology that tests individual capabilities of the model using three different test types”.  It’s an interesting step towards identifying NLP models that generalize better.
  • Bringing hyperparameter tuning to scikit-learn    Outside of the deep learning libraries, scikit-learn is probably the most popular machine learning library in use today.  A new project called tune-sklearn (built on top of Ray) allows users to employ cutting-edge hyperparameter tuning techniques while staying within the scikit-learn API.
  • Machine Learning is still not reliable technology  “having 88.5% Top-1 accuracy on ImageNet—while a stunning achievement—doesn’t tell us how to get to systems with failure rates on the order of 1 in a billion. As Boeing has tragically shown, cutting corners on autonomous system safety standards has horrible, tragic consequences.” There are several good points about ML in this new essay by Ben Recht. Here’s another one: “greedy contextual bandit” algorithms are prime candidates for applying reinforcement learning in the enterprise.
  • The Computational Limits of Deep Learning  I’ve long believed that we need the ML research community to start exploring other machine learning methods. The authors show that “across many areas of deep learning, progress in training models has depended on large increases in the amount of computing power being used”. If we continue down this path, the computational requirements for training large neural network models will become economically, technically, and environmentally unsustainable. [Using a key table from the paper, I plotted a couple of charts in this recent post.]
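To make the CheckList idea concrete, here is a minimal, dependency-free sketch of one of its test types, an invariance (INV) test: a label-preserving perturbation (such as swapping a person's name) should not change the model's prediction. The toy keyword-based "model" and the example sentences are hypothetical stand-ins, not from the paper or the CheckList library.

```python
def toy_sentiment(text: str) -> str:
    """Toy classifier standing in for a real NLP model:
    predicts 'pos' if any positive keyword appears."""
    positives = {"great", "good", "love", "excellent"}
    words = set(text.lower().replace(".", "").split())
    return "pos" if words & positives else "neg"

def invariance_test(model, text: str, perturbations) -> bool:
    """INV test: the prediction should be unchanged under
    label-preserving perturbations of the input."""
    baseline = model(text)
    return all(model(p) == baseline for p in perturbations)

# Perturbation: swapping a person's name should not flip sentiment.
original = "Anna said the service was great."
perturbed = ["Maria said the service was great.",
             "John said the service was great."]
print(invariance_test(toy_sentiment, original, perturbed))  # → True
```

The paper's other two test types follow the same pattern: a Minimum Functionality Test (MFT) checks predictions on simple targeted examples, and a directional test (DIR) checks that a perturbation moves the prediction in an expected direction.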
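For readers new to hyperparameter tuning, the loop that libraries like tune-sklearn automate (with much smarter search algorithms and distributed execution via Ray) looks roughly like the dependency-free sketch below. The objective function here is a hypothetical stand-in for a cross-validation score, and the parameter names are illustrative, not tied to any specific estimator.

```python
import random

def objective(params: dict) -> float:
    """Hypothetical stand-in for a cross-validation score;
    peaks near C=1.0, gamma=0.1 (higher is better)."""
    return -((params["C"] - 1.0) ** 2 + (params["gamma"] - 0.1) ** 2)

def random_search(space: dict, n_trials: int, seed: int = 0):
    """Sample parameter settings uniformly from the search space
    and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

space = {"C": (0.01, 10.0), "gamma": (0.001, 1.0)}
best, score = random_search(space, n_trials=200)
print(best, score)
```

In practice, tuning libraries replace this naive loop with techniques such as early stopping and Bayesian optimization, which is the value tune-sklearn brings while keeping the familiar scikit-learn interface.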

Virtual Conferences

  • Ray Summit (new speakers)    Dave Patterson (2017 Turing Award recipient), Oriol Vinyals (of DeepMind), and Matt Honnibal (co-creator of spaCy) join an outstanding lineup at the first Ray Summit, a FREE virtual conference centered on “Scalable machine learning, scalable Python, for everyone”. Register HERE
  • NLP Summit   I am the program chair of this new FREE virtual conference focused on practical applications of Natural Language Processing. There has been so much progress in NLP in recent years, and there’s so much excitement surrounding record-setting neural models (GPT-3, BERT and its variants, etc.).  We will continue to add speakers to an already impressive roster; register HERE.
  • Practical ML Series     I am hosting a series of virtual events for Databricks; the first one, anchored by Sean Owen and Clemens Mewald, is slated for August 27th.

Work and Hiring

[Image: Fractured Water, upper Skógafoss by Dean Wampler.]


  • Humankind: A Hopeful History   I’ve been looking forward to Rutger Bregman’s new book; it’s the perfect read for this shelter-in-place and global pandemic era. Bregman cites many studies that run counter to conventional wisdom: humans are wired to be “cooperative rather than competitive”, and human evolution was really about the “survival of the friendliest”. It’s backed by studies, including fresh takes on some of the most cited research projects in the social sciences. I highly recommend this book.
  • Analyzing wastewater may assist census takers   Dutch researchers recently showed that sewage analysis, known as wastewater-based epidemiology (WBE), can be used to profile an area’s population. Beyond some not-so-surprising correlations (“alcohol and caffeine were associated with high-rent districts”), they also devised a model that uses WBE to predict socio-economic characteristics of areas covered by specific sewage plants.
  • Millions of Americans Can’t Afford Water  An alarming new study commissioned by the Guardian and Consumer Reports.
  • Safety and Natural Language Generation models   Jerome Pesenti, Head of AI at Facebook, has a thoughtful Twitter thread on the safety of large NLG models. Lauren Kunze made similar points during our recent conversation: to the extent that these models are partially trained using Reddit data, safeguards need to be in place before they are deployed to production.

Subscribe to our newsletter, our YouTube channel, and to the Data Exchange podcast.
