Gradient Flow #34: Modernizing Data Governance, DataOps for ML, Declarative Interfaces


This edition has 510 words which will take you about 3 minutes to read.

“If something cannot go on forever it will stop.” – Herbert Stein

Data Exchange podcast

  • Injecting Software Engineering Practices and Rigor into Data Governance  As the amount and importance of data grow within organizations, there is growing interest in tools that help them use, manage, and unlock their data resources strategically. I speak with Steve Touw, cofounder and CTO of Immuta, a startup at the forefront of data governance, data discovery, data privacy, and security.
  • AI Beyond Automation   Jenn Webb and I sit down with Jerry Overton, who until recently served as a DXC Fellow and Head of AI at DXC Technology. Among the things we discussed was his leadership role in helping establish a Center of Excellence for AI within DXC.


Data & Machine Learning tools and infrastructure

  • Why you should build your AI Applications with Ray   Ion Stoica and I explain why Ray is the ideal platform for building a diverse set of compute-intensive applications.
  • Data Validation for Machine Learning Models and Applications   A special edition of IEEE’s Data Engineering Bulletin focused on data quality and data validation in the context of MLOps and Responsible AI.
  • TabNet   Deep learning has taken over computer vision (images, video), speech technologies, and most recently natural language models. But many companies continue to need models for structured data, and for tabular data, decision trees and XGBoost still reign supreme. I’ve been playing with TabNet, and I suspect that once such models are made available to non-experts through a declarative interface, deep learning will begin capturing its share of models for structured data.
  • The NLP Index   This new site houses 3,000+ [code repositories + papers], organized in a 2-level taxonomy that captures the most important topics in natural language technologies. The breadth of tools and techniques to choose from bodes well for companies that are developing tools to vastly simplify things for developers. Think AutoNLP solutions, or even declarative interfaces along the lines of Ludwig for deep learning.

[Image: Japan by SGL.]


  • Graph Deep Learning    Slides from a recent talk by Simone Scardapane.
  • Unpacking Tiger Global’s Venture Capital playbook
  • Geopolitical Alpha   My first job after academia was as lead quant at a hedge fund, and ever since I’ve been an avid reader of books about the industry. My favorite topic to read about (and my favorite hedge fund style) is global macro, which can be broadly described as trades that profit from political or economic events. That said, you need not be a finance junkie to benefit from this book. The author introduces a broadly applicable and compelling “forecasting framework” that non-traders will also find useful.
  • New AI regulations are coming … Are you ready?    Brush up on the three trends that unite current and proposed AI regulations.

Featured Virtual Conference

I helped put together the outstanding program for the upcoming Data+AI Summit, a FREE virtual conference with over a hundred sessions on data infrastructure, analytics, data science, and machine learning. Among the keynote speakers is 2014 Nobel Peace Prize laureate Malala Yousafzai. This event takes place May 24-28.

Register Now

Funding Updates

Closing short: A taxonomy discovered through Charles Martin (on LinkedIn).

If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.
