Graphs as the front end for machine learning

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Leo Meyerovich on building large-scale, interactive applications that enable visual investigations.

In this episode of the Data Show, I spoke with Leo Meyerovich, co-founder and CEO of Graphistry. Graphs have always been part of the big data revolution (think of the large graphs generated by the early social media startups). In recent months, I’ve come across companies releasing and using new tools for creating, storing, and (most importantly) analyzing large graphs. There are many problems and use cases that lend themselves naturally to graphs, and recent advances in hardware and software building blocks have made large-scale analytics possible.

Starting with his work as a graduate student at UC Berkeley, Meyerovich has pioneered the combination of hardware and software acceleration to create truly interactive environments for visualizing large amounts of data. Graphistry has built a suite of tools that enables analysts to wade through large data sets and investigate business and security incidents. The company is currently focused on the security domain—where it turns out that graph representations of data are things security analysts are quite familiar with.

Here are some highlights from our conversation:

Graphs as the front end for machine learning

They’re really flexible. First of all, there’s a pure analytic reason in that there are certain types of queries that one could do efficiently with a graph database. If you needed to do a bunch of joins, graphs are really great at that. … Companies want to get into stuff like 360-degree views of things; they want to understand correlations to actually explain what’s going on at a more intelligent level.

… I think that’s where graphs really start to shine. Because companies deal with pretty heterogeneous data, and a graph ends up being a really easy way to deal with that. A lot of questions are basically, “What’s nearby?”—almost like your nearest neighbor type of stuff; the graph becomes, both at the query level and at the visual level, very interpretable. I now have a hypothesis about graphs as being the front end and the UI for machine learning, but that might be a topic for another day.
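To make the “What’s nearby?” style of question concrete, here is a minimal Python sketch of a neighborhood query over a small heterogeneous graph. It is an illustration only, not Graphistry’s implementation or any graph database’s API: the helper names (`build_graph`, `nearby`), the node labels, and the sample edges are all hypothetical, and the traversal is a plain breadth-first search limited to a fixed number of hops.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (source, target) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def nearby(graph, start, max_hops):
    """Return {node: hop_count} for every node within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]  # the start node is not its own neighbor
    return seen


# Hypothetical mixed-type data: users, hosts, IPs, and alerts in one graph.
EDGES = [
    ("user:alice", "host:laptop-1"),
    ("host:laptop-1", "ip:10.0.0.5"),
    ("ip:10.0.0.5", "alert:malware-42"),
    ("user:bob", "host:laptop-2"),
]

graph = build_graph(EDGES)
print(nearby(graph, "user:alice", 2))
```

Because every entity type lives in the same adjacency map, the same query works whether the starting point is a user, a host, or an alert; widening `max_hops` to 3 would pull the alert into the result for `user:alice`.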


Machine learning needs machine teaching

[A version of this post appears on the O’Reilly Radar.]

The O’Reilly Data Show Podcast: Mark Hammond on applications of reinforcement learning to manufacturing and industrial automation.

In this episode of the Data Show, I spoke with Mark Hammond, founder and CEO of Bonsai, a startup at the forefront of developing AI systems in industrial settings. While many articles have been written about developments in computer vision, speech recognition, and autonomous vehicles, I’m particularly excited about near-term applications of AI to manufacturing, robotics, and industrial automation. In a recent post, I outlined practical applications of reinforcement learning (RL)—a type of machine learning now being used in AI systems. In particular, I described how companies like Bonsai are applying RL to manufacturing and industrial automation. As researchers explore new approaches for solving RL problems, I expect many of the first applications to be in industrial automation.

Here are some highlights from our conversation:

Machine learning and machine teaching

Everyone is so focused on making better and faster learning algorithms; what do we do when we have it? Let’s just suppose that you now have an algorithm that can learn as well as or better than humans. How do we use that, how do we apply that in a predictable, scalable, repeatable way toward the objectives that we want to apply it toward?

… I thought about that for a while, and it’s one of those things where the answer is obvious in hindsight, but until you sit down and really chew on it, it doesn’t jump out at you. And it’s that, by design, if you’re building a learning system, if you want to program it, you have to teach it. Machine teaching and machine learning are necessary complements to one another; you need both. And for the most part, what comprises machine teaching these days is giant labeled data sets.

… You need machine teaching and machine learning. It dawned on me that this was the core abstraction that was going to make it possible for us to start applying all of this stuff more broadly across all the myriad use cases that we see in the real world without having to turn all of the people who are looking to use it into experts in machine learning and data science. It’s what enabled me to realize what Bonsai’s mission is: to enable your subject matter experts (a chemical engineer or a mechanical engineer, someone who is very, very well versed in whatever their domain is but not necessarily in machine learning or data science) to take that expertise and use it as the foundation for describing what to teach and then automating the underlying pieces for how you can actually effectively learn that.
