[A version of this post appears on the O’Reilly Radar.]
The O’Reilly Data Show Podcast: Leo Meyerovich on building large-scale, interactive applications that enable visual investigations.
In this episode of the Data Show, I spoke with Leo Meyerovich, co-founder and CEO of Graphistry. Graphs have always been part of the big data revolution (think of the large graphs generated by the early social media startups). In recent months, I’ve come across companies releasing and using new tools for creating, storing, and (most importantly) analyzing large graphs. There are many problems and use cases that lend themselves naturally to graphs, and recent advances in hardware and software building blocks have made large-scale analytics possible.
Starting with his work as a graduate student at UC Berkeley, Meyerovich has pioneered the combination of hardware and software acceleration to create truly interactive environments for visualizing large amounts of data. Graphistry has built a suite of tools that enables analysts to wade through large data sets and investigate business and security incidents. The company is currently focused on the security domain—where it turns out that graph representations of data are things security analysts are quite familiar with.
Here are some highlights from our conversation:
Graphs as the front end for machine learning
They’re really flexible. First of all, there’s a pure analytic reason in that there are certain types of queries that one could do efficiently with a graph database. If you needed do a bunch of joins, graphs are really great at that. … Companies want to get into stuff like 360-degree views of things; they want to understand correlations to actually explain what’s going on at a more intelligent level.
… I think that’s where graphs really start to shine. Because companies deal with pretty heterogeneous data, and a graph ends up being a really easy way to deal with that. A lot of questions are basically, “What’s nearby?”—almost like your nearest neighbor type of stuff; the graph becomes, both at the query level and at the visual level, very interpretable. I now have a hypothesis about graphs as being the front end and the UI for machine learning, but that might be a topic for another day.