Graphs as the front end for machine learning

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Leo Meyerovich on building large-scale, interactive applications that enable visual investigations. In this episode of the Data Show, I spoke with Leo Meyerovich, co-founder and CEO of Graphistry. Graphs have always been part of the big data revolution (think of … Continue reading “Graphs as the front end for machine learning”

How machine learning can be used to write more secure computer programs

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Fabian Yamaguchi on the potential of using large-scale analytics on graph representations of code. In this episode of the Data Show, I spoke with Fabian Yamaguchi, chief scientist at ShiftLeft. His 2015 Ph.D. dissertation sketched out how the combination of … Continue reading “How machine learning can be used to write more secure computer programs”

Graph databases are powering mission-critical applications

The O’Reilly Data Show Podcast: Emil Eifrem on popular applications of graph technologies, cloud computing, and company culture. [This piece was co-written by Shannon Cutt. A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. While … Continue reading “Graph databases are powering mission-critical applications”

Network structure and dynamics in online social systems

Understanding information cascades, viral content, and significant relationships. [A version of this post appears on the O’Reilly Radar blog.] I rarely work with social network data, but I’m familiar with the standard problems confronting data scientists who work in this area. These include questions pertaining to network structure, viral content, and the dynamics of information … Continue reading “Network structure and dynamics in online social systems”

The evolution of GraphLab

[A version of this post appears on the O’Reilly Radar blog.] Editor’s note: Carlos Guestrin will be part of the team teaching Large-scale Machine Learning Day at Strata + Hadoop World in San Jose. Visit the Strata + Hadoop World website for more information on the program. I only really started playing around with GraphLab … Continue reading “The evolution of GraphLab”

Bits from the Data Store

Semi-regular field notes from the world of data: Tucked away in the community room at the recent GraphLab conference, I took a few people to a demo by Graphistry, a startup that lets users visually interact with and analyze massive amounts of data. In particular, their technology can handle and draw many more points than d3.js … Continue reading “Bits from the Data Store”

There are many use cases for graph databases and analytics

Business users are becoming more comfortable with graph analytics. [A version of this post appears on the O’Reilly Radar blog.] The rise of sensors and connected devices will lead to applications that draw from network/graph data management and analytics. As the number of devices surpasses the number of people — Cisco estimates 50 billion connected … Continue reading “There are many use cases for graph databases and analytics”

Network Science Dashboards

Network graphs can be used as primary visual objects, with conventional charts used to supply detailed views. [A version of this post appears on the O’Reilly Data blog.] With Network Science well on its way to being an established academic discipline, we’re beginning to see tools that leverage it. Applications that draw heavily from this … Continue reading “Network Science Dashboards”

Extending GraphLab to tables

The popular graph analytics framework extends its coverage of the data science workflow. [A version of this post appears on the O’Reilly Data blog and Forbes.] GraphLab’s SFrame, an interesting and somewhat under-the-radar tool, was unveiled at Strata Santa Clara. It is a disk-based, flat table representation that extends GraphLab to tabular data. With the … Continue reading “Extending GraphLab to tables”

Graphs, Time-series, Dataviz, and Crowdsourcing at Strata Santa Clara 2014

There are many fantastic talks at Strata, and it can be overwhelming to navigate the schedule. I plan to list talks I’m hoping to catch in a series of “time-turner” posts (check this blog on Wed/Thu at 10 a.m.). But for now, let me highlight talks from a few categories: Graphs and Network Analysis: Large-scale … Continue reading “Graphs, Time-series, Dataviz, and Crowdsourcing at Strata Santa Clara 2014”