Network structure and dynamics in online social systems

Understanding information cascades, viral content, and significant relationships. [A version of this post appears on the O’Reilly Radar blog.] I rarely work with social network data, but I’m familiar with the standard problems confronting data scientists who work in this area. These include questions pertaining to network structure, viral content, and the dynamics of information…

The evolution of GraphLab

[A version of this post appears on the O’Reilly Radar blog.] Editor’s note: Carlos Guestrin will be part of the team teaching Large-scale Machine Learning Day at Strata + Hadoop World in San Jose. Visit the Strata + Hadoop World website for more information on the program. I only really started playing around with GraphLab…

Building and deploying large-scale machine learning pipelines

[A version of this post appears on the O’Reilly Radar blog.] There are many algorithms with implementations that scale to large data sets (this list includes matrix factorization, SVM, logistic regression, LASSO, and many others). In fact, machine learning experts are fond of pointing out: if you can pose your problem as a simple optimization…

A brief look at data science’s past and future

[A version of this post appears on the O’Reilly Radar blog.] Back in 2008, when we were working on what became one of the first papers on big data technologies, one of our first visits was to LinkedIn’s new “data” team. Many of the members of that team went on to build interesting tools and…

Lessons from next-generation data wrangling tools

[A version of this post appears on the O’Reilly Radar blog.] One of the trends we’re following is the rise of applications that combine big data, algorithms, and efficient user interfaces. As I noted in an earlier post, our interest stems from both consumer apps and tools that democratize data analysis. It’s no…

Hardcore Data Science day: Strata+Hadoop World 2015

My co-organizer Ben Recht and I are proud to announce the return of Hardcore Data Science day to Strata+Hadoop World in California. We have outstanding speakers – 11 talks in total – and I expect the track to sell out (as it has done in the past). Deep Learning enthusiasts will enjoy sessions on its…

Building Apache Kafka from scratch

[A version of this post originally appeared on the O’Reilly Radar blog.] In this episode of the O’Reilly Data Show Podcast, Jay Kreps talks about data integration, event data, and the Internet of Things. At the heart of big data platforms are robust data flows that connect diverse data sources. Over the past few years, …

Decoding bitcoin and the blockchain

[A version of this post originally appeared on the O’Reilly Radar blog.] When the creators of bitcoin solved the “double spend” problem in a decentralized manner, they introduced techniques that have implications far beyond digital currency. Our newly announced one-day event, Bitcoin & the Blockchain: An O’Reilly Radar Summit, is in line with…

The science of moving dots: the O’Reilly Data Show Podcast

Rajiv Maheswaran talks about the tools and techniques required to analyze new kinds of sports data. [This post originally appeared on the O’Reilly Radar blog.] Editor’s note: you can subscribe to the O’Reilly Data Show Podcast through iTunes, SoundCloud or through our RSS feed. Many data scientists are comfortable working with structured operational data and…

Time-turner: Strata NYC 2014, day 2

There are so many good talks happening at the same time that it’s impossible not to miss out on good sessions. But imagine I had a time-turner necklace and could actually “attend” 2 (maybe 3) sessions happening at the same time. Taking into account my current personal interests and tastes, here’s how my day would…