There are many use cases for graph databases and analytics

Business users are becoming more comfortable with graph analytics. [A version of this post appears on the O’Reilly Radar blog.] The rise of sensors and connected devices will lead to applications that draw from network/graph data management and analytics. As the number of devices surpasses the number of people – Cisco estimates 50 billion connected …

A growing number of applications are being built with Spark

Many more companies are willing to talk about how they’re using Apache Spark in production. [A version of this post appears on the O’Reilly Data blog.] One of the trends we’re following closely at Strata is the emergence of vertical applications. As components for creating large-scale data infrastructures enter their early stages of maturation, companies …

Welcome to Intelligence Matters

Casting a critical eye on the exciting developments in the world of AI. [A version of this post appears on the O’Reilly Radar blog and Forbes.] Editor’s note: this post was co-authored by Ben Lorica and Roger Magoulas. Today the O’Reilly Radar is kicking off Intelligence Matters (IM), a new series exploring current issues in …

The re-emergence of Time-series

[A version of this post appeared on the O’Reilly Strata and Radar blogs.] My first job after leaving academia was as a quant for a hedge fund, where I performed (what are now referred to as) data science tasks on financial time-series. I primarily used techniques from probability & statistics, econometrics, and optimization, with occasional …

MLbase: Scalable Machine-learning made accessible

[Cross-posted on the O’Reilly Strata blog.] In the course of applying machine-learning against large data sets, data scientists face a few pain points. They need to tune and compare several suitable algorithms – a process that may involve having to configure a hodgepodge of tools, requiring different input files, programming languages, and interfaces. Some software …

Seven Reasons I like Spark

[This post originally appeared on the O’Reilly Radar.] A large portion of this week’s Amp Camp at UC Berkeley is devoted to an introduction to Spark – an open source, in-memory, cluster computing framework. After playing with Spark over the last month, I’ve come to consider it a key part of my big data …