A scalable time-series database that supports SQL

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Michael Freedman on TimescaleDB and scaling SQL for time-series. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, SoundCloud, RSS. In this episode … Continue reading “A scalable time-series database that supports SQL”

Semi-supervised, unsupervised, and adaptive algorithms for large-scale time series

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Ira Cohen on developing machine learning tools for a broad range of real-time applications. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data, data science, and AI. Find us on Stitcher, TuneIn, iTunes, … Continue reading “Semi-supervised, unsupervised, and adaptive algorithms for large-scale time series”

Building self-service tools to monitor high-volume time-series data

[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Phil Liu on the evolution of metric monitoring tools and cloud computing. One of the main sources of real-time data processing tools is IT operations. In fact, a previous post I wrote on the re-emergence of real-time was to a … Continue reading “Building self-service tools to monitor high-volume time-series data”

Redefining power distribution using big data

[A version of this post appears on the O’Reilly Radar blog.] The O’Reilly Data Show Podcast: Erich Nachbar on testing and deploying open source, distributed computing components. When I first hear of a new open source project that might help me solve a problem, the first thing I do is ask around to see if … Continue reading “Redefining power distribution using big data”

Graphs, Time-series, Dataviz, and Crowdsourcing at Strata Santa Clara 2014

There are many fantastic talks at Strata and it can be overwhelming to navigate the schedule. I plan to list talks I’m hoping to catch in a series of “time-turner” posts (check this blog on Wed/Thu at 10 a.m.). But for now let me highlight talks from a few categories: Graphs and Network Analysis: Large-scale … Continue reading “Graphs, Time-series, Dataviz, and Crowdsourcing at Strata Santa Clara 2014”

How Twitter monitors millions of time-series

[A version of this post appears on the O’Reilly Strata blog.] One of the keys to Twitter’s ability to process 500 million tweets daily is a software development process that values monitoring and measurement. A recent post from the company’s Observability team detailed the software stack for monitoring the performance characteristics of software services, and … Continue reading “How Twitter monitors millions of time-series”

Surfacing anomalies and patterns in Machine Data

[A version of this post appears on the O’Reilly Strata blog.] I’ve been noticing that many interesting big data systems are coming out of IT operations. These are systems that go beyond the standard “capture/measure, display charts, and send alerts”. IT operations has long been a source of many interesting big data problems and I … Continue reading “Surfacing anomalies and patterns in Machine Data”

The re-emergence of Time-series

[A version of this post appeared on the O’Reilly Strata and Radar blogs.] My first job after leaving academia was as a quant for a hedge fund, where I performed (what are now referred to as) data science tasks on financial time-series. I primarily used techniques from probability & statistics, econometrics, and optimization, with occasional … Continue reading “The re-emergence of Time-series”

Mining Time-series with Trillions of Points: Dynamic Time Warping at scale

Take a similarity measure that’s already well-known to researchers who work with time-series, and devise an algorithm to compute it efficiently at scale. Suddenly intractable problems become tractable, and Big Data mining applications that use the metric are within reach. The classification, clustering, and searching through time series have important applications in many domains. In … Continue reading “Mining Time-series with Trillions of Points: Dynamic Time Warping at scale”