Investing in big data technologies

The O’Reilly Data Show podcast: A fireside chat with Ben Horowitz, plus Reynold Xin on the rise of Apache Spark in China. [A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In this special holiday… Continue reading “Investing in big data technologies”

Apache Spark in the Enterprise and in China

Enterprise Adoption: IBM’s announcements at the recent Spark Summit in SF bode well for enterprise adoption of Spark. Ben Horowitz jokingly referred to IBM’s endorsement as akin to a rabbi blessing Spark as kosher for use in an enterprise. I recently sat down with a set of luminaries at the Spark Summit and asked them… Continue reading “Apache Spark in the Enterprise and in China”

Large-scale Data Science and Machine Learning with Spark

[Full disclosure: I’m an advisor to Databricks.] At last year’s Spark Summit in SF, Ali Ghodsi gave the first public demo of Databricks Cloud and Workspace. As I noted at the time, it was a showstopper! This year Ali gave an update, and while I wasn’t on hand to see it in person, judging from… Continue reading “Large-scale Data Science and Machine Learning with Spark”

Apache Spark: Powering applications on-premise and in the cloud

[A version of this post appears on the O’Reilly Radar.] As organizations shift their focus toward building analytic applications, many are relying on components from the Apache Spark ecosystem. I began pointing this out in advance of the first Spark Summit in 2013, and since then Spark adoption has exploded. With Spark Summit SF right… Continue reading “Apache Spark: Powering applications on-premise and in the cloud”