My co-organizer Ben Recht and I are proud to announce the return of Hardcore Data Science day to Strata+Hadoop World in California. We have outstanding speakers – 11 talks in total – and I expect the track to sell out (as it has done in the past). Deep Learning enthusiasts will enjoy sessions on its … Continue reading “Hardcore Data Science day: Strata+Hadoop World 2015”
Author Archives: Ben Lorica
Building Apache Kafka from scratch
[A version of this post originally appeared on the O’Reilly Radar blog.] In this episode of the O’Reilly Data Show Podcast, Jay Kreps talks about data integration, event data, and the Internet of Things. At the heart of big data platforms are robust data flows that connect diverse data sources. Over the past few years, … Continue reading “Building Apache Kafka from scratch”
Decoding bitcoin and the blockchain
[A version of this post originally appeared on the O’Reilly Radar blog.] When the creators of bitcoin solved the “double spend” problem in a decentralized manner, they introduced techniques that have implications far beyond digital currency. Our newly announced one-day event – Bitcoin & the Blockchain: An O’Reilly Radar Summit – is in line with … Continue reading “Decoding bitcoin and the blockchain”
The Future of Bitcoin
I’m hosting a webcast on Dec 3rd – featuring Kieren James-Lubin – titled The Future of Bitcoin, A Data-Driven Perspective: The Bitcoin ecosystem in 2014 is often compared to the internet in 1993. Taking a holistic, data-driven perspective, I’ll project where Bitcoin might be in a decade.
The science of moving dots: the O’Reilly Data Show Podcast
Rajiv Maheswaran talks about the tools and techniques required to analyze new kinds of sports data. [This post originally appeared on the O’Reilly Radar blog.] Editor’s note: you can subscribe to the O’Reilly Data Show Podcast through iTunes, SoundCloud, or our RSS feed. Many data scientists are comfortable working with structured operational data and … Continue reading “The science of moving dots: the O’Reilly Data Show Podcast”
Spark + Cassandra: Technical Integration Details
I’ll be hosting a Nov 12th webcast on two of the most popular components in the big data ecosystem: Apache Spark and Apache Cassandra. As highlighted in a recent Databricks blog post, recent improvements to Spark’s shuffle have led to significant speedups (Spark is faster than Hadoop MapReduce, even on disk). While Spark has long … Continue reading “Spark + Cassandra: Technical Integration Details”
Anomaly Detection with ElasticSearch
One of the technologies that I’m hearing more about is ElasticSearch. In particular, the combination of ElasticSearch, Logstash, and Kibana (the ELK stack) has proven to be a popular platform for real-time analytics on both structured and unstructured data. I’ll be hosting a webcast on October 30th on the ELK stack featuring Mark Harwood, software … Continue reading “Anomaly Detection with ElasticSearch”
Time-turner: Strata NYC 2014, day 2
There are so many good talks happening at the same time that it’s impossible to not miss out on good sessions. But imagine I had a time-turner necklace and could actually “attend” 2 (maybe 3) sessions happening at the same time. Taking into account my current personal interests and tastes, here’s how my day would … Continue reading “Time-turner: Strata NYC 2014, day 2”
Time-turner: Strata NYC 2014, day 1
There are so many good talks happening at the same time that it’s impossible to not miss out on good sessions. But imagine I had a time-turner necklace and could actually “attend” 2 (maybe 3) sessions happening at the same time. Taking into account my current personal interests and tastes, here’s how my day would … Continue reading “Time-turner: Strata NYC 2014, day 1”
Unboxing Apache Spark 1.1
Apache Spark version 1.1 shipped a few weeks ago. I’ve been enjoying enhancements to MLlib, Spark SQL, and Spark Streaming. Next week I’ll be hosting a webcast with Spark’s release manager – and Databricks co-founder – Patrick Wendell. (Full disclosure: I’m an advisor to Databricks.) In this webcast, Patrick Wendell from Databricks will be speaking … Continue reading “Unboxing Apache Spark 1.1”
