[A version of this post appears on the O’Reilly Radar.]
The O’Reilly Data Show Podcast: Karthik Ramasamy on Heron, DistributedLog, and designing real-time applications.
In this episode of the Data Show, I spoke with Karthik Ramasamy, adjunct faculty member at UC Berkeley, former engineering manager at Twitter, and co-founder of Streamlio. Ramasamy managed the team that built Heron, an open source, distributed stream processing engine compatible with Apache Storm. While Ramasamy has seen firsthand what it takes to build and deploy large-scale distributed systems (within Twitter, he worked closely with the team that built DistributedLog), he is first and foremost interested in designing and building end-to-end applications. As someone who organizes many conferences, I’m all too familiar with the vast array of popular big data frameworks available. But I also know that engineers and architects are most interested in content and material that help them cut through the options and decisions.
Ramasamy and I discussed the importance of designing systems that can be combined to produce end-to-end applications with the requisite characteristics and guarantees.
Here are some highlights from our conversation:
Moving from Apache Storm to Heron
A major consideration was that we had to fundamentally change a lot of things. So, the team weighed the cost: should we go with an existing code base or develop a new code base? We thought that even if we developed a new code base, we would be able to get it done very quickly and the team was excited about it. That’s what we did and we got the first version of Heron done in eight or nine months.
I think it was one of the quickest transitions that ever happened in the history of Twitter. Apache Storm was hit by a lot of failures. There was a strong incentive to move to a new system. Once we proved the new system was highly reliable, we created a compelling value for the engineering teams. We also made it very painless for people to move. All they had to do was recompile a job and launch it. So, when you make a system like that, then people are just going to say, ‘let me give it a shot.’ They just compile it, launch it, then they say, ‘for a week, my job has been running without any issues; that’s good, I’m moving.’ So, we got migration done, from Storm to Heron, in less than six months. All the teams cooperated with us, and it was just amazing that we were able to get it done in less than six months. And we provided them a level of reliability that they never had with Storm.