[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Joe Hellerstein on data wrangling, distributed systems, and metadata services. In this episode of the O’Reilly Data Show, I spoke with one of the most popular speakers at Strata+Hadoop World: Joe Hellerstein, Professor of Computer Science at UC Berkeley and… Continue reading “Metadata services can lead to performance and organizational improvements”
Category Archives: Data Engineer
Compressed representations in the age of big data
[A version of this post appears on the O’Reilly Radar.] Emerging trends in intelligent mobile applications and distributed computing. When developing intelligent, real-time applications, one often has access to a data platform that can wade through and unlock patterns in massive data sets. The back-end infrastructure for such applications often relies on distributed, fault-tolerant, scale-out… Continue reading “Compressed representations in the age of big data”
Investing in big data technologies
The O’Reilly Data Show podcast: A fireside chat with Ben Horowitz, plus Reynold Xin on the rise of Apache Spark in China. [A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In this special holiday… Continue reading “Investing in big data technologies”
Building a scalable platform for streaming updates and analytics
[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Evan Chan on the early days of Spark+Cassandra, FiloDB, and cloud computing. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In this episode of the O’Reilly Data Show, I… Continue reading “Building a scalable platform for streaming updates and analytics”
Graph databases are powering mission-critical applications
The O’Reilly Data Show Podcast: Emil Eifrem on popular applications of graph technologies, cloud computing, and company culture. [This piece was co-written by Shannon Cutt. A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. While… Continue reading “Graph databases are powering mission-critical applications”
Architecting big data applications in the cloud
The O’Reilly Data Show podcast: Jai Ranganathan on the Hadoop ecosystem, the recent surge in interest in all things real time, and developments in hardware. [This piece was co-written by Shannon Cutt. A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and… Continue reading “Architecting big data applications in the cloud”
Building systems for massive scale data applications
The O’Reilly Data Show podcast: Tyler Akidau on the evolution of systems for bounded and unbounded data processing. [This piece was co-written by Shannon Cutt. A version of this post appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. Many… Continue reading “Building systems for massive scale data applications”
How intelligent data platforms are powering smart cities
[A version of this post appears on the O’Reilly Radar.] Smart cities and smart nations run on data. According to a 2014 U.N. report, 54% of the world’s population resides in urban areas, with further urbanization projected to push that share up to 66% by the year 2050. This projected surge in population has encouraged… Continue reading “How intelligent data platforms are powering smart cities”
Resolving transactional access and analytic performance trade-offs
[A version of this article appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Todd Lipcon on hybrid and specialized tools in distributed systems. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In recent months, I’ve been hearing about hybrid systems designed to… Continue reading “Resolving transactional access and analytic performance trade-offs”
Specialized and hybrid data management and processing engines
A new crop of interesting solutions for the complexity of operating multiple systems in a distributed computing setting. The 2004 holiday shopping season marked the start of Amazon’s investigation into alternative database technologies that led to the creation of DynamoDB, a key-value storage system that went on to inspire several NoSQL projects. A new group… Continue reading “Specialized and hybrid data management and processing engines”
