I’ll be hosting a webcast next week, featuring Alex Bordei, on a topic that should be of interest to anyone building data applications and data products: When harnessed correctly, hardware can generate performance improvements in software of up to 60% in an existing setup, with zero or minimal investment. In this webcast Alex… Continue reading “Best Practices for Optimizing Infrastructure Performance and Budget”
Tag Archives: data engineer
5 Fun Facts about HBase that you didn’t know
HBase has made inroads in companies across many industries and countries. [A version of this post appears on the O’Reilly Data blog.] With HBaseCon right around the corner, I wanted to take stock of one of the more popular components in the Hadoop ecosystem. Over the last few years, many more companies have come to… Continue reading “5 Fun Facts about HBase that you didn’t know”
Bridging the gap between research and implementation
[A version of this post appears on the O’Reilly Data blog.] One of the most popular offerings at Strata Santa Clara was Hardcore Data Science day. Over the next few weeks we hope to profile some of the speakers who presented, and make the video of the talks available as a bundle. In the meantime… Continue reading “Bridging the gap between research and implementation”
Big Data solutions through the combination of tools
[A version of this post appears on the O’Reilly Data blog and Forbes.] As a user who tends to mix and match many different tools, not having to deal with configuring and assembling a suite of tools is a big win. So I’m really liking the recent trend towards more integrated and packaged solutions. A recent example… Continue reading “Big Data solutions through the combination of tools”
Data Scientists and Data Engineers like Python and Scala
[A version of this post appears on the O’Reilly Strata blog.] In exchange for getting personalized recommendations, many Meetup members declare topics that they’re interested in. I recently looked at the topics listed by members of a few local data Meetups that I’ve frequented. These Meetups vary in size from 600 to 2,000 total (and… Continue reading “Data Scientists and Data Engineers like Python and Scala”
Data Analysis: Just one component of the Data Science workflow
[A version of this post appears on the O’Reilly Strata blog.] Judging from articles in the popular press, the term data scientist has increasingly come to refer to someone who specializes in data analysis (statistics, machine learning, etc.). This is unfortunate since the term originally described someone who could cut across disciplines. Far from being confined… Continue reading “Data Analysis: Just one component of the Data Science workflow”
It’s getting easier to build Big Data Applications
[A version of this post appears on the O’Reilly Strata blog.] Hadoop’s low-cost, scale-out architecture has made it a new platform for data storage. With a storage system in place, the Hadoop community is slowly building a collection of open source analytic engines. Beginning with batch processing (MapReduce, Pig, Hive), Cloudera has added interactive SQL… Continue reading “It’s getting easier to build Big Data Applications”