[A version of this post appears on the O’Reilly Radar.] Emerging trends in intelligent mobile applications and distributed computing. When developing intelligent, real-time applications, one often has access to a data platform that can wade through and unlock patterns in massive data sets. The back-end infrastructure for such applications often relies on distributed, fault-tolerant, scaleout … Continue reading “Compressed representations in the age of big data”
Category Archives: Data Science
Is 2016 the year you let robots manage your money?
[A version of this post appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Vasant Dhar on the race to build “big data machines” in financial investing. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. In this episode of the O’Reilly Data Show, … Continue reading “Is 2016 the year you let robots manage your money?”
Turning big data into actionable insights
[A version of this article appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Evangelos Simoudis on data mining, investing in data startups, and corporate innovation. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. Can developments in data science and big data infrastructure … Continue reading “Turning big data into actionable insights”
How intelligent data platforms are powering smart cities
[A version of this post appears on the O’Reilly Radar.] Smart cities and smart nations run on data. According to a 2014 U.N. report, 54% of the world’s population resides in urban areas, with further urbanization projected to push that share up to 66% by the year 2050. This projected surge in population has encouraged … Continue reading “How intelligent data platforms are powering smart cities”
We need open and vendor-neutral metadata services
[A version of this article appears on the O’Reilly Radar.] Comprehensive metadata collection and analysis can pave the way for many interesting applications. As I spoke with friends leading up to Strata + Hadoop World NYC 2015, one topic continued to come up: metadata. It’s a topic that data engineers and data management researchers have … Continue reading “We need open and vendor-neutral metadata services”
Hardcore Data Science, NYC 2015
Ben Recht and I hosted another great edition of Hardcore Data Science in NYC yesterday. From the very first talk, the room was full, the audience was attentive, and the energy in the room was high – and it remained that way throughout the day. A summary can be found below. Short detour: Stanford CS … Continue reading “Hardcore Data Science, NYC 2015”
From search to distributed computing to large-scale information extraction
Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. February 2016 marks the 10th anniversary of Hadoop, at a point in time when many IT organizations actively use Hadoop, and/or one of the open source, big data projects that originated after, and in some … Continue reading “From search to distributed computing to large-scale information extraction”
Bridging the divide: Business users and machine learning experts
[A version of this article appears on the O’Reilly Radar.] Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. As tools for advanced analytics become more accessible, data scientists’ roles will evolve. Most media stories emphasize a need for expertise in algorithms and quantitative techniques … Continue reading “Bridging the divide: Business users and machine learning experts”
Pattern recognition and sports data
[A version of this article appears on the O’Reilly Radar.] One of my favorite books from the last few years is David Epstein’s engaging tour through sports science using examples and stories from a wide variety of athletic endeavors. Epstein draws on examples from individual sports (including track and field, winter sports) and major U.S. … Continue reading “Pattern recognition and sports data”
Understanding neural function and virtual reality
[A version of this article appears on the O’Reilly Radar.] The O’Reilly Data Show Podcast: Poppy Crum explains that what matters is efficiency in identifying and emphasizing relevant data. Like many data scientists, I’m excited about advances in large-scale machine learning, particularly recent success stories in computer vision and speech recognition. But I’m also cognizant … Continue reading “Understanding neural function and virtual reality”
