Building enterprise data applications with open source components

[A version of this article appears on the O’Reilly Radar.] The O’Reilly Data Show podcast: Dean Wampler on bounded and unbounded data processing and analytics. Subscribe to the O’Reilly Data Show Podcast to explore the opportunities and techniques driving big data and data science. I first found myself having to learn Scala when I started…

A compelling family of DSLs for Data Science

[A version of this post appears on the O’Reilly Data blog.] An important reason why pydata tools and Spark appeal to data scientists is that they both cover many data science tasks and workloads (Spark users can move seamlessly between batch and streaming). Being able to use the same programming style and syntax for workflows…

Data Scientists and Data Engineers like Python and Scala

[A version of this post appears on the O’Reilly Strata blog.] In exchange for getting personalized recommendations, many Meetup members declare topics that they’re interested in. I recently looked at the topics listed by members of a few local data Meetups that I’ve frequented. These Meetups vary in size from 600 to 2,000 total (and…