Gradient Flow #39: Becoming TikTok, Next-gen Workflow Orchestration and Forecasting

This edition has 450 words, which will take you about 2 minutes to read.

“There’s always a way if you’re not in a hurry.” – Paul Theroux

Data Exchange podcast

  • Towards a next-generation data orchestrator   Chris White is the CTO of Prefect, a startup building tools to help companies build, monitor, and manage dataflows. Prefect originated from lessons Chris and his co-founder learned while they were at Capital One, where they were early users and contributors to related projects like Apache Airflow.
  • Building a flexible, intuitive, and fast forecasting library    Reza Hosseini and Albert Chen of LinkedIn are part of the team behind one of my favorite new open source tools: Greykite, a flexible and fast library for time-series forecasting.
[Image: Books in Byeol-Madang Library, at the Starfield COEX Shopping Mall in Seoul from Wikimedia.]

Data & Machine Learning tools and infrastructure

  • The Road to Intelligent Process Automation  We examine the state of process automation technologies in the Fortune 1000 and in key technology hubs in the US.
  • BytePlus   According to the FT, this new division of ByteDance is selling the technology that powers its viral video app TikTok to websites and apps outside China. BytePlus has several SaaS offerings including recommendation models and tools for testing new data products and services. Given the rather frosty relationship between China and the West, BytePlus faces an uphill battle in Europe, the Five Eyes, and their allies.
  • Julia: Fast as Fortran, Beautiful as Python
  • EdgeQL  A new, strictly typed query language that aims to surpass SQL for graph applications (the parent project, EdgeDB, stores and describes data as strongly typed objects and the relationships between them). It is functional in nature and designed to be composable and easy to learn.
  • IBM open sources CodeFlare   Built on top of Ray, CodeFlare simplifies the integration and scaling of analytic and machine learning workflows in hybrid clouds.
  • The Geography of Open Source Software  A team of economists measures open source software contributions from 2010 to 2020 at the national, regional, and local levels using data from GitHub and adjacent platforms. The overall share of active developers has become more evenly distributed between countries, but in a nod to the importance of technology hubs, within-country regional differences persist. They hope to include GitLab and Bitbucket in future versions.

2021 Data Engineering Survey

Tell us which data tools you are most likely to adopt in the next 12-24 months—and what criteria your DataOps team uses to evaluate them.  The survey takes about 5 minutes to fill out and we’ll share the report of the survey findings with you. You’ll also be entered in a drawing for free copies of Jesse Anderson’s Data Teams book and other prizes.

Recommendations

Closing Short: When the media is a few steps ahead of the story.

If you enjoyed this newsletter, please support our work by encouraging your friends and colleagues to subscribe.