This edition has 710 words, which will take you about 4 minutes to read.
“Most people who have the data are in power. And most people who are powerless do not have data.” – Cathy O’Neil
Data Exchange podcast
- From Python beginner to seasoned software engineer Renowned programmer and author Joel Grus previously served as a Senior Research Engineer at the Allen Institute for AI, where he was a core engineer on AllenNLP, a PyTorch-based library for NLP research. We discuss his evolution as a programmer and data scientist, and his fantastic new book “Ten Essays on Fizz Buzz”.
- Assessing Models and Simulations of Epidemic Infectious Diseases With COVID-19 infection rates surging across the US, I check back with Bruno Gonçalves, a data scientist who spent several years as a researcher focused on mathematical models in epidemiology.
Machine Learning tools and infrastructure
- What you need to look for in a model server to build ML-powered services In this new post (co-written with Ion Stoica) we examine model servers, the software at the heart of machine learning services that operate in real time or offline.
- Five questions investors are asking about Snowflake’s IPO A short but comprehensive overview of the enterprise data warehouse market.
- One Simple Chart: Technology Adoption in the U.S. Based on a massive survey from the second half of 2018, this study shows that we are still in the very early stages of adoption of machine learning and related technologies.
- 2020 AI Whitepaper from Tencent Meanwhile in China … This paper describes how Chinese technology giant Tencent views AI: “we believe that artificial intelligence is entering the stage of integration of technology and industry”. They use the term “ubiquitous intelligence” to describe how AI capabilities will be used across entire industries and domains. [Summary here.]
- Ignore the minor version number Neil Conway lists the many new cool features in the new release of the Determined Training Platform for deep learning.
Virtual Conferences
- NLP Summit The preliminary program is out! Featured speakers include Clément Delangue (CEO at Hugging Face), Piero Molino (creator of Ludwig), Dirk Groeneveld (of AllenNLP), Joel Grus, Kira Radinsky, Amy Heineike, and more. Paco Nathan will give a keynote on the results of our NLP Industry Survey. Marco Tulio Ribeiro (of Microsoft Research) will give a talk on a recent project that won a Best Paper Award at ACL 2020.
- Applications of RL to business process simulation, automation, and optimization A great overview by Max Pumperla, engineer at Pathmind and maintainer of Hyperopt.
- Comprehensive Review of Deep Reinforcement Learning Methods and Applications in Economics This 2020 survey paper lists an amazing number of use cases in stock pricing and investing, insurance, auctions, banking and online markets, macroeconomics, and financial risk management. The upcoming Ray Summit (a free, virtual conference) has numerous sessions from financial services companies, including a keynote by Manuela Veloso (Head of J.P. Morgan AI Research). Manuela will describe how they use RL in electronic trading models.
Work and Hiring
- Interviewing Programmers “Accept that there are different perspectives on competence, and evaluate candidates based on their criteria.”
- Why OKRs (Objectives and Key Results) may not be for your company
- What I Learned from Doing 60+ Technical Interviews in 30 Days
- Parallels between culinary school & coding bootcamps
Recommendations
- A Thousand Cuts This brilliant new documentary about social media disinformation centers on events in the Philippines and attacks against award-winning journalist Maria Ressa and her team at Rappler. [Bonus: Official theme song by Ruby Ibarra]
- Calling Bullshit: The Art of Skepticism in a Data-Driven World A timely book in an era of increasingly sophisticated disinformation and gaslighting. As the authors describe it, “New-school bullshit uses the language of math and science and statistics to create the impression of rigor and accuracy.” This is a must-read for those who have to attend meetings at large organizations.
- Question → NLP → SQL Previous natural-language database query tools seem to work well only during demos. Hopefully this GPT-3-based demo is the start of something better.
- Bias in machine learning … speech recognition edition A new PNAS paper analyzed five state-of-the-art automatic speech recognition models (from Amazon, Apple, Google, IBM, and Microsoft) and found all of them exhibited substantial racial disparities.
- Most popular YouTuber in each country A set of graphics that lists top YouTube personalities and estimates their annual take (millions of dollars each year for the most popular personalities).