So many good talks happen at the same time that missing out on some good sessions is unavoidable. But imagine I had a Time-Turner necklace and could actually “attend” 3 (maybe 5) sessions happening at once. Taking into account my current interests and tastes, here’s how my day would look:
11 a.m.
- Uber’s data science workbench
- The rise of real time: Apache Kafka and the streaming revolution
- A behind-the-scenes look into Spark’s API and engine evolutions
- How Microsoft predicts churn of cloud customers using deep learning and explains those predictions in an interpretable way
11:50 a.m.
- Making Structured Streaming ready for production: Updates and future directions
- Squeezing deep learning onto mobile phones
- Going real time: Creating online datasets for personalization
- Recommending 1+ billion items to 100+ million users in real time
1:50 p.m.
- Executive Briefing: An executive’s guide to understanding advanced analytics in the cloud
- Lessons from a year of supporting Apache Kafka
- Semantic natural language understanding at scale using Spark, machine-learned annotators, and deep-learned ontologies
- Spark Structured Streaming for machine learning
2:40 p.m.
- PyTorch: A flexible and intuitive framework for deep learning
- Sparklyr: An R interface for Apache Spark
- The dangers of statistical significance when studying weak effects in big data: From natural experiments to p-hacking
- Processing millions of events per second without breaking the bank
4:20 p.m.
- Spark at scale in Bing: Use cases and lessons learned
- Why the next wave of data lineage is driven by automation, visualization, and interaction
- Delivering relevant filtered news to save hours of drudgery each day for fixed-income securities analysts
- Deep learning for IT operations intelligence using open source tools
5:10 p.m.