There are many fantastic talks at Strata, and navigating the schedule can be overwhelming. I plan to list the talks I’m hoping to catch in a series of “time-turner” posts (check this blog on Wed/Thu at 10 a.m.). For now, let me highlight talks from a few categories:
Graphs and Network Analysis:
- Large-scale Machine Learning Cookbook using GraphLab (a 3-hour tutorial led by Carlos Guestrin)
- AMP Camp 4 will include a presentation on, and hands-on training with, GraphX (a new graph processing and analytics tool built on top of Spark)
- Adaptive Adversaries: Building Systems to Fight Fraud and Cyber Intruders
- Network Science Made Simple: SNA for Pie Chart Makers
- Friending Graph Analytics: Large-Scale Graph Processing Made Easy
- Graph All The Things! 11 Graph Database Use Cases That Aren’t Social
- Graph Analysis with One Trillion Edges on Apache Giraph
- Socializing Search. Professionally.
Time-series:
- David Andrzejewski is hosting an interesting series of talks on Machine Data
- How Twitter Monitors Millions of Time-series
- Working With Time Series Data Using Apache Cassandra
Data visualization:
- We have sessions by people who built some of the most popular visualization tools data scientists have come to rely on (ggplot2, d3.js, Superconductor, and Processing): see my O’Reilly Data post for details.
- Information Visualization for Large-Scale Data Workflows
- Minority Report Meets Big Data: Touch and Interactive Big Data is Here
- Napoleon’s March to d3.js: The Future of Big, Real-time Interactive Data Visualization
- Unlocking the Secrets of Gertrude Stein
Crowdsourcing tips for Data Scientists:
- Crowdsourcing at Locu: How I Learned to Stop Worrying and Love the Crowd
- Organizing Big Data with the Crowd
PyData:
- Luminaries from the PyData community will also be presenting at Strata. Fortunately, Brian Granger has already written a great post that highlights the PyData talks at Santa Clara.