Alibaba ♥ Spark: Next time someone asks you whether Apache Spark scales, point them to this recent post by Chinese e-commerce juggernaut Alibaba. What particularly caught my eye is the company’s heavy use of GraphX, Spark’s library for graph analytics.
[Full disclosure: I’m an advisor to Databricks, a startup commercializing Apache Spark.]
Narrative Recommendations: When NarrativeScience started out, I thought of it primarily as a platform for generating short, factual stories for (hyperlocal) news services (a newer startup, OnlyBoth, seems to be focused on this; its working example is the use of “box scores” to cover “college” teams). More recently, NarrativeScience has aimed its technology at the lucrative Business Intelligence market. Starting from structured data, NarrativeScience extracts and ranks facts, then weaves them into a narrative arc that analysts consume. The company retains the traditional elements of BI tools (tables, charts, dashboards) and supplements them with narrative summaries and recommendations. I like the concept of adding narrative outputs, and as with all relatively new technologies, the algorithms and accompanying user interfaces are bound to get better over time. The technology is largely “language” agnostic, but to reap maximum benefit it needs to be tuned for the specific domain you want to use it in.
With spreadsheets, you have to calculate. With visualizations, you have to interpret. With narratives, all you have to do is read.
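To make the "extract facts, rank them, narrate the top ones" idea concrete, here is a toy sketch in Python. It is not NarrativeScience's actual method, which isn't described here. The `Fact` class, the function names, the sample `ROWS`, and the ranking rule (largest absolute percentage change first) are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    """One candidate fact: a metric's change for a subject (hypothetical shape)."""
    subject: str
    metric: str
    previous: float
    current: float

    @property
    def pct_change(self) -> float:
        return (self.current - self.previous) / self.previous * 100.0


def extract_facts(rows, metric):
    """Turn structured rows into facts, skipping rows with no baseline value."""
    return [
        Fact(r["region"], metric, r["previous"], r["current"])
        for r in rows
        if r["previous"]
    ]


def rank_facts(facts):
    """Assumed ranking rule: biggest absolute percentage change first."""
    return sorted(facts, key=lambda f: abs(f.pct_change), reverse=True)


def narrate(facts, top_n=2):
    """Render the top-ranked facts as short template sentences."""
    sentences = []
    for f in facts[:top_n]:
        direction = "rose" if f.pct_change >= 0 else "fell"
        sentences.append(
            f"{f.metric.capitalize()} in {f.subject} {direction} "
            f"{abs(f.pct_change):.1f}% to {f.current:,.0f}."
        )
    return " ".join(sentences)


# Made-up sample data standing in for a BI table.
ROWS = [
    {"region": "East", "previous": 100.0, "current": 150.0},
    {"region": "West", "previous": 200.0, "current": 190.0},
    {"region": "North", "previous": 80.0, "current": 60.0},
]

summary = narrate(rank_facts(extract_facts(ROWS, "sales")))
print(summary)
```

Even in a sketch this small, the domain tuning mentioned above shows up in two places: the ranking rule decides what counts as interesting, and the sentence templates decide how it reads. Both would likely need to change from one domain to the next.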