[A version of this post appears on the O’Reilly Data blog.] I use a variety of tools for advanced analytics; most recently I’ve been using Spark (and MLlib), R, scikit-learn, and GraphLab. When I need to get something done quickly, I’ve been turning to scikit-learn for my first-pass analysis. For access to high-quality, easy-to-use, … Continue reading “Six reasons why I recommend scikit-learn”
Tag Archives: pydata
Data Scientists and Data Engineers like Python and Scala
[A version of this post appears on the O’Reilly Strata blog.] In exchange for getting personalized recommendations, many Meetup members declare topics that they’re interested in. I recently looked at the topics listed by members of a few local data Meetups that I’ve frequented. These Meetups vary in size from 600 to 2,000 total (and … Continue reading “Data Scientists and Data Engineers like Python and Scala”
Data analysis tools target non-experts
[A version of this post appears on the O’Reilly Strata blog.] A new set of tools makes it easier to do a variety of data analysis tasks. Some require no programming, while other tools make it easier to combine code, visuals, and text in the same workflow. They enable users who aren’t statisticians or data … Continue reading “Data analysis tools target non-experts”
Data Science tools: Are you “all in” or do you “mix and match”?
[A version of this post appears on the O’Reilly Strata blog.] An integrated data stack boosts productivity: As I noted in my previous post, Python programmers willing to go “all in” have Python tools to cover most of data science. Lest I be accused of oversimplification, a Python programmer still needs to commit to learning … Continue reading “Data Science tools: Are you “all in” or do you “mix and match”?”
Python data tools just keep getting better
[A version of this post appeared on the O’Reilly Strata blog.] Here are a few observations inspired by conversations I had during the just-concluded PyData conference. The Python data community is well-organized: Besides conferences (PyData, SciPy, EuroSciPy), there is a new non-profit (NumFOCUS) dedicated to supporting scientific computing and data analytics projects. The list … Continue reading “Python data tools just keep getting better”