Sunday, February 12, 2012

DATA, DATA EVERYWHERE

There's an interesting article in the Sunday New York Times today, purportedly on Big Data.  The piece doesn't really define Big Data - there's no mention of the Volume, Velocity, or Variety attributes that have become the "standard" measures of Big Data today.  Rather, the article focuses on the practice of data science, and on how analytics is an ever-increasing requirement for innovation and competition in today's market.  What the article highlights is the growing need for the competent practice of data science - running "Big Analytics" on the ever-growing number of Big Data sets available.  The McKinsey study referenced in the piece provides a great perspective on what its authors refer to as Deep Analytical Talent.

The implications of this shift (let's assume it is a shift) are numerous - but one is that the emerging practice and community of data science will require a new set of tools and capabilities.  I read a great article recently (if someone can point me to the link, I can reference it appropriately) that compared the emerging data science practice to the emergence of computer science several decades ago.  Before there were classes, tools, etc. for computer scientists, we had electrical engineers, applied mathematicians, and people from other quantitative fields contributing to the emerging field of computer science.  Today you can major in computer science, and there are software packages built exclusively for computer scientists (IDEs, for example) - it has simply become part of our standard nomenclature.  The same thing is happening with data science, and with what we refer to at Greenplum as Big Analytics - a set of tools and capabilities is emerging (and still needs to be developed) that enables the world's data scientists to do their jobs better, faster, and with bigger and bigger data.
