The two faces of data scientists: Why sorting out data science is confusing everyone - Part 1
October 21, 2013
Data science is the hot new buzzword in technology and business, and data scientist is the hot new profession, but the love affair is starting to cool. Right now, data science jobs are going unfilled because applicants lack adequate skills, and universities have introduced a myriad of “data science programs” with a great deal of variation in their course offerings, creating a pool of professionals with very inconsistent skill sets. Worse still, data science as a whole is starting to fall out of favor and risks becoming a passing fad.
And the disillusionment with data science is understandable: The field is having a serious identity crisis and no one knows where to start fixing it.
By and large, our perception of what a data scientist is gets defined by what we want them to do or by the industries that drive demand. I think anyone who has followed current events in Big Data and Analytics will notice that explanations of what a data scientist is or does are dubious at best, and that the requisite skill set is a massive hodgepodge of scientific disciplines, industry experience and broad, subjective personality traits like “imagination” or “ingenuity”.
But this is not how science, REAL science, is practiced.
Previously, I blogged about why viewing the makeup of the data science discipline through the lens of hard science first is essential to understanding data science (check it out here). Once you start treating data science academically, understanding the data scientist will be much easier.
Scientific progress is made possible through the seamless functioning of a scientific continuum: theoretical scientists create and confirm theories to explain some given phenomenon, which allows applied scientists to refine those ideas into functional models that can then be funneled to engineers, technicians and developers for public use.
Basically, scientists fall into two broad categories: ones who DO stuff and ones who THINK UP stuff. It is the line between theoretical scientific investigation and applied scientific R&D.
In most fields of scientific study, there are two distinct arms: the theoretical and the applied. This separation MUST exist because, while both require specific expertise and hard work, their fundamental viewpoints and goals are entirely different.
And while I’m not saying that there should be two different job roles, I think data science should be considered as consisting of two tracks: two related but distinct arms that push data science forward through two main scientific lenses, theoretical data science and applied/experimental data science.
The theoretical scientist possesses broad knowledge of the essentials of their discipline and looks to create theories or models that exist in a hypothetical realm in order to understand some mysterious, unexplained phenomenon.
These are the imaginative pioneers who create new ideas of how our universe operates much like composers will create a symphony: a combination of skill, exploration and imagination.
The applied scientist then has the job of applying those models and processing them into usable knowledge, as well as formulating the real-world scientific basis on which future engineers can play and invent.
Applied scientists take new theories and put them into practice, designing experiments to identify innovative areas of application.
These are the specialists, if you will: the experimental physicist with years of research in lasers whom you hire to design your new laser cutting tool.
Applied scientists are just as brilliant but are focused on refining more than expanding.
It should also be noted that at the end of the chain (but equally important) are the engineers and technicians, who ultimately bring their skill in design and production to the table to make that new data science tool, program, etc.
So at the heart of all the confusion is this: With Data Science making such a rapid and huge burst onto the map, everyone is trying to capture its practice under one broad and clumsy umbrella.
Just as in more traditional scientific fields, Data Science needs to acknowledge that there are different modes within a scientific discipline that work together to propel the field forward, and that those modes call for different skill sets. The skill sets you should look for in a data scientist will differ depending on what your objectives are and on how the data scientist will fit into your plan for achieving them. Whereas a theoretical data scientist will be able to model your data for you or give you new insights into how to manage data, you are going to want to bring in an applied data scientist to manage those insights and process them into real tools and strategies.
Data Science isn’t a foundational science yet, but it can be. At present, our data sets keep growing exponentially in size, and that much data can hide patterns, behaviors and insights that may revolutionize our world. But let’s refocus our efforts so we find the best patterns. To do that, the right data scientists need to be put in the right place at the right time.
Image courtesy of Master Isolated Images / freedigitalphotos.net