Data scientist is a hot new term in the tech industry. You may picture a data scientist as someone who applies advanced statistical, analytical, and machine learning tools to big data. But not all data scientists are the same: they come from different backgrounds and attack problems from different angles. I would like to thank Harlan D Harris, Sean Patrick Murphy and Marck Vaisman for conducting an introspective survey of data scientists and their work. Using the survey results, they were able to identify 5 different personas of data scientists. Each type of data scientist has their own strengths and weaknesses.
Let’s go through the different groups of data scientists.
Data Businesspeople
People like Kirthika are found in large organizations or in their own start-ups. They are great at dealing with other professionals and have comprehensive knowledge of the data science process. This kind of data scientist typically has a project management role in large organizations.
Data creatives are good at statistics, programming, and big data technologies. They are a great asset for smaller companies, where flexibility is important. They are good at doing the day-to-day work of a data scientist.
A data developer’s day-to-day work involves getting data from different sources, storing it in large databases, querying those databases, and analyzing the results to derive meaningful information from them.
Data researchers come from the academic world and have a strong background in statistics. They also tend to have PhDs. Business skills are not their strength, but they are excellent analysts.
Generic Data Scientists
Generic data scientists are similar to data businesspeople but without the immense experience or the intense business focus. They are more balanced than the other four types of data scientists. They are flexible like data creatives, but with a better understanding of the business world. Generic data scientists are passionate about the field and have a T-shaped skill set.
Data science is a creative field in which a professional has to work with various other people, such as database administrators, business people, and software engineers. So a data scientist should collaborate with others to complete his or her project. A data scientist should not only have a broad range of skills, but also possess deep expertise in their area of specialization.
For example, a data scientist with a statistics background and deep skills in probability and in descriptive and inferential statistics might find value in learning some machine learning algorithms and optimization techniques. The same data scientist should also possess sufficiently broad programming, big data, and business skills.
Skills of the other 4 types of data scientists
As I said before, generic data scientists show a T-shaped skill set. So let’s see how the skills of the other 4 types of data science professionals are depicted. The two graphs below represent the skills of these 4 types.
As you can see, Data Businesspeople are most likely to have primarily Business-related skills. Data Businesspeople also have strong skill rankings in other areas, such as Statistics and ML/Big Data. Data Researchers are the most likely to have expertise in Statistics and Mathematics. Both Data Businesspeople and Data Researchers were quite unlikely to rate Programming as their highest skill. Data Creatives and Data Developers are likely to have expertise in programming and big data, but Data Creatives are stronger in statistics than Data Developers.
To learn about your strengths and weaknesses as a data scientist, take this survey and see where you fit in. If you are an aspiring data scientist, I would recommend that you first concentrate on your core strength. If you come from a computer science background, develop your skills in Python, machine learning, and big data, and become proficient in one of the lower-level languages (C, C++ or Java). If you have a business background, it is good to concentrate on statistics, business, and data visualization.
Which type of data scientist are you?
Manu Jeevan is a Big Data blogger at BigDataExaminer, where he writes about Data Science, Python and Digital Analytics.