Four Data-Driven Careers Deciphered
A succinct introduction to careers in data science.
Behind the modern, fashionable name DATA SCIENCE lies an old desire: to organise the gigantic amount of data we create, and to find better methods and engineering tools to analyse that data and use it to steer decisions in the right direction. All of this data crunching calls for a large number of people skilled in working with data, who make the process more efficient.
Data is omnipresent. We live in a time when the volume of digital data is growing rapidly. This is changing the way industries operate, be it healthcare, finance or manufacturing. It has created the need to examine the available data and draw meaningful insights so that organisations can advance and flourish. Accordingly, interest in data analysis and data science is growing across almost all sectors. While the roles in this field share some parallels, each calls for specific skill sets that a person must acquire to become adept in that domain. This brings us to four trending careers in data: Business Analyst, Data Analyst, Data Engineer, and Data Scientist.
These titles are often used interchangeably because there are no fixed definitions. I analysed a few job postings on LinkedIn to understand what most companies expect from each of these roles.
Business Analysts extract valuable information from structured and unstructured sources to explain current and future business performance. They help identify the most suitable model and provide solutions to business users. Most of their work involves defining business problems and translating analysis into data-driven business intelligence that improves business performance.
The core skills required include:
- Knowledge of statistical tools/software: SAS, Stata, SPSS, R
- A demonstrable understanding of analytical techniques, including regression, trend analysis, forecasting and A/B testing
- MS Excel mastery, including pivot tables, array functions and VBA
- Visualisation Tools like Tableau, Power BI, QlikView
- Substantial domain knowledge
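
To make one of the techniques listed above concrete, here is a minimal sketch of an A/B test evaluated as a two-proportion z-test in plain Python. The function name `two_proportion_z_test` and the visitor and conversion counts are made up for illustration. A Business Analyst would more likely reach for R, SAS or a spreadsheet, and this is only one of several ways to evaluate such a test.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal CDF, written with erf.
    normal_cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Hypothetical experiment: variant A is the current page, variant B the new one.
z, p_value = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be chance alone. Whether the difference matters to the business is a separate question, which is where domain knowledge comes in.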
A Data Analyst is essentially an entry-level Data Scientist, which probably makes it a good position in which to begin a career in data science. The basic responsibility is to interpret and transform existing data sets, look for patterns and draw conclusions. To perform these tasks, a rudimentary knowledge of data munging, data visualisation and statistics is usually expected. In addition, presenting findings to non-technical people requires the ability to simplify complex data into ad hoc reports and charts using visualisation tools.
A baseline understanding of the following core skills is required:
- Programming (Python & R)
- Applied statistical analysis
- Applied machine learning
- Data Visualization – Tableau, Power BI, QlikView
- Data Munging
- Data Collection & Processing
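
As a rough illustration of what data munging means in practice, the sketch below cleans a tiny, invented CSV extract using only Python's standard library. The column names, the sample rows and the helper names `clean_rows` and `average_by_region` are assumptions made for this example. In day-to-day work an analyst would typically use a library such as pandas instead.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Invented sample data with typical problems: stray spaces,
# inconsistent capitalisation, and missing or non-numeric values.
RAW = """region,sales
 North ,100
south,250
NORTH,
South,150
east,n/a
"""


def clean_rows(text):
    """Yield (region, sales) pairs, skipping rows whose sales value is unusable."""
    for row in csv.DictReader(io.StringIO(text)):
        region = row["region"].strip().title()
        try:
            sales = float(row["sales"])
        except (TypeError, ValueError):
            continue  # drop rows with empty or non-numeric sales
        yield region, sales


def average_by_region(text):
    """Group the cleaned rows by region and return the mean sales per region."""
    groups = defaultdict(list)
    for region, sales in clean_rows(text):
        groups[region].append(sales)
    return {region: mean(values) for region, values in groups.items()}


summary = average_by_region(RAW)
print(summary)
```

Dropping unusable rows is only one possible policy. Depending on the question being asked, filling in missing values or flagging them for review may be more appropriate.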
A Data Engineer focuses on the infrastructure that supports data-driven activities. They are responsible for maintaining, expanding and improving it whenever necessary to increase its efficiency. They are essentially the back-end workers who help Data Scientists and Analysts do their jobs effectively.
The skill set for a Data Engineer includes:
- Data Tools & Ecosystems – MapReduce, Hive, Pig, Spark, Kafka
- SQL-based technologies – MySQL, PostgreSQL
- NoSQL technologies – MongoDB, Cassandra
- Data Warehousing Solutions
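
The kind of SQL work involved can be sketched with a tiny load-and-aggregate step. To keep the example self-contained it uses SQLite through Python's built-in `sqlite3` module rather than MySQL or PostgreSQL. The table names, columns, sample events and the function `build_daily_totals` are all hypothetical.

```python
import sqlite3


def build_daily_totals(events):
    """Load raw events into a table, then derive a per-day summary table."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE events (day TEXT, user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", events)
    # The summary table is what analysts would query instead of the raw events.
    conn.execute(
        """
        CREATE TABLE daily_totals AS
        SELECT day,
               COUNT(DISTINCT user_id) AS users,
               SUM(amount) AS revenue
        FROM events
        GROUP BY day
        """
    )
    rows = conn.execute(
        "SELECT day, users, revenue FROM daily_totals ORDER BY day"
    ).fetchall()
    conn.close()
    return rows


# Invented sample events: (day, user_id, amount)
sample_events = [
    ("2024-01-01", 1, 10.0),
    ("2024-01-01", 2, 5.5),
    ("2024-01-01", 1, 4.5),
    ("2024-01-02", 3, 20.0),
]
for row in build_daily_totals(sample_events):
    print(row)
```

A production pipeline would add scheduling, error handling and a persistent database, but the pattern of turning raw records into a clean table that others can query is the same.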
DATA SCIENTIST (DESIGN, DEVELOP & DEPLOY):
A Data Scientist differs from a Data Analyst in skill set and experience. In this position, a person confronts a larger volume, velocity and variety of data. Equipped with knowledge of the tools for transforming existing data sets, they can also devise new algorithms to solve data problems more efficiently. Usually, data scientists combine collaborative, niche knowledge of the business with cutting-edge data visualisation skills. The skills required are highly variable and largely dependent on the type of business they are involved in or the problem they are trying to solve.
The toolkit for a Data Scientist comprises a few concepts in addition to those mentioned above for a Data Analyst:
- Multivariate statistics – regression, principal component analysis and clustering
- Natural Language Processing
- Computer Vision
- Prescriptive & Predictive Modelling
- Cloud Services – AWS, Google Cloud, Microsoft Azure
- Familiarity with deployment mechanisms for Machine Learning algorithms – Docker, Kubernetes, TIDAL
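
As a small taste of the predictive modelling mentioned in the list above, the following sketch fits a straight line to a handful of points using ordinary least squares. It is the single-variable case only, written in plain Python. The function names `fit_line` and `predict` and the advertising-spend figures are invented for illustration, and real projects would normally rely on an established statistics or machine learning library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Invented data: advertising spend (x) against units sold (y).
spend = [1, 2, 3, 4, 5]
units = [2.1, 3.9, 6.2, 8.1, 9.8]
slope, intercept = fit_line(spend, units)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted units at spend 6: {predict(slope, intercept, 6):.1f}")
```

The same idea extends to several input variables, which is where the multivariate methods in the list come in.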
By: Kanksha Masrani