Exploring Data Scientist Jobs using Python and Pandas

As a Data Scientist, I believe it is important to stay up to date by regularly deriving insights about the state of the Data Scientist job market. This post explores that market through four questions.

The dataset used for this project was downloaded from Kaggle. The goal of the project was to conduct Exploratory Data Analysis in order to identify valuable insights. Note that the data pertains specifically to the US market. Even though it doesn't apply directly to my case, I was still curious to learn about Data Science jobs in the US.

The code developed for the project, along with the data cleaning process performed, can be viewed here.

Question 1: Which terms are most commonly used to advertise jobs for Data Scientists?

Companies use three distinct terms in their job postings: Data Scientist, Data Engineer, and Data Analyst. Even though these terms are often used interchangeably, the actual work done is quite different.
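
As a rough illustration of how the three terms could be counted with Pandas, here is a minimal sketch. The `job_title` column name, the toy rows and the `label_title` helper are assumptions made for this example and are not taken from the project code.

```python
import pandas as pd

# Toy stand-in for the job postings (illustrative rows only).
# The column name "job_title" is an assumption; the real dataset may differ.
jobs = pd.DataFrame({
    "job_title": [
        "Senior Data Scientist",
        "Data Engineer II",
        "Data Analyst - Marketing",
        "Data Scientist, NLP",
        "Machine Learning Engineer",
    ]
})

TERMS = ["Data Scientist", "Data Engineer", "Data Analyst"]


def label_title(title: str) -> str:
    """Return the first term found in the title (case-insensitive), else 'Other'."""
    lowered = title.lower()
    for term in TERMS:
        if term.lower() in lowered:
            return term
    return "Other"


# Tag every posting with one of the three terms, then count postings per term.
jobs["role"] = jobs["job_title"].apply(label_title)
role_counts = jobs["role"].value_counts()
print(role_counts)
```

Titles that mention none of the three terms fall into an "Other" bucket, so they can be inspected separately instead of being silently dropped.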

So what do a Data Analyst, a Data Scientist and a Data Engineer actually do?

Data Analyst: Collects and processes data, and performs analysis to help companies optimize their decision-making process.

Data Scientist: Uses data analysis, statistical methods and machine learning to analyse and interpret complex data.

Data Engineer: Builds and maintains data systems and data infrastructure, including databases, data pipelines and data warehouses.

A more detailed post about the differences between these three roles, written by Anjali Ariscrisnã and Pedro Coelho, can be found here.

Question 2: Which sectors use data science to analyse their data?

Information Technology is the number one sector using Data Science in this dataset, followed by Business Services, Biotech & Pharmaceuticals and Finance. Businesses now have to process huge amounts of data, which drives the demand for analytical skills.
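
A ranking like this can be produced with a single `value_counts` call. The sketch below assumes a `sector` column and uses made-up rows. The `top_sectors` helper is hypothetical and only illustrates the idea.

```python
import pandas as pd

# Illustrative rows only; "sector" is an assumed column name.
postings = pd.DataFrame({
    "sector": [
        "Information Technology",
        "Finance",
        "Information Technology",
        "Business Services",
        None,  # a posting with no sector recorded
        "Information Technology",
        "Business Services",
    ]
})


def top_sectors(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Count postings per sector and keep the n most frequent.

    value_counts sorts in descending order and skips missing values by default.
    """
    return df["sector"].value_counts().head(n)


sector_counts = top_sectors(postings, n=3)
print(sector_counts)
```

Because missing sectors are skipped by default, it is worth checking how many postings lack a sector before reading too much into the ranking.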

Question 3: Which companies have the most data science jobs?

Apple, IBM and Amazon are the top three companies by number of advertised jobs related to Data Science. All three companies are similar in revenue and size (number of employees); however, the salary range at Apple is much wider than at IBM and Amazon.
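
One possible way to compare companies by number of postings and by salary range is a grouped aggregation. In this sketch the company names, the numeric `min_salary` and `max_salary` columns and all values are placeholders invented for illustration. The real dataset may store salaries differently.

```python
import pandas as pd

# Placeholder data: company names and salary figures are invented.
# Numeric "min_salary" / "max_salary" columns are an assumption.
postings = pd.DataFrame({
    "company": ["Company A", "Company A", "Company A",
                "Company B", "Company B", "Company C"],
    "min_salary": [90, 70, 110, 95, 100, 80],
    "max_salary": [150, 120, 200, 130, 140, 125],
})

# One row per company: number of postings plus the overall salary bounds.
summary = postings.groupby("company").agg(
    n_postings=("min_salary", "count"),   # counts non-missing salary entries
    lowest_salary=("min_salary", "min"),
    highest_salary=("max_salary", "max"),
)

# Width of the advertised salary range for each company.
summary["salary_spread"] = summary["highest_salary"] - summary["lowest_salary"]

# Keep the companies with the most postings.
top_companies = summary.sort_values("n_postings", ascending=False).head(3)
print(top_companies)
```

The `salary_spread` column makes it easy to see which company has the widest salary range.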

Question 4: How is revenue influenced by the location and sector of the company?

The heatmap shows the average company revenue by location and sector. Based on the results above, sectors such as Information Technology, Finance and Business Services were expected to have greater revenue than other sectors. The heatmap shows, however, that the same sector can have different revenues in different locations. Companies should therefore consider both their sector and their location, since these two factors might influence their revenue.
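
The table behind a heatmap like this can be built with `pivot_table`. This is only a sketch under stated assumptions: it uses a numeric `revenue` column and invented values, whereas the real dataset may need its revenue field converted to numbers first.

```python
import pandas as pd

# Invented values for illustration; a numeric "revenue" column is an assumption.
companies = pd.DataFrame({
    "location": ["CA", "CA", "NY", "NY", "CA", "TX"],
    "sector": [
        "Information Technology", "Finance", "Finance",
        "Finance", "Information Technology", "Information Technology",
    ],
    "revenue": [100.0, 40.0, 80.0, 60.0, 200.0, 50.0],
})

# Rows are sectors, columns are locations, cells hold the mean revenue.
# Sector/location pairs with no data become NaN.
revenue_table = companies.pivot_table(
    index="sector",
    columns="location",
    values="revenue",
    aggfunc="mean",
)
print(revenue_table)
```

A table in this shape can be passed to a plotting function such as `seaborn.heatmap`. Empty cells then show which sector and location combinations have no postings at all.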

Again, the code can be found in my GitHub repo.

Figures:
Question 1: Top10_DataScience_Jobs.png
Question 2: Top10_DataScience_Sectors.png
Question 3: Top10_Companies.png
Question 4: heatmap_revenue_location_sector.png