For this part, you will need to install a few packages on your system to be able to visualize data.
Let’s install these:
-Jupyter Notebook
-pandas
-matplotlib
-seaborn
After installing the dependencies, it is time to run some code in a Jupyter notebook.
Datasets are usually stored as CSV files (CSV stands for Comma-Separated Values), so loading a CSV file is one of the first steps of data visualization. The example that we have here is a dataset of historical FIFA rankings for six countries: Argentina (ARG), Brazil (BRA), Spain (ESP), France (FRA), Germany (GER), and Italy (ITA).

Here, we know that the first column of our data consists of dates, so we use that column to label the rows (index_col="Date").
Furthermore, we want these row labels to be recognized as dates rather than plain text (that is why we have parse_dates=True).
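As a rough sketch of the loading step described above: the snippet below reads a tiny in-memory CSV instead of the real file so that it runs anywhere. The rows are made-up placeholder values (not real FIFA rankings), and the file name mentioned in the comment is hypothetical; only the column names and the two read_csv arguments come from the text.

```python
import io

import pandas as pd

# Stand-in for the real CSV file. These rows are made-up placeholders,
# NOT actual FIFA rankings; only the column layout follows the text.
csv_text = """Date,ARG,BRA,ESP,FRA,GER,ITA
2000-01-01,1,2,3,4,5,6
2000-02-01,2,1,3,5,4,6
2000-03-01,2,1,4,3,5,6
"""

# With a real file you would pass its path instead, for example
# pd.read_csv("fifa.csv", ...) (hypothetical file name).
fifa_data = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",   # use the Date column as the row labels
    parse_dates=True,   # turn those row labels into real dates
)

print(fifa_data.index.dtype)
```

If the parsing worked, the index is a DatetimeIndex rather than a list of plain strings, which is what lets the plots place the dates correctly on the x-axis.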
Examining the data
It is always good to glance at the dataset to make sure it loaded properly.
For this part, it is enough to print the first five rows of the dataset (by using head()) and the last five rows (by using tail()).
Each can be done with one line of code, as follows:
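A minimal sketch of that check, assuming a small stand-in DataFrame with placeholder numbers (the real dataset is not reproduced here):

```python
import pandas as pd

# Placeholder table with 8 rows so that head() and tail() differ.
# The values are made up for illustration only.
dates = pd.date_range("2000-01-01", periods=8, freq="MS")
fifa_data = pd.DataFrame(
    {"ARG": range(1, 9), "BRA": range(8, 0, -1)},
    index=dates,
)
fifa_data.index.name = "Date"

first_rows = fifa_data.head()  # first five rows by default
last_rows = fifa_data.tail()   # last five rows by default

print(first_rows)
print(last_rows)
```

Both functions accept a number if you want more or fewer rows, for example head(10).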

Now that the data is loaded, it is time for the actual visualization.
Plot the data
There are different types of charts:
Trends (pattern changes):
-Line charts are best to show trends over a period of time, and multiple lines can be used to show trends in more than one group. (sns.lineplot)
Relationship (relationships between variables):
-Bar charts are useful for comparing quantities corresponding to different groups. (sns.barplot)
-Heatmaps can be used to find color-coded patterns in tables of numbers. (sns.heatmap)
-Scatter plots show the relationship between two continuous variables; if color-coded, we can also show the relationship with a third categorical variable. (sns.scatterplot)
-Including a regression line in the scatter plot makes it easier to see any linear relationship between two variables. (sns.regplot)
-Multiple regression lines can be drawn when the scatter plot contains multiple, color-coded groups. (sns.lmplot)
-Categorical scatter plots show the relationship between a continuous variable and a categorical variable. (sns.swarmplot)
Distribution (the possible values that we can expect to see in a variable):
-Histograms show the distribution of a single numerical variable. (sns.histplot; older seaborn versions used sns.distplot, which is now deprecated)
-KDE or 2D KDE plots show an estimated, smooth distribution of one or two numerical variables. (sns.kdeplot)
-Joint plots simultaneously display a 2D KDE plot with the corresponding KDE plots for each individual variable. (sns.jointplot)
As listed above, the seaborn package has a different function for each type of chart. We are not going to go through all of them, only some.
We start with lineplot(), which generates a line chart.

Furthermore, we can modify the size of the figure or the title of the chart. Below, you can see an example with a different dataset (from Spotify).

Barchart-Plot:
Let’s plot the bar chart with a dataset from the US Department of Transportation that tracks flight delays.
For reference, this is the CSV file of our new dataset opened in Excel:

Now, we want our bar chart to show the average arrival delay for Spirit Airlines (airline code: NK) flights, by month.
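A minimal sketch of such a bar chart. It assumes the table is indexed by month and has one column per airline code; the index name "Month", the second airline column, and all delay values are placeholders made up for the example.

```python
import pandas as pd
import seaborn as sns

# Placeholder data: one row per month, one column per airline code.
months = pd.Index(range(1, 13), name="Month")
flight_data = pd.DataFrame(
    {"NK": [11, 7, 3, 4, 6, 18, 16, 12, 2, 1, 5, 13],
     "AA": [7, 8, 2, 2, 5, 9, 8, 6, 1, 0, 3, 7]},
    index=months,
)

# One bar per month (x), with the bar height taken from the NK column (y).
ax = sns.barplot(x=flight_data.index, y=flight_data["NK"])
ax.set_ylabel("Average arrival delay")
```

Only the NK column is passed as y, which is how the chart is restricted to a single airline.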

Heatmaps-Plot:
Let’s plot the heatmap of this dataset:
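A minimal sketch of a heatmap over a month-by-airline table. The layout follows the flight-delay description above, while the values and the second airline column are placeholders:

```python
import pandas as pd
import seaborn as sns

# Placeholder data: 3 months x 2 airline codes, made-up values.
flight_data = pd.DataFrame(
    {"NK": [11.0, 7.0, 3.0], "AA": [7.0, 8.0, 2.0]},
    index=pd.Index([1, 2, 3], name="Month"),
)

# Every cell is colored by its value; annot=True also writes the
# number inside each cell.
ax = sns.heatmap(data=flight_data, annot=True)
ax.set_xlabel("Airline")
```

The color bar on the right shows which colors correspond to larger or smaller values.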

Scatter-Plot:
For this part, we are going to use a dataset of insurance charges as below:

Let’s generate the scatter plot by using the scatterplot() function in the seaborn package. We specify the column for the horizontal x-axis (x=insurance_data['bmi']) and the column for the vertical y-axis (y=insurance_data['charges']).
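A minimal sketch of that call, using a handful of made-up rows in place of the insurance dataset. Only the column names bmi and charges come from the text:

```python
import pandas as pd
import seaborn as sns

# Placeholder data: made-up BMI and charge values, not the real dataset.
insurance_data = pd.DataFrame(
    {"bmi": [18.5, 22.0, 25.3, 27.9, 31.2, 35.6],
     "charges": [2100, 3500, 4200, 6100, 9800, 12500]}
)

# One point per row: bmi on the x-axis, charges on the y-axis.
ax = sns.scatterplot(x=insurance_data["bmi"], y=insurance_data["charges"])
```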

The scatter plot above suggests that Body Mass Index (BMI) and insurance charges are positively correlated: customers with higher BMI typically also tend to pay more in insurance costs.
To double-check the strength of this relationship, we can add a regression line, that is, the line that best fits the data. We do this by using the regplot() function.
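A minimal sketch of the regression-line version, again with made-up placeholder rows standing in for the insurance dataset:

```python
import pandas as pd
import seaborn as sns

# Placeholder data: made-up BMI and charge values, not the real dataset.
insurance_data = pd.DataFrame(
    {"bmi": [18.5, 22.0, 25.3, 27.9, 31.2, 35.6],
     "charges": [2100, 3500, 4200, 6100, 9800, 12500]}
)

# Same points as a scatter plot, plus a fitted straight line and a
# shaded band showing the uncertainty of that fit.
ax = sns.regplot(x=insurance_data["bmi"], y=insurance_data["charges"])
```

A line that slopes upward is consistent with a positive relationship between the two variables.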

Plot the subset of the data
In this case, we have a different dataset which is focused on five popular songs from 2017 and 2018:
- “Shape of You”, by Ed Sheeran
- “Despacito”, by Luis Fonsi
- “Something Just Like This”, by The Chainsmokers and Coldplay
- “HUMBLE.”, by Kendrick Lamar
- “Unforgettable”, by French Montana
We are going to plot a subset of the columns. We could start by printing the names of the columns.
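A minimal sketch of listing the column names. It assumes the table has one column per song, named by song title; that layout and the stream counts are assumptions made for illustration.

```python
import pandas as pd

# Placeholder data: one column per song, made-up stream counts.
dates = pd.date_range("2017-01-06", periods=3, freq="D")
spotify_data = pd.DataFrame(
    {"Shape of You": [120, 135, 150],
     "Despacito": [0, 10, 40],
     "Something Just Like This": [0, 0, 25],
     "HUMBLE.": [0, 0, 0],
     "Unforgettable": [0, 0, 0]},
    index=dates,
)

# The column names are what we use later to pick a subset to plot.
column_names = list(spotify_data.columns)
print(column_names)
```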

Now, we will generate the plot of the first two columns.


It is worth mentioning that all of the data above comes from Kaggle, a very good site where I took the course and learned a lot; here I have shared what I understood.