Data Visualization with Matplotlib | Python for Analysis Tutorial - Learn with VOKS

Data Visualization with Matplotlib


While Pandas gives you the numbers, Matplotlib gives you the story. Matplotlib is the foundational plotting library for Python. Its pyplot interface was modeled on MATLAB's plotting commands, and the library offers "low-level" control that lets you customize nearly every element of your chart, from line styles to tick marks.

In data analysis, visualization is used for two purposes: Exploration (understanding the data yourself) and Communication (showing your findings to others).


1. Intro to Matplotlib & Pyplot

To use Matplotlib, we primarily use a sub-module called pyplot. Its functions make Matplotlib behave like a "state machine": pyplot keeps track of the current figure and plotting area, and each function call acts on them.

The Standard Import:

Python


import matplotlib.pyplot as plt
import numpy as np
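
The "state machine" idea can be seen in a small sketch. It assumes a fresh Python session; plt.gcf() ("get current figure") and plt.gca() ("get current axes") are the pyplot functions that report what is currently being tracked.

```python
import matplotlib.pyplot as plt

fig = plt.figure()        # this figure becomes the "current figure"
plt.plot([1, 2, 3])       # no figure is passed: pyplot draws on the current one
ax = plt.gca()            # the axes that plot() just drew on

print(plt.gcf() is fig)   # True
print(ax in fig.axes)     # True
```

This is why the later examples never mention a figure explicitly: every plt call acts on whichever figure and axes are current.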

2. Basic Plotting & Line Styling

The plot() function draws points or lines on the current plotting area. By default, it connects consecutive points with a solid line.
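
A minimal sketch of a basic line plot follows. The xpoints and ypoints values are made up for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative sample data (not from a real dataset)
xpoints = np.array([1, 2, 6, 8])
ypoints = np.array([3, 8, 1, 10])

# plot() returns a list of Line2D objects, one per line drawn
lines = plt.plot(xpoints, ypoints)

# plt.show()  # uncomment when running as a script
```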

A. Markers

If you want to emphasize the actual data points, you use Markers.

  • plt.plot(ypoints, marker = 'o') (Circles)
  • plt.plot(ypoints, marker = '*') (Stars)
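
As a short sketch with made-up ypoints values: when only y-values are passed, plot() uses 0, 1, 2, ... as the x-values, and the marker is drawn on each data point.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative sample data
ypoints = np.array([3, 8, 1, 10])

# Only y-values are given, so x defaults to 0, 1, 2, 3
lines = plt.plot(ypoints, marker='o')

# plt.show()  # uncomment when running as a script
```

Swapping 'o' for '*' draws stars instead of circles.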

B. Line Customization

You can change the style of the line to make your charts more readable.

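  • Line Style (ls, short for linestyle): 'solid' (the default), 'dotted', 'dashed', 'dashdot', or 'None' (draws no line).
  • Color (c, short for color): Use names like 'red' or Hex codes like '#4CAF50'.
  • Line Width (lw, short for linewidth): A float value (e.g., 2.5).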
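
The three options can be combined in a single call. This is a sketch using made-up sample values.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative sample data
ypoints = np.array([3, 8, 1, 10])

# A dashed green line, 2.5 points wide
(line,) = plt.plot(ypoints, ls='dashed', c='#4CAF50', lw=2.5)

# plt.show()  # uncomment when running as a script
```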

3. Labels and Grid

A chart without labels is just a wavy line. For a professional report, you must define your context.

  • Title: plt.title("Monthly Sales Data")
  • Axis Labels: plt.xlabel("Month") and plt.ylabel("Revenue ($)")
  • Grid: plt.grid() adds a background grid to help viewers trace values accurately.
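
Put together, a labeled chart might look like the sketch below. The months and revenue figures are invented for the example.

```python
import matplotlib.pyplot as plt

# Made-up monthly revenue figures, for illustration only
months = ["Jan", "Feb", "Mar", "Apr"]
revenue = [1200, 1500, 1100, 1800]

plt.plot(months, revenue, marker='o')
plt.title("Monthly Sales Data")
plt.xlabel("Month")
plt.ylabel("Revenue ($)")
plt.grid()  # turn on the background grid

# plt.show()  # uncomment when running as a script
```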

4. Subplots: Multiple Charts in One

Sometimes you need to compare two different datasets side-by-side. The subplot() function allows you to draw multiple plots in one figure.

  • Syntax: plt.subplot(rows, columns, index)
  • Example: plt.subplot(1, 2, 1) creates a grid of 1 row and 2 columns, and selects the first plot.
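
A sketch of two plots side by side, using made-up data. The figsize value and the call to plt.tight_layout(), which adjusts spacing so titles and labels do not overlap, are optional additions for readability.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative sample data
x = np.array([0, 1, 2, 3])
y1 = np.array([3, 8, 1, 10])
y2 = np.array([10, 20, 30, 40])

plt.figure(figsize=(8, 3))  # a wider figure leaves room for two plots

plt.subplot(1, 2, 1)        # 1 row, 2 columns, select the first plot
plt.plot(x, y1)
plt.title("Dataset A")

plt.subplot(1, 2, 2)        # same grid, select the second plot
plt.plot(x, y2)
plt.title("Dataset B")

plt.tight_layout()

# plt.show()  # uncomment when running as a script
```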

5. Common Chart Types for Analysts

A. Scatter Plots

Used to observe the relationship (correlation) between two variables.

  • Function: plt.scatter(x, y)
  • Use Case: Comparing "Advertising Spend" vs. "Total Sales."
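
A minimal sketch of the advertising use case. The numbers are invented, and the variable names ad_spend and total_sales are placeholders.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up figures, for illustration only
ad_spend = np.array([10, 20, 30, 40, 50])
total_sales = np.array([25, 41, 58, 79, 102])

# Each (x, y) pair becomes one dot; no line is drawn between them
points = plt.scatter(ad_spend, total_sales)
plt.xlabel("Advertising Spend")
plt.ylabel("Total Sales")

# plt.show()  # uncomment when running as a script
```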

B. Bar Charts

Used for comparing categories.

  • Function: plt.bar(x, height) (Vertical) or plt.barh(y, width) (Horizontal). In both, the first argument holds the categories and the second holds the values.
  • Use Case: Comparing sales across different regions (North, South, East, West).
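
The sketch below draws the same made-up regional sales data both ways, so the vertical and horizontal versions can be compared directly.

```python
import matplotlib.pyplot as plt

# Made-up regional sales, for illustration only
regions = ["North", "South", "East", "West"]
sales = [250, 180, 310, 220]

plt.figure(figsize=(8, 3))

plt.subplot(1, 2, 1)
vbars = plt.bar(regions, sales)    # categories on the x-axis, values as bar heights
plt.title("Vertical")

plt.subplot(1, 2, 2)
hbars = plt.barh(regions, sales)   # categories on the y-axis, values as bar widths
plt.title("Horizontal")

plt.tight_layout()

# plt.show()  # uncomment when running as a script
```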

C. Histograms

Used to show the distribution of data (how often values fall into certain "bins").

  • Function: plt.hist(data)
  • Use Case: Seeing the age distribution of your customer base.
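
A sketch with randomly generated ages standing in for real customer data. The mean of 35, the spread of 10, and the choice of 10 bins are arbitrary assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated ages (not real customer data); the seed makes the run repeatable
rng = np.random.default_rng(seed=0)
ages = rng.normal(loc=35, scale=10, size=200)

# hist() returns the count per bin, the bin edges, and the drawn bars
counts, bin_edges, patches = plt.hist(ages, bins=10)
plt.xlabel("Age")
plt.ylabel("Number of customers")

# plt.show()  # uncomment when running as a script
```

Ten bins need eleven edges, so bin_edges has one more entry than counts.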

D. Pie Charts

Used to show proportions of a whole.

  • Function: plt.pie(data, labels=my_labels)
  • Use Case: Showing market share percentage.
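
A sketch of the market share use case with invented company names and values. The autopct argument is an optional extra that prints each slice's percentage on the chart.

```python
import matplotlib.pyplot as plt

# Made-up market shares, for illustration only
shares = [45, 30, 15, 10]
my_labels = ["Company A", "Company B", "Company C", "Others"]

# Slice sizes are proportional to each value's share of the total
plt.pie(shares, labels=my_labels, autopct='%1.0f%%')
plt.title("Market Share")

# plt.show()  # uncomment when running as a script
```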

6. The "Golden Rule" of Visualization

Before you plot, ask yourself: What is the question I am trying to answer?

  1. Trends over time? Use a Line Chart.
  2. Comparison between groups? Use a Bar Chart.
  3. Relationship between two numbers? Use a Scatter Plot.
  4. Distribution of a single variable? Use a Histogram.

