Cleaning Data | Power Bi Tutorial - Learn with VOKS
Back Next

Cleaning Data


Cleaning data is one of the most important steps in data analysis. If your data is messy, your reports and dashboards will be wrong or misleading.

In Power BI, data cleaning mainly happens inside Power Query Editor, which is part of Microsoft Power BI.


1️⃣ What is Data Cleaning?

Data cleaning means:

  • Fixing errors
  • Removing duplicates
  • Filling in missing values
  • Standardizing formats (dates, numbers, text)
  • Removing unnecessary columns
  • Transforming data into a usable format

Think of it like washing vegetables before cooking.

You clean the data before analyzing it.


2️⃣ Where Do You Clean Data in Power BI?

You clean data inside Power Query Editor.

To open it:

  1. Open Power BI Desktop
  2. Click Transform Data
  3. Power Query Editor opens


3️⃣ Example of Raw (Dirty) Data

Here’s an example of messy sales data:


| OrderID | Customer Name | Date       | Amount  | Country     |
|---------|--------------|-----------|---------|------------|
| 1001    | john doe     | 1/2/23    | 200     | usa        |
| 1002    | Jane Smith   | 02-01-23  | 350     | USA        |
| 1003    | JOHN DOE     |           | 400     | United States |
| 1004    | Alice Brown  | 2023/01/04|         | U.S.A      |
| 1001    | john doe     | 1/2/23    | 200     | usa        |

Problems in This Data:

  • Duplicate row (OrderID 1001)
  • Different date formats
  • Missing values (Date, Amount)
  • Inconsistent country names (usa, USA, U.S.A)
  • Inconsistent text capitalization


4️⃣ Common Data Cleaning Steps in Power BI

✅ Step 1: Remove Duplicates

Why?

Duplicates can cause incorrect totals.

How?

  1. Select the column (e.g., OrderID)
  2. Click Remove Rows → Remove Duplicates

M Code Example:


= Table.Distinct(Source, {"OrderID"})

✅ Step 2: Handle Missing Values

Missing values appear as null.

Option A: Remove rows with nulls


= Table.SelectRows(Source, each [Amount] <> null)

Option B: Replace null with 0


= Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"Amount"})

✅ Step 3: Change Data Types

Power BI must know if something is:

  • Text
  • Number
  • Date
  • Currency

Fix Date Column

Select column → Change Type → Date

M Code:


= Table.TransformColumnTypes(Source, {{"Date", type date}})

✅ Step 4: Standardize Text (Fix Capitalization)

Fix inconsistent names like:

  • john doe
  • JOHN DOE

Capitalize Each Word:


= Table.TransformColumns(Source, {{"Customer Name", Text.Proper}})

✅ Step 5: Replace Values (Standardize Country Names)

Replace:

  • usa
  • U.S.A
  • United States

With:

  • USA

M Code:


= Table.ReplaceValue(Source,"United States","USA",Replacer.ReplaceText,{"Country"})

✅ Step 6: Trim and Clean Text

Sometimes data has extra spaces:

Example:

  • " John Doe "

Remove Extra Spaces:


= Table.TransformColumns(Source, {{"Customer Name", Text.Trim}})

✅ Step 7: Split Columns

If you have:


| Full Name     |
|--------------|
| John Doe     |
| Jane Smith   |

You can split into:


| First Name | Last Name |
|------------|-----------|
| John       | Doe       |
| Jane       | Smith     |

M Code:


= Table.SplitColumn(Source, "Full Name", Splitter.SplitTextByDelimiter(" "), {"First Name", "Last Name"})

✅ Step 8: Rename Columns

Rename unclear columns like:


| CustNm |

To:


| Customer Name |

M Code:


= Table.RenameColumns(Source, {{"CustNm", "Customer Name"}})


5️⃣ Cleaned Data Result

After cleaning:


| OrderID | Customer Name | Date       | Amount | Country |
|---------|--------------|------------|--------|---------|
| 1001    | John Doe     | 2023-01-02 | 200    | USA     |
| 1002    | Jane Smith   | 2023-01-02 | 350    | USA     |
| 1003    | John Doe     | 2023-01-03 | 400    | USA     |
| 1004    | Alice Brown  | 2023-01-04 | 0      | USA     |

Now the data is:

  • Consistent
  • Complete
  • Accurate
  • Ready for analysis


6️⃣ Understanding M Code (Power Query Language)

Power Query uses a language called M.

Each step you perform in Power BI creates M code automatically.

You can view it by:

Home → Advanced Editor

Example of a full cleaning process:


let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
    RemovedDuplicates = Table.Distinct(Source, {"OrderID"}),
    ReplacedNullAmount = Table.ReplaceValue(RemovedDuplicates, null, 0, Replacer.ReplaceValue, {"Amount"}),
    ChangedType = Table.TransformColumnTypes(ReplacedNullAmount, {{"Date", type date}, {"Amount", type number}}),
    ProperNames = Table.TransformColumns(ChangedType, {{"Customer Name", Text.Proper}}),
    TrimNames = Table.TransformColumns(ProperNames, {{"Customer Name", Text.Trim}}),
    StandardizedCountry = Table.ReplaceValue(TrimNames,"United States","USA",Replacer.ReplaceText,{"Country"})
in
    StandardizedCountry


7️⃣ Why Data Cleaning Is Important

If you skip cleaning:

  • Sales totals will be wrong
  • Charts will show duplicates
  • Dates won’t sort properly
  • Country analysis will split into multiple categories

Example of incorrect grouping:


| Country        | Sales |
|---------------|-------|
| USA           | 1000  |
| usa           | 500   |
| United States | 700   |

Correct version:


| Country | Sales |
|--------|-------|
| USA    | 2200  |


8️⃣ Best Practices for Beginners

✅ Always check data types

✅ Remove unnecessary columns

✅ Keep transformations simple

✅ Rename steps clearly

✅ Document your cleaning steps

✅ Check for duplicates early


FULL COMPILATION OF ALL M CODE USED

Here is all the code together:

Example Code:
let
    Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],

    // Remove Duplicates
    RemovedDuplicates = Table.Distinct(Source, {"OrderID"}),

    // Replace Null Amount with 0
    ReplacedNullAmount = Table.ReplaceValue(RemovedDuplicates, null, 0, Replacer.ReplaceValue, {"Amount"}),

    // Remove Rows with Null Amount (Optional Alternative)
    RemovedNullRows = Table.SelectRows(Source, each [Amount] <> null),

    // Change Data Types
    ChangedType = Table.TransformColumnTypes(ReplacedNullAmount, {{"Date", type date}, {"Amount", type number}}),

    // Capitalize Names
    ProperNames = Table.TransformColumns(ChangedType, {{"Customer Name", Text.Proper}}),

    // Trim Spaces
    TrimNames = Table.TransformColumns(ProperNames, {{"Customer Name", Text.Trim}}),

    // Standardize Country Name
    StandardizedCountry = Table.ReplaceValue(TrimNames,"United States","USA",Replacer.ReplaceText,{"Country"}),

    // Rename Column Example
    RenamedColumns = Table.RenameColumns(StandardizedCountry, {{"CustNm", "Customer Name"}})

in
    RenamedColumns
Power Bi
Introduction to Power BI Core Features of Power BI Loading and Opening Existing Reports Communicating Key Metrics with Cards Interactivity and Detail — Slicers and Tables Slicers Cleaning Data Power query editor; renaming and re ordering of columns, finding anomalies Field Aggregation & Data Manipulation Transforming & Formatting Columns Formatting Currency Making maps with geographic data Visualization options; dashboards or reports, tables and scatter charts, bubble charts, KPIs, guage Conditional formatting Sorting, Removing Duplicates, and Plotting in Pandas Dax in power bi, context Dax formulas, date data bars, histogram and pie charts Load and Transforming Data Dimensional modeling Facts and dimensional table modeling Breaking tables into multiple tables Finding relationships between tables
All Courses
Advance AI Bootstrap C C++ Computer Vision Content Writing CSS Cyber Security Data Analysis Deep Learning Email Marketing Excel Figma HTML Java Script Machine Learning MySQLi Node JS PHP Power Bi Python Python for AI Python for Analysis React React Native SEO SMM SQL