Cleaning data is one of the most important steps in data analysis. If your data is messy, your reports and dashboards will be wrong or misleading.
In Power BI, data cleaning mainly happens inside Power Query Editor, which is part of Microsoft Power BI.
1️⃣ What is Data Cleaning?
Data cleaning means:
Think of it like washing vegetables before cooking.
You clean the data before analyzing it.
2️⃣ Where Do You Clean Data in Power BI?
You clean data inside Power Query Editor.
To open it:
3️⃣ Example of Raw (Dirty) Data
Here’s an example of messy sales data:
| OrderID | Customer Name | Date | Amount | Country | |---------|--------------|-----------|---------|------------| | 1001 | john doe | 1/2/23 | 200 | usa | | 1002 | Jane Smith | 02-01-23 | 350 | USA | | 1003 | JOHN DOE | | 400 | United States | | 1004 | Alice Brown | 2023/01/04| | U.S.A | | 1001 | john doe | 1/2/23 | 200 | usa |
Problems in This Data:
4️⃣ Common Data Cleaning Steps in Power BI
✅ Step 1: Remove Duplicates
Why?
Duplicates can cause incorrect totals.
How?
M Code Example:
= Table.Distinct(Source, {"OrderID"})
✅ Step 2: Handle Missing Values
Missing values appear as null.
Option A: Remove rows with nulls
= Table.SelectRows(Source, each [Amount] <> null)
Option B: Replace null with 0
= Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"Amount"})
✅ Step 3: Change Data Types
Power BI must know if something is:
Fix Date Column
Select column → Change Type → Date
M Code:
= Table.TransformColumnTypes(Source, {{"Date", type date}})
✅ Step 4: Standardize Text (Fix Capitalization)
Fix inconsistent names like:
Capitalize Each Word:
= Table.TransformColumns(Source, {{"Customer Name", Text.Proper}})
✅ Step 5: Replace Values (Standardize Country Names)
Replace:
With:
M Code:
= Table.ReplaceValue(Source,"United States","USA",Replacer.ReplaceText,{"Country"})
✅ Step 6: Trim and Clean Text
Sometimes data has extra spaces:
Example:
Remove Extra Spaces:
= Table.TransformColumns(Source, {{"Customer Name", Text.Trim}})
✅ Step 7: Split Columns
If you have:
| Full Name | |--------------| | John Doe | | Jane Smith |
You can split into:
| First Name | Last Name | |------------|-----------| | John | Doe | | Jane | Smith |
M Code:
= Table.SplitColumn(Source, "Full Name", Splitter.SplitTextByDelimiter(" "), {"First Name", "Last Name"})
✅ Step 8: Rename Columns
Rename unclear columns like:
| CustNm |
To:
| Customer Name |
M Code:
= Table.RenameColumns(Source, {{"CustNm", "Customer Name"}})
5️⃣ Cleaned Data Result
After cleaning:
| OrderID | Customer Name | Date | Amount | Country | |---------|--------------|------------|--------|---------| | 1001 | John Doe | 2023-01-02 | 200 | USA | | 1002 | Jane Smith | 2023-01-02 | 350 | USA | | 1003 | John Doe | 2023-01-03 | 400 | USA | | 1004 | Alice Brown | 2023-01-04 | 0 | USA |
Now the data is:
6️⃣ Understanding M Code (Power Query Language)
Power Query uses a language called M.
Each step you perform in Power BI creates M code automatically.
You can view it by:
Home → Advanced Editor
Example of a full cleaning process:
let
Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
RemovedDuplicates = Table.Distinct(Source, {"OrderID"}),
ReplacedNullAmount = Table.ReplaceValue(RemovedDuplicates, null, 0, Replacer.ReplaceValue, {"Amount"}),
ChangedType = Table.TransformColumnTypes(ReplacedNullAmount, {{"Date", type date}, {"Amount", type number}}),
ProperNames = Table.TransformColumns(ChangedType, {{"Customer Name", Text.Proper}}),
TrimNames = Table.TransformColumns(ProperNames, {{"Customer Name", Text.Trim}}),
StandardizedCountry = Table.ReplaceValue(TrimNames,"United States","USA",Replacer.ReplaceText,{"Country"})
in
StandardizedCountry
7️⃣ Why Data Cleaning Is Important
If you skip cleaning:
Example of incorrect grouping:
| Country | Sales | |---------------|-------| | USA | 1000 | | usa | 500 | | United States | 700 |
Correct version:
| Country | Sales | |--------|-------| | USA | 2200 |
8️⃣ Best Practices for Beginners
✅ Always check data types
✅ Remove unnecessary columns
✅ Keep transformations simple
✅ Rename steps clearly
✅ Document your cleaning steps
✅ Check for duplicates early
FULL COMPILATION OF ALL M CODE USED
Here is all the code together:
let
Source = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
// Remove Duplicates
RemovedDuplicates = Table.Distinct(Source, {"OrderID"}),
// Replace Null Amount with 0
ReplacedNullAmount = Table.ReplaceValue(RemovedDuplicates, null, 0, Replacer.ReplaceValue, {"Amount"}),
// Remove Rows with Null Amount (Optional Alternative)
RemovedNullRows = Table.SelectRows(Source, each [Amount] <> null),
// Change Data Types
ChangedType = Table.TransformColumnTypes(ReplacedNullAmount, {{"Date", type date}, {"Amount", type number}}),
// Capitalize Names
ProperNames = Table.TransformColumns(ChangedType, {{"Customer Name", Text.Proper}}),
// Trim Spaces
TrimNames = Table.TransformColumns(ProperNames, {{"Customer Name", Text.Trim}}),
// Standardize Country Name
StandardizedCountry = Table.ReplaceValue(TrimNames,"United States","USA",Replacer.ReplaceText,{"Country"}),
// Rename Column Example
RenamedColumns = Table.RenameColumns(StandardizedCountry, {{"CustNm", "Customer Name"}})
in
RenamedColumns