🧹 Data Cleaning & Transformation Using Power Query in Power BI

Data Cleaning in Power BI Using Power Query – Beginner’s Guide

Raw data is rarely clean. Missing values, duplicate records, incorrect data types, and inconsistent formats can ruin your analysis and dashboards.

That’s where Power Query comes in.

Power Query is Power BI’s data preparation engine that allows you to clean, transform, and shape data before it reaches your data model.

Cleaning in PowerBi


🔍 What Is Power Query?

Power Query is a no-code / low-code ETL tool inside Power BI used for:

  • Data extraction
  • Data transformation
  • Data loading (ETL)

Open Power Query using:

Home → Transform Data

Every action is recorded as an Applied Step, making transformations repeatable and auditable.


🗑️ Removing Duplicates in Power Query

Duplicate records often occur due to:

  • Multiple data sources
  • System errors
  • Manual data entry

🔹 How to Remove Duplicates

  1. Select the column(s)
  2. Go to Home → Remove Rows → Remove Duplicates

Power Query keeps the first occurrence and removes others.

Best Practice:

  • Remove duplicates before creating relationships
  • Apply on unique keys (Order ID, Customer ID)

🚫 Removing Null or Blank Values

Null values can break calculations and visuals.

🔹 Remove Null Rows

  1. Select the column
  2. Click the filter dropdown
  3. Uncheck (null)

🔹 Replace Null Values

  • Go to Transform → Replace Values
  • Replace nulls with:
    • 0 for numeric fields
    • Unknown for text fields

Tip: Never blindly remove nulls—understand why they exist.


🔢 Changing Data Types (Critical Step)

Incorrect data types lead to:

  • Wrong aggregations
  • Broken DAX formulas
  • Visual errors

🔹 Common Data Types

Data Recommended Type
Sales Amount Decimal / Whole Number
Order Date Date
Customer Name Text
IsActive True / False

🔹 How to Change Data Type

  1. Select the column
  2. Choose Data Type from the toolbar
  3. Select the correct type

Always fix data types in Power Query, not in the data model.


🔗 Merging Tables in Power Query (SQL JOIN)

Merging combines tables horizontally using a common key.

🔹 Example Use Cases

  • Sales table + Customer table
  • Orders + Product details

🔹 Steps to Merge Tables

  1. Home → Merge Queries
  2. Select primary & secondary tables
  3. Select matching columns
  4. Choose join type:
    • Left Outer (most common)
    • Inner
    • Full Outer
  5. Expand required columns

Power Query supports SQL-style joins without writing SQL.


➕ Appending Tables (Stacking Data)

Appending combines tables vertically.

🔹 Example Use Cases

  • Monthly sales files
  • Year-wise data
  • Multiple region datasets

🔹 Steps to Append Queries

  1. Home → Append Queries
  2. Select:
    • Two tables
    • Three or more tables
  3. Load combined table

Ensure:

  • Same column names
  • Same data types

🧱 Building Clean, Analysis-Ready Datasets

✅ Data Quality Checklist

  • No duplicate primary keys
  • Correct data types
  • Consistent column naming
  • No unnecessary columns
  • Clean date tables

📌 Recommended Flow

Raw Data → Power Query Cleaning → Data Model → DAX → Visuals

Clean data results in faster performance, simpler DAX, and reliable dashboards.


🚀 Power Query Best Practices

  • Clean data once, reuse everywhere
  • Rename queries meaningfully
  • Disable load for staging queries
  • Avoid unnecessary steps
  • Maintain query folding where possible

🔮 What’s Next?

👉 Part 5: Creating Your First Power BI Report

  • Adding visuals
  • Filters & slicers
  • Dashboard formatting
  • Publishing reports

Published by: DataHark.in

Post a Comment

0 Comments