Combining and Merging DataFrames with Pandas

pandas
dataframe
merge
concat
data-manipulation
Learn how to combine and merge Pandas DataFrames using concat() and merge(). This guide covers row-wise and column-wise concatenation, and database-style joins.
Author

Mohammed Adil Siraju

Published

September 19, 2025

This notebook is a guide to combining and merging DataFrames in Pandas, two fundamental operations for data wrangling. We’ll explore: - pd.concat(): For stacking DataFrames vertically or horizontally. - pd.merge(): For performing database-style joins on DataFrames.

1. Setting Up Sample DataFrames

First, let’s import Pandas and create a few sample DataFrames to work with.

import pandas as pd

data1 = {
    'A': [1,2,3],
    'B': [4,5,6]
}

data2 = {
    'A': [7,8,9],   
    'B': [10,11,12]
}

df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
df1
A B
0 1 4
1 2 5
2 3 6
df2
A B
0 7 10
1 8 11
2 9 12

2. Concatenating DataFrames with pd.concat()

Concatenation is like stacking DataFrames on top of each other (row-wise) or side-by-side (column-wise).

Default Concatenation (Row-wise)

By default, pd.concat() stacks DataFrames vertically. This is useful when you have DataFrames with the same columns.

pd.concat([df1, df2])
A B
0 1 4
1 2 5
2 3 6
0 7 10
1 8 11
2 9 12

Column-wise Concatenation

You can also concatenate side-by-side by setting axis=1. This is useful for adding new columns.

pd.concat([df1, df2], axis=1)
A B A B
0 1 4 7 10
1 2 5 8 11
2 3 6 9 12

3. Merging DataFrames with pd.merge()

Merging is used for database-style joins. It combines DataFrames based on a common column (a “key”).

Default Merge (Inner Join)

pd.merge() performs an inner join by default, combining rows that have matching values in the specified on column.

data3 = {
    'A': [1,2,3],   
    'C': [13,14,15]
}

df3 = pd.DataFrame(data3)

df3
A C
0 1 13
1 2 14
2 3 15
df1
A B
0 1 4
1 2 5
2 3 6
pd.merge(df1, df3, on='A')
A B C
0 1 4 13
1 2 5 14
2 3 6 15

Summary

In this notebook, you learned the key differences between concat and merge:

  • pd.concat(): Best for stacking DataFrames.
    • axis=0 (default): Stacks vertically (appends rows).
    • axis=1: Stacks horizontally (appends columns).
  • pd.merge(): Best for database-style joins based on common columns.
    • Use the on parameter to specify the key to join on.
    • Supports inner, outer, left, and right joins (though we only covered the default inner join here).

Understanding when to use each is crucial for effective data manipulation in Pandas. Happy coding! 🚀