Advanced Data Manipulation (apply, map, applymap)

pandas
dataframe
transformation
Overview of apply, map, and applymap for advanced DataFrame/Series transformations.
Author

Mohammed Adil Siraju

Published

September 21, 2025

This notebook demonstrates advanced element-wise and row/column-wise transformations in pandas using apply, map, and applymap.

Introduction

Pandas provides flexible methods to transform data:

  • Series.map(func): elementwise mapping for a Series.
  • DataFrame.apply(func, axis=...): apply a function to each column or row (each passed as a Series).
  • DataFrame.applymap(func): elementwise operation across the entire DataFrame (deprecated since pandas 2.1 in favor of DataFrame.map).

We’ll illustrate each with short examples and best-practice notes.

# Import libraries and create sample DataFrame
import pandas as pd

data = {
    'A': [1, 2, 3, 4, 5],
    'B': [10, 20, 30, 40, 50],
}

df = pd.DataFrame(data)
df
   A   B
0  1  10
1  2  20
2  3  30
3  4  40
4  5  50

Using apply

DataFrame.apply calls a function on each column (the default, axis=0) or on each row (axis=1). The function receives a Series and may return either a single value (an aggregation) or a Series (a transformation).

Use apply when your operation needs to work on an entire row/column at once (e.g., compute a statistic or combine multiple columns).

# Example: multiply each column (Series) by 2 using apply
# Note: apply receives a Series (column) by default, so multiplying the Series scales all values in that column
df_apply = df.apply(lambda col: col * 2)
df_apply
    A    B
0   2   20
1   4   40
2   6   60
3   8   80
4  10  100
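The example above returns a Series per column, so the result keeps the DataFrame's shape. As a complementary sketch, the snippet below shows the aggregation case, where the function returns one scalar per column or per row. The names col_range and row_total are illustrative and not part of the original notebook.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})

# Column-wise (default axis=0): each call receives one column as a Series
# and returns a scalar, so the result is a Series indexed by column name.
col_range = df.apply(lambda col: col.max() - col.min())

# Row-wise (axis=1): each call receives one row as a Series
# and returns a scalar, so the result is a Series indexed like the rows.
row_total = df.apply(lambda row: row.sum(), axis=1)

print(col_range)
print(row_total)
```

Both reductions also have direct vectorized forms (df.max() - df.min() and df.sum(axis=1)), which are usually preferable when they exist.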

Using map (Series)

Series.map is an elementwise operation on a Series. Use it for simple scalar transformations or to map values via a dict, a Series, or a function. Before pandas 2.1, DataFrame had no map method and applymap filled that role; from pandas 2.1 onward, DataFrame.map is the elementwise method for DataFrames and applymap is deprecated.

# Series example using map
series_data = pd.Series([1, 2, 3, 4, 5])
mapped_data = series_data.map(lambda x: x ** 2)
mapped_data
0     1
1     4
2     9
3    16
4    25
dtype: int64
# original series
series_data
0    1
1    2
2    3
3    4
4    5
dtype: int64
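Besides a function, map also accepts a dict (or a Series) as a lookup table. The sketch below assumes a hypothetical labels dict that deliberately leaves one value unmapped, to show that values missing from the dict become NaN rather than raising an error.

```python
import pandas as pd

series_data = pd.Series([1, 2, 3, 4, 5])

# Hypothetical lookup table; the key 5 is left out on purpose.
labels = {1: "low", 2: "low", 3: "mid", 4: "high"}

# Values without a matching key are mapped to NaN.
mapped = series_data.map(labels)

# One way to handle the gaps: fill them with an explicit placeholder.
filled = mapped.fillna("unknown")

print(mapped)
print(filled)
```

If unmapped values should stay unchanged rather than become NaN, Series.replace with the same dict is an alternative worth considering.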

Using applymap (elementwise on DataFrame)

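DataFrame.applymap applies a function to each element of the DataFrame. Since pandas 2.1 it is deprecated in favor of DataFrame.map, which has the same elementwise behavior. Use it when a transform has no vectorized equivalent and must run on every cell; for column-wise or row-wise operations, prefer apply.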

# show the DataFrame
df
   A   B
0  1  10
1  2  20
2  3  30
3  4  40
4  5  50
# elementwise cube; DataFrame.map replaces the deprecated applymap (pandas >= 2.1)
# on older pandas versions, use df.applymap(...) instead
df_applymap = df.map(lambda x: x ** 3)
df_applymap
     A       B
0    1    1000
1    8    8000
2   27   27000
3   64   64000
4  125  125000
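Because the elementwise method is named applymap before pandas 2.1 and DataFrame.map from 2.1 onward, code that must run on both can pick the method at runtime. The helper below, elementwise, is a hypothetical name introduced for this sketch and is not a pandas function. The snippet also shows that this particular transform has a vectorized form.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})


def elementwise(frame, func):
    """Apply func to every cell, using whichever method this pandas provides."""
    # Check the class, not the instance, so a column named "map" cannot interfere.
    if hasattr(pd.DataFrame, "map"):
        return frame.map(func)  # pandas >= 2.1
    return frame.applymap(func)  # older pandas


cubed = elementwise(df, lambda x: x ** 3)

# For plain arithmetic, the vectorized operator gives the same values
# without calling a Python function once per cell.
vectorized = df ** 3

print(cubed)
```

The helper is mainly useful for functions with no vectorized counterpart, such as custom string formatting of each cell.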

Creating new columns with apply (row-wise)

When you need to compute a value using multiple columns, use apply with axis=1. For better performance, prefer vectorized operations when possible (see Best Practices below).

# create column 'C' as product of A and B using apply row-wise
df['C'] = df.apply(lambda row: row['A'] * row['B'], axis=1)
df
   A   B    C
0  1  10   10
1  2  20   40
2  3  30   90
3  4  40  160
4  5  50  250

Best Practices

  • Prefer pandas vectorized operations (e.g., df['A'] * df['B']) over apply when possible; they are typically much faster and easier to read.
  • Use map for Series-to-Series elementwise mappings or label replacements.
  • Use applymap (DataFrame.map in pandas 2.1 and later) only when you need a uniform elementwise transform across the entire DataFrame.
  • When using apply with axis=1, consider np.where, pd.Series.where, or vectorized arithmetic to improve performance.
  • Keep functions simple and avoid expensive Python-level loops inside apply/map for large DataFrames.
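As an illustration of the np.where suggestion, the sketch below replaces a row-wise conditional with a vectorized one. The threshold of 50 and the labels "big" and "small" are arbitrary values chosen for the example, not something from the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [10, 20, 30, 40, 50]})

# Row-wise version: the lambda runs once per row in Python.
slow = df.apply(
    lambda row: "big" if row["A"] * row["B"] > 50 else "small", axis=1
)

# Vectorized version: the comparison and the selection each run over whole columns.
product = df["A"] * df["B"]
fast = pd.Series(np.where(product > 50, "big", "small"), index=df.index)

# Both approaches produce the same labels.
assert slow.tolist() == fast.tolist()
print(fast)
```

For more than two outcomes, np.select follows the same idea with a list of conditions and a list of choices.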

Summary & Further Reading

This notebook covered the differences between map, apply, and applymap and showed practical examples. For more, see the pandas documentation: https://pandas.pydata.org/pandas-docs/stable/reference/index.html

Further exercises: try rewriting the df['C'] calculation using a fully vectorized expression and compare timing with %timeit.
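One possible starting point for the exercise, using the standard-library timeit module so that it also runs outside a notebook. The frame size (5,000 rows) and repeat count are arbitrary assumptions, and the measured times will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# A larger frame with the same structure as the example (B is 10 * A).
n = 5_000
a = np.arange(1, n + 1)
df = pd.DataFrame({"A": a, "B": a * 10})


def rowwise():
    # One Python-level function call per row.
    return df.apply(lambda row: row["A"] * row["B"], axis=1)


def vectorized():
    # A single multiplication over whole columns.
    return df["A"] * df["B"]


# Confirm both approaches agree before comparing their speed.
assert (rowwise() == vectorized()).all()

t_rowwise = timeit.timeit(rowwise, number=3)
t_vectorized = timeit.timeit(vectorized, number=3)
print(f"apply(axis=1): {t_rowwise:.4f} s for 3 runs")
print(f"vectorized:    {t_vectorized:.4f} s for 3 runs")
```

In a notebook, the equivalent measurement would be %timeit rowwise() and %timeit vectorized().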