
How to Merge Multiple CSV Files Into One

Published: March 31, 2026



You exported 12 monthly reports. Or your team sent you five different spreadsheets. Or your API dumps data into a new CSV every day. Now you need all of it in one file.

Merging CSV files is one of those tasks that sounds trivial until you actually do it and discover your headers are duplicated, your encodings don't match, or half your columns are misaligned.

Here's how to do it right.

Before You Merge: Check Compatibility

Open a couple of your files in a CSV viewer and verify:

  • Same columns? Headers should match across all files (same names, same order ideally)
  • Same delimiter? All commas, or all semicolons — not a mix
  • Same encoding? All UTF-8, or all Latin-1 — not a mix
  • Same data formats? Dates in the same format, numbers with the same decimal separator

If any of these differ, fix them before merging. Garbage in, garbage out.
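
The header and delimiter checks can also be scripted when there are too many files to eyeball. The sketch below uses only the standard library. The function name check_compatibility and its message format are illustrative assumptions, and the delimiter guess is a simple heuristic that only distinguishes commas from semicolons.

```python
import csv


def check_compatibility(paths):
    """Return a list of problems found when comparing each file to the first one."""
    problems = []
    reference = None  # (path, delimiter, header) of the first file
    for path in paths:
        # errors="replace" keeps the check from crashing on non-UTF-8 bytes
        with open(path, newline="", encoding="utf-8", errors="replace") as fh:
            first_line = fh.readline().rstrip("\r\n")
        # Heuristic: whichever of ';' and ',' occurs more often in the header line
        delimiter = ";" if first_line.count(";") > first_line.count(",") else ","
        header = next(csv.reader([first_line], delimiter=delimiter))
        if reference is None:
            reference = (path, delimiter, header)
            continue
        ref_path, ref_delimiter, ref_header = reference
        if delimiter != ref_delimiter:
            problems.append(f"{path}: delimiter {delimiter!r}, expected {ref_delimiter!r}")
        if header != ref_header:
            problems.append(f"{path}: header {header} differs from {ref_path}")
    return problems
```

Calling check_compatibility(sorted(glob.glob('data/*.csv'))) returns an empty list when every file agrees with the first one. It does not look at encodings or date formats, so those still need a manual look.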

Method 1: Copy-Paste (2-3 Small Files)

The brute-force approach. Works when you have a handful of small files.

  1. Open all files in a text editor
  2. Copy everything from file 2 except the header row
  3. Paste it at the end of file 1
  4. Repeat for each additional file
  5. Save

Pros: No tools needed.

Cons: Tedious beyond 3 files. Easy to accidentally include duplicate headers.

Method 2: Command Line (Fast, Any Number of Files)

The fastest method for combining many files with identical structures.

macOS / Linux

bash

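# Keep the header from the first file, then append the data rows from all files.
# The output gets a name *.csv cannot match until the end, so it is never read as input.
rm -f merged.csv
head -n 1 file1.csv > merged.tmp
tail -n +2 -q *.csv >> merged.tmp
mv merged.tmp merged.csv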

Windows (PowerShell)

powershell

# Collect the input files, skipping any merged.csv left over from a previous run
$files = @(Get-ChildItem *.csv | Where-Object Name -ne 'merged.csv')

# Get the header from the first file
Get-Content $files[0].FullName | Select-Object -First 1 | Out-File merged.csv -Encoding UTF8

# Append the data rows (skipping each header) from all files
$files | ForEach-Object {
    Get-Content $_.FullName | Select-Object -Skip 1
} | Add-Content merged.csv -Encoding UTF8

Pros: Handles hundreds of files in seconds.

Cons: No validation — if columns don't match, you won't know until later.
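
If you want the same "one header, then all the data rows" behavior with a safety net, a few lines of standard-library Python can refuse to continue when a header differs. This is a minimal sketch, not a replacement for the shell commands above. The function name merge_csvs is an assumption, and it expects comma-separated UTF-8 input.

```python
import csv


def merge_csvs(paths, output_path):
    """Concatenate CSV files, writing one header and rejecting mismatched ones."""
    expected_header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as fh:
                reader = csv.reader(fh)
                header = next(reader, None)
                if header is None:
                    continue  # empty file: nothing to merge
                if expected_header is None:
                    # First non-empty file defines the header for the output
                    expected_header = header
                    writer.writerow(header)
                elif header != expected_header:
                    raise ValueError(f"{path}: header {header} != {expected_header}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

The function returns the number of data rows written, which is handy for a quick sanity check against the row counts of the input files.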

Method 3: Python with Pandas (Recommended)

The most reliable method. Handles mismatched columns, different encodings, and deduplication.

Basic Merge

python

import glob

import pandas as pd

# Sort the paths so the row order of the merged file is reproducible
files = sorted(glob.glob('data/*.csv'))
dfs = [pd.read_csv(f) for f in files]
merged = pd.concat(dfs, ignore_index=True)
merged.to_csv('merged.csv', index=False)
print(f"Merged {len(files)} files: {len(merged)} total rows")

With Source Tracking

Add a column so you know which file each row came from:

python

dfs = []
for f in files:
    df = pd.read_csv(f)
    df['source_file'] = f  # record which file each row came from
    dfs.append(df)
merged = pd.concat(dfs, ignore_index=True)

With Deduplication

Remove duplicate rows that might exist across files:

python

merged = pd.concat(dfs, ignore_index=True)
before = len(merged)
merged = merged.drop_duplicates()
print(f"Removed {before - len(merged)} duplicate rows")

Handling Different Column Orders

Pandas matches columns by name, not position. If file A has [name, email, city] and file B has [city, name, email], pd.concat handles it correctly.
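
A tiny in-memory example makes the name-based alignment easy to verify for yourself. The two frames below are made-up stand-ins for file A and file B.

```python
import pandas as pd

# Same three columns, different order (illustrative data only)
file_a = pd.DataFrame(
    {"name": ["Alice"], "email": ["alice@example.com"], "city": ["Paris"]}
)
file_b = pd.DataFrame(
    {"city": ["London"], "name": ["Bob"], "email": ["bob@example.com"]}
)

merged = pd.concat([file_a, file_b], ignore_index=True)

# Bob's values land under the right headers even though file B listed city first
print(merged.loc[1, "name"], merged.loc[1, "city"])
```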

Handling Different Column Sets

If files have different columns, pd.concat fills missing values with NaN:

python

# File 1: name, email
# File 2: name, email, phone
# Result: name, email, phone (phone is NaN for file 1 rows)
merged = pd.concat(dfs, ignore_index=True)

To keep only columns that exist in ALL files:

python

# Columns present in every file, kept in the order they appear in the first file
common_cols = [c for c in dfs[0].columns if all(c in df.columns for df in dfs)]
merged = pd.concat([df[common_cols] for df in dfs], ignore_index=True)

Method 4: Google Sheets

For people who prefer a visual approach:

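  1. Import each CSV into its own sheet (tab)
  2. In a new sheet, stack the tabs with an array formula such as ={Sheet1!A1:C100; Sheet2!A2:C100} (the ranges are placeholders), or copy-paste the data from each tab. IMPORTRANGE is only needed when the data sits in a different spreadsheet
  3. Export the combined sheet as CSV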

Works fine for small files but hits Google Sheets' 10 million cell limit quickly.
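
To find out in advance whether a set of files will fit, you can count the cells first. The sketch below uses the standard library and the 10 million figure quoted above. The function name count_cells and the constant name are illustrative assumptions.

```python
import csv

SHEETS_CELL_LIMIT = 10_000_000  # the limit quoted in the text above


def count_cells(paths):
    """Count every cell (header cells included) across the given CSV files."""
    total = 0
    for path in paths:
        # errors="replace" keeps the count going even if a file is not UTF-8
        with open(path, newline="", encoding="utf-8", errors="replace") as fh:
            total += sum(len(row) for row in csv.reader(fh))
    return total
```

If count_cells(files) comes anywhere near SHEETS_CELL_LIMIT, one of the other methods is the safer choice.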

Common Pitfalls

Duplicate Headers

The most common mistake. If you concatenate files without removing headers, you end up with header rows scattered through your data:


name,email,city

Alice,alice@example.com,Paris

name,email,city ← this shouldn't be here

Bob,bob@example.com,London

Every method above avoids this as long as you skip the header of each file after the first. A raw cat *.csv > merged.csv does not, so always use tail -n +2 or an equivalent to skip headers.
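
If the damage is already done, stray header rows can be filtered out after loading. A minimal pandas sketch, using the example data from above: it treats any row whose values equal the column names as a leftover header.

```python
import io

import pandas as pd

# The broken merge from the example above, loaded as text
raw = (
    "name,email,city\n"
    "Alice,alice@example.com,Paris\n"
    "name,email,city\n"
    "Bob,bob@example.com,London\n"
)
merged = pd.read_csv(io.StringIO(raw), dtype=str)

# A row is a stray header when its values match the column names exactly
is_header = merged.apply(lambda row: list(row) == list(merged.columns), axis=1)
cleaned = merged[~is_header].reset_index(drop=True)
```

Reading with dtype=str matters here: a stray header row forces every column to text anyway, so convert numeric columns back only after the cleanup.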

Mixed Encodings

One file in UTF-8, another in Latin-1. The merged result will have broken characters somewhere.

Fix: Convert all files to UTF-8 before merging:

bash

# Convert Latin-1 to UTF-8
iconv -f ISO-8859-1 -t UTF-8 oldfile.csv > oldfile_utf8.csv

Or in Python:

python

df = pd.read_csv(f, encoding='latin-1')  # read with the original encoding
# pandas writes UTF-8 by default, so to_csv needs no encoding argument
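
When you do not know which files are which, one common approach is to try UTF-8 first and fall back to Latin-1 only if decoding fails. The sketch below does that with the standard library. The function name convert_to_utf8 is an assumption. Note that Latin-1 accepts any byte sequence, so the fallback never fails, and it will silently produce wrong characters if the real encoding is something else (Windows-1252, for example).

```python
def convert_to_utf8(src_path, dst_path, fallback="latin-1"):
    """Rewrite src_path as UTF-8 at dst_path and return the encoding that was used to read it."""
    with open(src_path, "rb") as fh:
        raw = fh.read()
    try:
        text = raw.decode("utf-8")
        used = "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the fallback encoding instead
        text = raw.decode(fallback)
        used = fallback
    # Write bytes directly so line endings are preserved exactly
    with open(dst_path, "wb") as fh:
        fh.write(text.encode("utf-8"))
    return used
```

Run it over every input file before merging, then merge the converted copies.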

Mixed Delimiters

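Some files use commas, others use semicolons. Pandas does not detect this by default: read_csv assumes a comma unless told otherwise, so a semicolon-separated file loads as a single wide column. Pass the separator explicitly: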

python

df = pd.read_csv(f, sep=';') # for semicolon-separated files
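
If you do not know each file's delimiter in advance, read_csv can sniff it when you pass sep=None together with engine='python'. A small sketch with two in-memory stand-ins for real files (the data is made up):

```python
import io

import pandas as pd

# Two illustrative "files" with different delimiters
comma_file = io.StringIO("name,city\nAlice,Paris\n")
semicolon_file = io.StringIO("name;city\nBob;London\n")

# sep=None asks the Python parsing engine to detect the delimiter per file
dfs = [
    pd.read_csv(f, sep=None, engine="python")
    for f in (comma_file, semicolon_file)
]
merged = pd.concat(dfs, ignore_index=True)
```

The Python engine is slower than the default one, so for large files it is usually worth detecting the delimiter once and then passing it explicitly.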

Overlapping Date Ranges

If you're merging monthly exports and the date ranges overlap (e.g., end of January and start of February both appear in two files), you'll get duplicate rows.

python

# 'id' and 'date' stand in for whichever columns uniquely identify a record
merged = merged.drop_duplicates(subset=['id', 'date'])
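
When the overlapping rows can differ, for instance because a later export carries a corrected value, you may want the newer file to win. One way to sketch that is keep='last', which relies on the files being concatenated oldest first. The column names and figures below are made up for illustration.

```python
import pandas as pd

# Two illustrative monthly exports that both contain the row for id 2
january = pd.DataFrame(
    {"id": [1, 2], "date": ["2024-01-30", "2024-01-31"], "amount": [10, 20]}
)
february = pd.DataFrame(
    {"id": [2, 3], "date": ["2024-01-31", "2024-02-01"], "amount": [25, 30]}
)

# Oldest file first, so keep="last" retains the row from the newer export
merged = pd.concat([january, february], ignore_index=True)
deduped = merged.drop_duplicates(subset=["id", "date"], keep="last")
deduped = deduped.reset_index(drop=True)
```

With the default keep='first', the January value for id 2 would be kept instead.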

Which Method Should You Use?

| Scenario | Best Method |
|----------|-------------|
| 2-3 small identical files | Copy-paste or command line |
| Many identical files (same columns) | Command line |
| Files with different columns | Python (Pandas) |
| Need deduplication | Python (Pandas) |
| Recurring task (monthly merge) | Python script saved for reuse |
| Quick structure check before merging | CSV Viewer Online |

Automating Recurring Merges

If you merge files regularly, save a reusable script:

python

#!/usr/bin/env python3
"""Merge all CSV files in a directory."""
import glob
import os
import sys

import pandas as pd

directory = sys.argv[1] if len(sys.argv) > 1 else '.'
output = sys.argv[2] if len(sys.argv) > 2 else 'merged.csv'

# Skip the output file itself so a rerun does not merge its own previous result
files = sorted(
    f for f in glob.glob(os.path.join(directory, '*.csv'))
    if os.path.abspath(f) != os.path.abspath(output)
)
if not files:
    print(f"No CSV files found in {directory}")
    sys.exit(1)

dfs = []
for f in files:
    df = pd.read_csv(f)
    df['_source'] = f
    dfs.append(df)
    print(f"  Read {f}: {len(df)} rows")

merged = pd.concat(dfs, ignore_index=True)
# Compare every column except _source when looking for duplicates
merged = merged.drop_duplicates(subset=[c for c in merged.columns if c != '_source'])
merged.to_csv(output, index=False)
print(f"\nMerged {len(files)} files into {output}: {len(merged)} rows")

Usage: python merge_csvs.py ./monthly_exports/ combined_2024.csv

Merging CSVs is a solved problem. Pick the method that matches your comfort level and file complexity, verify the result in a viewer, and you're done.