How to Merge Multiple CSV Files Into One

Published: March 31, 2026

How to Merge Multiple CSV Files Into One

Meta description: Need to combine several CSV files into one? Here are 4 methods to merge CSV files — from simple copy-paste to Python scripts that handle edge cases.

You exported 12 monthly reports. Or your team sent you five different spreadsheets. Or your API dumps data into a new CSV every day. Now you need all of it in one file.

Merging CSV files is one of those tasks that sounds trivial until you actually do it and discover your headers are duplicated, your encodings don't match, or half your columns are misaligned.

Here's how to do it right.

Before You Merge: Check Compatibility

Open a couple of your files in a CSV viewer and verify:

Same columns? Headers should match across all files (same names, same order ideally)

Same delimiter? All commas, or all semicolons — not a mix

Same encoding? All UTF-8, or all Latin-1 — not a mix

Same data formats? Dates in the same format, numbers with the same decimal separator

If any of these differ, fix them before merging. Garbage in, garbage out.

Method 1: Copy-Paste (2-3 Small Files)

The brute-force approach. Works when you have a handful of small files.

Open all files in a text editor

Copy everything from file 2 except the header row

Paste it at the end of file 1

Repeat for each additional file

Save

Pros: No tools needed.

Cons: Tedious beyond 3 files. Easy to accidentally include duplicate headers.

Method 2: Command Line (Fast, Any Number of Files)

The fastest method for combining many files with identical structures.

macOS / Linux

bash
Keep header from first file, then data from all files
head -1 file1.csv > merged.csv
tail -n +2 -q *.csv >> merged.csv

Windows (PowerShell)

powershell
Get header from first file
Get-Content (Get-ChildItem *.csv | Select-Object -First 1) |
Select-Object -First 1 | Out-File merged.csv -Encoding UTF8

Append data (skip headers) from all files
Get-ChildItem *.csv | ForEach-Object {
Get-Content $_ | Select-Object -Skip 1
} | Add-Content merged.csv -Encoding UTF8

Pros: Handles hundreds of files in seconds.

Cons: No validation — if columns don't match, you won't know until later.

Method 3: Python with Pandas (Recommended)

The most reliable method. Handles mismatched columns, different encodings, and deduplication.

Basic Merge

python
import pandas as pd
import glob
files = glob.glob('data/*.csv')
dfs = [pd.read_csv(f) for f in files]
merged = pd.concat(dfs, ignore_index=True)
merged.to_csv('merged.csv', index=False)
print(f"Merged {len(files)} files — {len(merged)} total rows")

With Source Tracking

Add a column so you know which file each row came from:

python
dfs = []
for f in files:
df = pd.read_csv(f)
df['source_file'] = f
dfs.append(df)
merged = pd.concat(dfs, ignore_index=True)

With Deduplication

Remove duplicate rows that might exist across files:

python
merged = pd.concat(dfs, ignore_index=True)
before = len(merged)
merged = merged.drop_duplicates()
print(f"Removed {before - len(merged)} duplicate rows")

Handling Different Column Orders

Pandas matches columns by name, not position. If file A has [name, email, city] and file B has [city, name, email], pd.concat handles it correctly.

Handling Different Column Sets

If files have different columns, pd.concat fills missing values with NaN:

python
File 1: name, email
File 2: name, email, phone
Result: name, email, phone (phone is NaN for file 1 rows)
merged = pd.concat(dfs, ignore_index=True)

To keep only columns that exist in ALL files:

python
common_cols = set.intersection(*[set(df.columns) for df in dfs])
merged = pd.concat([df[list(commoncols)] for df in dfs], ignoreindex=True)

Method 4: Google Sheets

For people who prefer a visual approach:

Import each CSV into separate sheets (tabs)

In a new sheet, use IMPORTRANGE or copy-paste data from each tab

Export the combined sheet as CSV

Works fine for small files but hits Google Sheets' 10 million cell limit quickly.

Common Pitfalls

Duplicate Headers

The most common mistake. If you concatenate files without removing headers, you end up with header rows scattered through your data:


name,email,city
Alice,alice@example.com,Paris
name,email,city          ← this shouldn't be here
Bob,bob@example.com,London

All methods above handle this except raw cat concatenation. Always use tail -n +2 or equivalent to skip headers.

Mixed Encodings

One file in UTF-8, another in Latin-1. The merged result will have broken characters somewhere.

Fix: Convert all files to UTF-8 before merging:

bash
Convert Latin-1 to UTF-8
iconv -f ISO-8859-1 -t UTF-8 oldfile.csv > oldfile_utf8.csv

Or in Python:

python
df = pd.read_csv(f, encoding='latin-1')  # read with original encoding
pandas will write UTF-8 by default

Mixed Delimiters

Some files use commas, others use semicolons. Pandas detects this automatically in most cases, but you can be explicit:

python
df = pd.read_csv(f, sep=';')  # for semicolon-separated files

Overlapping Date Ranges

If you're merging monthly exports and the date ranges overlap (e.g., end of January and start of February both appear in two files), you'll get duplicate rows.

python
merged = merged.drop_duplicates(subset=['id', 'date'])

Which Method Should You Use?

| Scenario | Best Method |

|----------|-------------|

| 2-3 small identical files | Copy-paste or command line |

| Many identical files (same columns) | Command line |

| Files with different columns | Python (Pandas) |

| Need deduplication | Python (Pandas) |

| Recurring task (monthly merge) | Python script saved for reuse |

| Quick structure check before merging | CSV Viewer Online |

Automating Recurring Merges

If you merge files regularly, save a reusable script:

python
#!/usr/bin/env python3
"""Merge all CSV files in a directory."""
import pandas as pd
import glob
import sys
directory = sys.argv[1] if len(sys.argv) > 1 else '.'
output = sys.argv[2] if len(sys.argv) > 2 else 'merged.csv'
files = sorted(glob.glob(f'{directory}/*.csv'))
if not files:
print(f"No CSV files found in {directory}")
sys.exit(1)
dfs = []
for f in files:
df = pd.read_csv(f)
df['_source'] = f
dfs.append(df)
print(f"  Read {f}: {len(df)} rows")
merged = pd.concat(dfs, ignore_index=True)
merged = merged.dropduplicates(subset=[c for c in merged.columns if c != 'source'])
merged.to_csv(output, index=False)
print(f"\nMerged {len(files)} files into {output}: {len(merged)} rows")

Usage: python mergecsvs.py ./monthlyexports/ combined_2024.csv

Merging CSVs is a solved problem. Pick the method that matches your comfort level and file complexity, verify the result in a viewer, and you're done.