CSV File Too Large for Excel? Here's What to Do

Published: March 31, 2026

You tried to open a 2GB CSV file in Excel and got the dreaded "file not loaded completely" message. Or maybe Excel just froze. Or it opened but silently dropped everything after row 1,048,576.

That last one is the worst — because you might not even realize you're missing data.

Excel has a hard limit of 1,048,576 rows and 16,384 columns. If your CSV exceeds that, Excel won't tell you clearly. It just stops loading. For many real-world datasets — server logs, transaction records, IoT sensor data, e-commerce catalogs — a million rows isn't enough.
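Before reaching for any tool, it's worth checking whether a file actually exceeds the limit. A minimal sketch in Python that streams the file so it works at any size (the file name is a placeholder):

```python
EXCEL_ROW_LIMIT = 1_048_576  # Excel's hard row limit

def count_rows(path):
    # Stream the file in binary mode so memory use stays flat.
    # Caveat: quoted fields that contain newlines are counted as
    # extra rows, so treat the result as an estimate.
    with open(path, 'rb') as f:
        return sum(1 for _ in f)

# Example: count_rows('huge_file.csv') - 1 > EXCEL_ROW_LIMIT
# (subtract 1 for the header row)
```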

Here's what actually works.

Quick Fix: Just Need to Look at the Data?

If you don't need to edit or analyze — you just need to see what's in the file — use a CSV viewer. Online viewers handle large files without the row limits of spreadsheet software.

This is the fastest way to check column names, spot data quality issues, or find specific rows without waiting for Excel to choke.

Option 1: Filter Before Opening

You probably don't need all the data at once. Extract what you need first, then open the smaller file in Excel.

With Command Line

```bash
# First 100,000 rows (including header)
head -n 100001 huge_file.csv > sample.csv

# Only rows containing "2024" (keeps header + matching rows)
head -n 1 huge_file.csv > filtered.csv
grep "2024" huge_file.csv >> filtered.csv

# Specific columns only (e.g., columns 1, 3, 5)
cut -d',' -f1,3,5 huge_file.csv > fewer_columns.csv
```

With Python

```python
import pandas as pd

# Read only the first 100K rows
df = pd.read_csv('huge_file.csv', nrows=100000)
df.to_csv('sample.csv', index=False)

# Read only specific columns
df = pd.read_csv('huge_file.csv', usecols=['name', 'email', 'date'])
df.to_csv('fewer_columns.csv', index=False)

# Filter while reading (chunked processing)
chunks = pd.read_csv('huge_file.csv', chunksize=50000)
filtered = pd.concat(
    chunk[chunk['country'] == 'US'] for chunk in chunks
)
filtered.to_csv('us_only.csv', index=False)
```

Option 2: Split Into Multiple Files

Break the large file into Excel-friendly chunks:

```python
import pandas as pd

chunk_size = 500000  # rows per file
reader = pd.read_csv('huge_file.csv', chunksize=chunk_size)

for i, chunk in enumerate(reader):
    chunk.to_csv(f'part_{i+1}.csv', index=False)
    print(f"Wrote part_{i+1}.csv: {len(chunk)} rows")
```

Now you can open each part in Excel separately.

Option 3: Use Tools Built for Large Data

Python + Pandas

Pandas can handle files far beyond Excel's row limit; the practical ceiling is your machine's RAM:

```python
import pandas as pd

df = pd.read_csv('huge_file.csv')
print(f"Rows: {len(df):,}")
print(f"Columns: {len(df.columns)}")
print(df.describe())
```

For files larger than your RAM, use chunked reading:

```python
import pandas as pd

total_rows = 0
for chunk in pd.read_csv('massive_file.csv', chunksize=100000):
    total_rows += len(chunk)
    # Process each chunk here

print(f"Total rows: {total_rows:,}")
```

Polars (Faster Than Pandas)

Polars is a newer DataFrame library that's significantly faster for large files:

```python
import polars as pl

# Lazy evaluation: only reads the rows and columns that are needed
df = pl.scan_csv('huge_file.csv')

result = (
    df.filter(pl.col('amount') > 1000)
    .group_by('category')
    .agg(pl.col('amount').sum())
    .collect()
)
print(result)
```

DuckDB (SQL on CSV Files)

Query CSV files directly with SQL, no import step:

```sql
-- In the DuckDB CLI
SELECT category, COUNT(*), AVG(amount)
FROM 'huge_file.csv'
WHERE date >= '2024-01-01'
GROUP BY category
ORDER BY COUNT(*) DESC;
```

```python
import duckdb

result = duckdb.sql("""
    SELECT category, COUNT(*) AS count, AVG(amount) AS avg_amount
    FROM 'huge_file.csv'
    WHERE date >= '2024-01-01'
    GROUP BY category
""").fetchdf()
```

For aggregation-style queries, DuckDB typically processes CSVs faster than Pandas and uses far less memory, because it streams the file instead of loading it all at once.

Miller (mlr) — Command Line Swiss Army Knife

```bash
# Summary stats
mlr --csv stats1 -a count,mean,min,max -f amount huge_file.csv

# Filter and sort
mlr --csv filter '$country == "US"' then sort -nf amount huge_file.csv

# Sum per group
mlr --csv stats1 -a sum -f amount -g category huge_file.csv
```

Option 4: Load Into a Database

For repeated analysis on the same data, import it into a database:

SQLite (Zero Setup)

```bash
# Import CSV into SQLite
sqlite3 data.db <<EOF
.mode csv
.import huge_file.csv my_table
EOF

# Now query it
sqlite3 data.db "SELECT COUNT(*) FROM my_table WHERE country='US'"
```
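The same database can also be queried from Python with the standard-library sqlite3 module. A small sketch, assuming the `my_table` and `country` column from the shell example above:

```python
import sqlite3

def top_countries(db_path, limit=5):
    # Count rows per country in my_table (created by the CSV import);
    # the table and column names are taken from the shell example.
    with sqlite3.connect(db_path) as con:
        return con.execute(
            "SELECT country, COUNT(*) AS n FROM my_table "
            "GROUP BY country ORDER BY n DESC LIMIT ?",
            (limit,),
        ).fetchall()

# Example: top_countries('data.db')
```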

PostgreSQL

```sql
-- Server-side import (requires file access on the server);
-- in psql, use \copy for a client-side file instead
COPY my_table FROM '/path/to/huge_file.csv' WITH (FORMAT csv, HEADER true);
```

Once the data is in a database, you can run any query without worrying about row limits.

Comparison

| Tool | Max Rows | Speed | Learning Curve | Best For |
|------|----------|-------|----------------|----------|
| Excel | 1,048,576 | Slow | None | Small files, quick edits |
| Google Sheets | 10M cells total (~100K-200K rows for wide files) | Slow | None | Collaboration |
| CSV Viewer Online | No hard limit | Fast | None | Quick inspection |
| Pandas | Limited by RAM | Medium | Low | Analysis, transformation |
| Polars | Limited by RAM | Fast | Low | Large-scale analysis |
| DuckDB | Limited by disk | Very fast | Medium | SQL queries on files |
| SQLite | Billions | Fast | Medium | Repeated queries |

How to Avoid This Problem

If you're generating CSVs that consistently exceed Excel's limits, consider:

  1. Export with filters built in. Only export the data you actually need.
  2. Split by time period. Monthly files instead of one annual dump.
  3. Use a database from the start. If you have millions of records, a database is the right tool — not a flat file.
  4. Compress with Parquet. For archival, Parquet files are smaller and faster than CSV, and Pandas, Polars, and DuckDB can all read them directly.

Excel is a great tool, but it was never designed for millions of rows. When your data outgrows it, move to something that can handle the scale. The tools above are all free and work with files of any size.