Find the Biggest Files First When an Oracle Filesystem Fills Up

Find the Biggest Files First When an Oracle Filesystem Fills Up

Purpose

A full Oracle filesystem is a five-alarm incident. Whether it is the /u01/oradata mount holding datafiles, the Fast Recovery Area filling with archive logs, or the diag directory overflowing with trace files, the first question is always the same: which files are consuming the space? The two commands below — taken from the shutdownabort.com DBA Quick Guides (Andrew Barry, 2007–2013, preserved via the Wayback Machine) — are the DBA's first-response toolkit for answering that question within thirty seconds on any UNIX host without any Oracle database tools.

ls -lS (sort by file size, largest first) gives a flat list of the biggest individual files in a directory. du -sk | sort -rn gives a directory-level breakdown, identifying which subdirectory the bulk of the space is in before drilling down further. Use both together: ls to find big individual files (a runaway trace file, an old dump), du to identify which subdirectory the bulk of the space lives in.

Code

 1# Top files by size in the current directory — largest first
 2ls -lhS | head -20
 3
 4# Directory-level space breakdown, sorted largest first (kilobytes)
 5du -sk * | sort -rn | head -20
 6
 7# Recursive — breakdown of every subdirectory under a mount point
 8du -sk /u01/* | sort -rn | head -20
 9
10# Which Oracle database's diag dir is biggest
11du -sk $ORACLE_BASE/diag/rdbms/*/ | sort -rn | head -10

Code Breakdown

ls -lhS | head -20

ls -l — Long format: permissions, owner, size, modification date, filename on one line per file.

-h — Human-readable sizes: 1.2G, 450M, 12K instead of raw bytes. Essential for quickly reading output in large directories where sizes span several orders of magnitude.

-S — Sort by file size, largest first. The biggest file appears on the first output line.

| head -20 — Limit to the top 20 entries. On a trace directory with thousands of files, the full ls output scrolls off the screen; head -20 gives the actionable part immediately.

du -sk * | sort -rn | head -20

du -sk — Disk usage summary.

  • -s — Summarize: print only the total for each argument rather than recursing into subdirectories. One line per immediate child of the current directory.
  • -k — Output in kilobytes. Consistent numeric units are required for sort -n to produce correct results; without -k, units may be mixed (KB vs blocks depending on locale).

sort -rn

  • -r — Reverse order (largest first).
  • -n — Numeric sort. Without -n, 9 sorts after 10 alphabetically — the wrong result for sizes.

du -sk /u01/* — Expands to every immediate child of /u01/. Starting from the mount point and drilling down gives a top-level breakdown without recursing all the way to individual files, which can be slow on large directory trees.

Key Points

  • Check the mount point with df -h first. df -h /u01/oradata tells you which filesystem is full before digging. There is no point running du in the wrong directory.
  • Oracle trace files are the most common surprise. The ADR trace directory ($ORACLE_BASE/diag/rdbms/<db>/<sid>/trace/) grows to tens of gigabytes when an incident loop fires. du -sk $ORACLE_BASE/diag/rdbms/*/ identifies which database is writing the most.
  • The Fast Recovery Area has its own quota separate from filesystem space. Even if the OS filesystem is not full, the FRA can be "full" to Oracle when V$RECOVERY_FILE_DEST.SPACE_USED reaches SPACE_LIMIT. Always check both df -h and SELECT * FROM V$RECOVERY_FILE_DEST.
  • Never delete files while the instance is running without checking lsof first. A deleted file whose descriptor is still held open by Oracle will not reclaim disk space until Oracle closes the descriptor — typically requiring the process or the instance to restart. Use lsof | grep deleted to identify these space ghosts.
  • ls -lhS operates on the current directory only. It does not recurse. For recursive large-file hunting, combine with find: find /u01 -type f -printf "%s %p\n" | sort -rn | head -20.

Insights and Best Practices

Top-down triage sequence

Run this sequence in order when a filesystem hits 95%+:

1df -h /u01                                               # Confirm which filesystem is full
2du -sk /u01/* | sort -rn | head -10                      # Top-level directory breakdown
3du -sk $ORACLE_BASE/diag/rdbms/*/ | sort -rn | head -5  # Which DB's diag dir is biggest
4ls -lhS $ORACLE_BASE/diag/rdbms/<db>/<sid>/trace/ | head -20  # Biggest individual files

This narrows from filesystem → top-level directory → per-database diag directory → individual file in four commands, each taking under five seconds.

FRA full: the RMAN catalog problem

The most common reason an Oracle FRA appears full when disk space is available: archive logs that have been backed up to tape still occupy space in the FRA because RMAN's catalog has not been reconciled. du -sk on the FRA shows real disk usage, but Oracle reports FRA full via ORA-19809 until RMAN confirms the backups:

1RMAN> CROSSCHECK ARCHIVELOG ALL;
2RMAN> DELETE EXPIRED ARCHIVELOG ALL;

After this, Oracle reclaims the FRA space even without deleting files on disk.

Automate with a threshold alert

A simple cron catches filesystem growth before it becomes an outage:

1# Alert when any Oracle-related filesystem exceeds 90% full
2df -h | awk 'NR>1 {sub(/%/,"",$5); if ($5+0 > 90 && /u01|fra|diag/) print $0}' \
3  | mail -s "Oracle filesystem WARNING $(hostname)" dba@company.com

Tune the path patterns (u01|fra|diag) to match your mount point naming convention.

When to Run This

  • Immediately when ORA-01565 (file access error) or ORA-19809 (FRA quota exceeded) appears
  • When archive log shipping stalls because ARCH cannot write
  • Proactively as part of weekly capacity checks before a scheduled maintenance window
  • Before patching: confirm enough staging space for the patch zip and OPatch working area
  • When a developer reports unexpected slowness — a nearly full temp or undo tablespace filesystem causes waits even before a hard ORA- error

Troubleshooting Common Issues

If du -sk * returns Permission denied on some entries, run as the oracle OS user who owns the Oracle directories, or use sudo. On hardened systems the DBA may need explicit sudo privileges to read /u01/oradata if it is owned by a dedicated oinstall group.

On very large directories (millions of trace files), du -sk and ls -lhS can take 30–60 seconds. Add -x to du (du -skx /u01/*) to restrict to the current filesystem and avoid crossing NFS mount points, which can slow the scan by an order of magnitude.

If df -h shows free space but Oracle reports the tablespace is full, the tablespace has hit its MAXSIZE limit (autoextend ceiling) or the datafiles are at their 32 GB individual file limit. These are Oracle-level constraints, not OS-level — check DBA_DATA_FILES.MAXBYTES and AUTOEXTENSIBLE.

References

Posts in this series