Recursive find and grep to Search Oracle Trace Files for Any String

Recursive find and grep to Search Oracle Trace Files for Any String

Purpose

Oracle's Automatic Diagnostic Repository (ADR) accumulates hundreds or thousands of trace files — one per background process event, one per incident, one per session that hits an error deep enough to dump. Finding the trace file that contains evidence of a specific error or a specific SQL statement requires searching the content of all of them, not just their names. The two-command pipeline below — from the shutdownabort.com DBA Quick Guides (Andrew Barry, 2007–2013, preserved via the Wayback Machine) — uses find to enumerate trace files and grep -l to identify which ones contain the target string.

The grep -l (list-only) flag is the critical detail: it tells grep to print only the filename of matching files rather than every matching line. For a directory with 2000 trace files, this suppresses the noise and gives you a short list of files worth reading. Once you have that shortlist, open each file or run a targeted grep with context flags to read the surrounding lines.

Code

 1# Find trace files containing a specific ORA- error code
 2find $ORACLE_BASE/diag -name "*.trc" -exec grep -l "ORA-04031" {} \;
 3
 4# General form — replace the search string with any keyword
 5find $ORACLE_BASE/diag/rdbms/<db>/<sid>/trace -name "*.trc" \
 6  -exec grep -l "target_string" {} \;
 7
 8# Faster: batch mode with + instead of \; (one grep call per batch of files)
 9find $ORACLE_BASE/diag -name "*.trc" -exec grep -l "ORA-04031" {} +
10
11# Print filename AND matching lines with context (drop -l)
12find $ORACLE_BASE/diag/rdbms -name "*.trc" -exec grep -n "ORA-04031" {} /dev/null \;

Code Breakdown

find $ORACLE_BASE/diag -name "*.trc"

Walks the entire ADR tree looking for files with the .trc extension. Oracle writes trace files for background process events, foreground session errors, and incidents. They all end in .trc regardless of the process that created them. Starting from $ORACLE_BASE/diag catches traces from all databases on the server and incident traces under the incident/incdir_*/ subdirectory as well as the standard trace/ directory.

-exec grep -l "ORA-04031" {} \;

Runs grep -l on each .trc file found by find.

  • -l — Print only the filename of files that match. If a file has 50 occurrences of the string, grep -l still prints it exactly once. This converts a potentially overwhelming multi-thousand-line grep output into a short list of actionable filenames.
  • {} — The placeholder that find replaces with the current filename on each execution.
  • \; — Terminates the -exec expression and forks one grep process per file. This is the simple form.

-exec grep -l "ORA-04031" {} +

The + terminator batches multiple filenames into a single grep call, significantly reducing process-creation overhead. On a directory with 5000 trace files, + can be 10× faster than \;. The output is identical; the difference is purely performance.

grep -n "ORA-04031" {} /dev/null

The variant without -l prints the matching lines. -n adds line numbers. The /dev/null trick (feeding a second path) forces grep to print the filename on every output line — same portability technique used in the grep-ORA- alert-log post.

Key Points

  • .trc vs .trm files. Oracle 11g and later writes .trm (trace map) binary index files alongside each .trc. They are not text-searchable. The *.trc pattern correctly excludes them.
  • -exec {} + is much faster than -exec {} \; on large directories. The \; form forks a new grep process for every file. The + form batches file arguments, reducing process overhead by an order of magnitude on directories with thousands of files.
  • Incident traces live under a separate path. When Oracle raises an ORA-00600 or ORA-07445 internal error, it creates an incident trace under diag/rdbms/<db>/<sid>/incident/incdir_<N>/. Starting the find from $ORACLE_BASE/diag rather than just trace/ catches these automatically.
  • Alert logs are searchable with the same pattern. Alert logs end in .log not .trc. Use -name "alert_*.log" to search only alert logs, or omit the -name filter entirely to search all text files in the ADR.
  • grep is case-sensitive by default. Add -i for case-insensitive search — useful when hunting SQL text where the case is unknown (Oracle writes SQL to trace files in the case it received it, which may be mixed).

Insights and Best Practices

Narrow the time window with -mtime

On a busy system the trace directory grows continuously. Limit the find to recently modified files to eliminate historical noise:

1# Files modified in the last 24 hours only
2find $ORACLE_BASE/diag/rdbms/<db>/<sid>/trace -name "*.trc" -mtime -1 \
3  -exec grep -l "ORA-04031" {} +

-mtime -1 means modified less than one day ago. This typically cuts the search set from thousands of historical traces to a handful written since yesterday — dramatically faster on old servers with large trace accumulations.

Read context around the match after finding the file

grep -l finds the file; you still need to read the relevant section. The most useful follow-up:

1# Show 10 lines before and after the match
2grep -n -B10 -A10 "ORA-04031" /path/to/found.trc | head -80

-B10 (before) and -A10 (after) give the error context without opening the full multi-megabyte trace file in an editor.

Hunt SQL text in trace files

When you know a specific table or SQL statement caused a problem but you do not know which trace file captured it:

1# Create a reference file just before reproducing the problem
2touch /tmp/checkpoint
3
4# Reproduce the problem, then search files newer than the checkpoint
5find $ORACLE_BASE/diag/rdbms -name "*.trc" -newer /tmp/checkpoint \
6  -exec grep -il "problematic_table_name" {} +

The -newer /tmp/checkpoint trick limits to files newer than a reference file. This beats -mtime for reproducible tests because it is not rounded to day boundaries.

Use ADRCI for structured queries

When you need to correlate trace files with incident IDs or query them by time range rather than content, ADRCI is the right tool:

1adrci
2ADRCI> show incident -mode detail
3ADRCI> show tracefile -i <incident_id>

The find+grep approach complements ADRCI: use grep first to identify which files contain the evidence, then ADRCI to get the structured metadata (incident ID, problem key, time).

When to Run This

  • After a reported error that produced a trace (ORA-04031 shared pool, ORA-04030 out-of-process memory, ORA-00600 internal)
  • When support asks "which trace file contains the incident?"
  • When hunting a specific SQL statement in a session trace (after ALTER SESSION SET SQL_TRACE = TRUE)
  • When auditing across multiple databases: start from $ORACLE_BASE/diag/rdbms/ to search all databases in one pass
  • When ADRCI is slow or unavailable — grep is always faster for bulk text search across many files

Troubleshooting Common Issues

If find produces no output, verify $ORACLE_BASE is set and the trace/ directory exists: ls $ORACLE_BASE/diag/rdbms/. On Oracle 10g and earlier the ADR directory structure does not exist — trace files go to $ORACLE_BASE/admin/<sid>/udump/ (user) and bdump/ (background). Adjust the find path accordingly.

If grep outputs Binary file matches for a .trc file, the file is corrupted or was truncated mid-write. Add -I to the grep call (grep -lI "ORA-04031") to silently skip binary files and continue searching the rest. Report the corrupted file to Oracle Support along with the incident number.

References

Posts in this series