Scan Every Oracle Alert and Trace Log for ORA- Errors with grep

Scan Every Oracle Alert and Trace Log for ORA- Errors with grep

Purpose

Every Oracle environment generates alert logs and process trace files recording instance events, errors, and diagnostics. When a support call comes in — "the database was slow last night, can you check the logs?" — the fastest first pass is a single grep across every log file in the diagnostic directory. This one-liner, drawn from the shutdownabort.com DBA Quick Guides (Andrew Barry, 2007–2013, preserved via the Wayback Machine), scans all alert logs under the Oracle diagnostic root and prints every line containing an ORA- error code.

On a single-instance database the diagnostic root is $ORACLE_BASE/diag/rdbms/<db_unique_name>/<sid>/. On a server hosting multiple databases, each database has its own subdirectory. The commands below work across all of them in a single pass by starting the search from $ORACLE_BASE/diag.

The result is raw grep output — one line per match, one match per ORA- occurrence. Pipe through sort -u to collapse repeated occurrences of the same message and awk to extract just the error codes for a fast triage summary.

Code

1# Scan all alert logs under the Oracle diagnostic directory
2find $ORACLE_BASE/diag -name "alert_*.log" -exec grep "ORA-" {} /dev/null \;
3
4# Unique ORA- error codes only — fast triage summary
5grep -r "ORA-" $ORACLE_BASE/diag/rdbms/ | grep -o "ORA-[0-9]*" | sort -u

Code Breakdown

find $ORACLE_BASE/diag -name "alert_*.log" -exec grep "ORA-" {} /dev/null \;

find $ORACLE_BASE/diag -name "alert_*.log" — Walks the entire ADR (Automatic Diagnostic Repository) tree looking for files matching the alert log naming pattern. Oracle 11g and later place the text alert log at diag/rdbms/<db_name>/<sid>/trace/alert_<sid>.log. A server hosting four SIDs produces four alert log paths; find hits them all.

-exec grep "ORA-" {} /dev/null \; — Runs grep "ORA-" on each matched file. The /dev/null argument is a portability trick: when grep receives only one file argument, many older builds omit the filename from the output. Adding /dev/null as a second argument forces grep to treat the real file as one member of a multi-file match, which prints the filename on every hit line.

grep -r "ORA-" $ORACLE_BASE/diag/rdbms/ | grep -o "ORA-[0-9]*" | sort -u

grep -r "ORA-" — Recursive grep descending from the rdbms subdirectory. Searches alert logs, trace files, and incident logs together. Broader than the find variant but also slower on environments with thousands of trace files.

grep -o "ORA-[0-9]*"-o prints only the matching portion rather than the whole line. Each ORA- code (e.g., ORA-04031) appears once per match regardless of surrounding text.

sort -u — Collapses thousands of repeated error lines into a deduplicated list of distinct ORA- codes — useful for a two-minute triage where you need to know which errors are present, not how many times they appeared.

Key Points

  • $ORACLE_BASE must be set. On most Oracle installations it is exported from the oracle OS user's .bash_profile. If it is not set, substitute the literal path (commonly /u01/app/oracle or /oracle).
  • Alert logs grow unbounded. On busy systems the current alert log can be tens of megabytes. grep on a 50 MB file is fast; grep across a hundred 50 MB files on NFS is not. Limit to recently modified files with find -mtime -7 if the search is slow.
  • ADR directories do not require DB connectivity. This command works when the instance is down, crashed, or in mount mode — the files are plain text on the OS filesystem.
  • ORA-00600 and ORA-07445 warrant immediate escalation. These are internal errors that always produce an incident directory under diag/rdbms/<db>/<sid>/incident/. Use find $ORACLE_BASE/diag -name "*.trc" -path "*/incident/*" to locate the supporting incident traces.
  • Timezone matters. Alert log timestamps are in the database server's local time. If you are correlating alert log entries with application logs in a different timezone, verify with SELECT DBTIMEZONE FROM DUAL.

Insights and Best Practices

Scope the search by time first

On a well-established database the alert log may span years. Before scanning the whole file, narrow the window:

1# Files modified in the last 24 hours only
2find $ORACLE_BASE/diag -name "alert_*.log" -mtime -1 -exec grep "ORA-" {} /dev/null \;

-mtime -1 means modified less than one day ago. This cuts the search set from potentially gigabytes of historical logs to the files written since yesterday.

Build a daily ORA- digest cron

The one-liner pairs well with a nightly cron that mails the distinct ORA- code list to the DBA on-call:

10 6 * * * find $ORACLE_BASE/diag -name "alert_*.log" -mtime -1 \
2  -exec grep "ORA-" {} /dev/null \; \
3  | grep -o "ORA-[0-9]*" | sort -u \
4  | mail -s "ORA- digest $(date +%Y-%m-%d)" dba@company.com

This gives one email per day listing every distinct error code that appeared in the previous 24 hours — a lightweight alternative to Oracle Enterprise Manager for shops without OEM licensed.

Filter out expected noise

Some ORA- codes are benign and appear routinely: ORA-12012 (autotask job failure on development DBs), ORA-01013 (user-requested cancel), ORA-28000 (account locked). Filter them with grep -v:

1grep -r "ORA-" $ORACLE_BASE/diag/rdbms/ \
2  | grep -v "ORA-01013\|ORA-12012\|ORA-28000" \
3  | grep -o "ORA-[0-9]*" | sort -u

Build this exclusion list over time as you learn which codes are routine on your specific databases.

Correlate with V$DIAG_ALERT_EXT

Once you have the ORA- code list and the database is available, cross-reference:

1SELECT message_text, originating_timestamp
2FROM   v$diag_alert_ext
3WHERE  message_text LIKE '%ORA-04031%'
4ORDER  BY originating_timestamp DESC;

V$DIAG_ALERT_EXT exposes the alert log content as structured rows, making it easier to filter by time range or message content than grepping the raw text file.

When to Run This

  • After a user-reported slowdown or outage: first grep before opening a support ticket
  • Before rotating or archiving the alert log: capture a snapshot of the ORA- history
  • As part of daily health checks on unmonitored instances
  • When OEM or Cloud Control is unavailable and you need a quick error summary
  • When the database is down: check for the cause in the alert log without needing a database connection

Troubleshooting Common Issues

If $ORACLE_BASE is not set, the find command produces "No such file or directory." Set it explicitly: export ORACLE_BASE=/u01/app/oracle or substitute the literal path. You can verify the correct path from /etc/oratab — the third field is the ORACLE_HOME, and ORACLE_BASE is typically two levels up.

On systems with thousands of trace files, the recursive grep may run for 30 seconds or more. In that case, narrow the path to diag/rdbms/<specific_db>/<sid>/trace/ and add -name "alert_*.log" to skip the trace files and search only alert logs.

If grep returns Binary file matches, one of the matched files is not plain text. Add --include="*.log" to the recursive grep to restrict to .log files only, or use grep -I to silently skip binary files.

References

Posts in this series