Hi Team,
We are currently working on a migration project from DB2 to PostgreSQL. Post-migration, we’re observing several performance issues such as long-running queries and occasional instance crashes. It also appears that some application-side workloads may not be optimized for PostgreSQL.
From a DBA perspective, I'm looking to proactively identify problem areas such as:
- Long-running queries
- Jobs/stored procedures consuming high temp space
- Queries resulting in sequential scans due to missing indexes
- Lock waits, deadlocks, and memory-heavy operations
We already have the key extensions enabled (pg_stat_statements, pg_buffercache, etc.), and PostgreSQL is generating logs in CSV format (csvlog). However, the main challenge is efficiently analyzing these logs and identifying performance bottlenecks at scale (databases ranging from ~1 TB to 15 TB).
We currently don’t have third-party monitoring tools like Datadog, so I’m looking for recommendations on free or lightweight tools and best practices to:
- Parse and analyze PostgreSQL logs (especially CSV logs)
- Identify top resource-consuming queries and patterns
- Correlate temp usage, memory pressure, and query behavior
- Generate actionable insights for the engineering team
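For the temp-usage and correlation points above, one lightweight approach is to parse the csvlog files directly. The sketch below sums temporary-file bytes per session from the "temporary file: path ..., size N" LOG messages; it assumes the standard csvlog column layout (columns are only ever appended across versions, so the message and session_id positions used here should hold, but verify against your server's version if parsing fails). The function name and structure are illustrative, not from any existing tool.

```python
import csv
import io
import re
from collections import defaultdict

# Column indexes in PostgreSQL's csvlog format. New columns (backend_type,
# query_id, ...) are appended at the end in newer versions, so these
# positions are stable -- but double-check against your server version.
SESSION_ID = 5
MESSAGE = 13

# Matches LOG messages like:
#   temporary file: path "base/pgsql_tmp/pgsql_tmp12345.0", size 2048000
TEMP_RE = re.compile(r'temporary file: path ".*", size (\d+)')

def temp_usage_by_session(csv_text):
    """Sum temporary-file bytes per session_id from csvlog content."""
    totals = defaultdict(int)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) <= MESSAGE:
            continue  # skip malformed / truncated rows
        m = TEMP_RE.search(row[MESSAGE])
        if m:
            totals[row[SESSION_ID]] += int(m.group(1))
    return dict(totals)
```

Note that temp-file messages only appear when log_temp_files is set (e.g. `log_temp_files = 0` logs every temp file); once you have bytes per session, you can join back to the statements logged for the same session_id to see which queries are spilling.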
Any suggestions on tools, scripts, or approaches that have worked well in similar large-scale environments would be greatly appreciated.
Have you set log_min_duration_statement to some number of milliseconds? Once you do, any statement that runs longer than that threshold is written to the log along with its parameters. Grep for "duration:" to find them.
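If you want something slightly more structured than grep, a small script can pull the duration and statement text out of those log lines. This is a minimal sketch (the function name and regex are mine, not from any tool); it assumes the usual "duration: N ms  statement: ..." message format that log_min_duration_statement produces.

```python
import re

# log_min_duration_statement emits messages like:
#   duration: 1234.567 ms  statement: SELECT * FROM big_table
# (or "execute <name>:" for prepared statements)
DURATION_RE = re.compile(
    r'duration: ([\d.]+) ms(?:\s+(?:statement|execute [^:]+): (.*))?'
)

def slow_statements(lines, threshold_ms):
    """Yield (duration_ms, statement_text) for entries over threshold_ms."""
    for line in lines:
        m = DURATION_RE.search(line)
        if m and float(m.group(1)) >= threshold_ms:
            yield float(m.group(1)), (m.group(2) or "").strip()
```

Feed it the message column of your csvlog rows (or plain stderr log lines) and sort the results by duration to get a quick top-N of offenders.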
Does it require some manual effort? Sure. But it's free.
Failing that, try installing pgBadger: it parses PostgreSQL logs (including csvlog) and produces HTML reports covering slowest queries, temp-file usage, locks, and more.