We are currently working on a migration project from DB2 to PostgreSQL. Post-migration, we’re observing several performance issues such as long-running queries and occasional instance crashes. It also appears that some application-side workloads may not be optimized for PostgreSQL.
From a DBA perspective, I'm looking to proactively identify problem areas, such as:
- Long-running queries
- Jobs/stored procedures consuming high temp space
- Queries resulting in sequential scans due to missing indexes
- Lock waits, deadlocks, and memory-heavy operations
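Most of the items above can be polled from the standard statistics views without any extra tooling. As a rough sketch, here are the diagnostic queries collected as Python strings (to run via psql or psycopg); the view and function names are standard PostgreSQL, but the thresholds are arbitrary placeholders to tune for your workload:

```python
# Diagnostic queries against PostgreSQL's built-in statistics views.
# Thresholds (5 minutes, 10000 rows) are placeholders, not recommendations.
DIAGNOSTICS = {
    # Sessions that have been running a statement for more than 5 minutes.
    "long_running": """
        SELECT pid, now() - query_start AS runtime, state, query
        FROM pg_stat_activity
        WHERE state <> 'idle'
          AND now() - query_start > interval '5 minutes'
        ORDER BY runtime DESC
    """,
    # Large tables that are mostly sequentially scanned -- candidates
    # for missing indexes.
    "seq_scan_heavy": """
        SELECT relname, seq_scan, idx_scan, n_live_tup
        FROM pg_stat_user_tables
        WHERE seq_scan > COALESCE(idx_scan, 0) AND n_live_tup > 10000
        ORDER BY seq_scan DESC
    """,
    # Who is blocking whom right now.
    "lock_waits": """
        SELECT blocked.pid AS blocked_pid,
               blocking.pid AS blocking_pid,
               blocked.query AS blocked_query
        FROM pg_stat_activity AS blocked
        JOIN pg_stat_activity AS blocking
          ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
        WHERE blocked.wait_event_type = 'Lock'
    """,
}
```

Running these on a schedule (cron + psql, or a small Python loop) and dumping the results to a table gives you a free, queryable history of problem sessions.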
We already have the key extensions enabled (pg_stat_statements, pg_buffercache, etc.), and PostgreSQL is writing logs in CSV format (csvlog). However, the main challenge is analyzing these logs efficiently and identifying performance bottlenecks at scale (databases ranging from ~1TB to 15TB).
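Since csvlog is a fixed column layout, the logs are scriptable with nothing but the standard library. A minimal sketch, assuming the column order documented for PostgreSQL 13 (23 columns ending in application_name; newer versions append columns, so indexing from the front is safe), that flags slow statements and temp-file spills (the sample lines and the 1000 ms threshold are illustrative):

```python
import csv
import io
import re

# csvlog column positions (PostgreSQL 13 layout; later versions only
# append columns at the end, so these front indexes still hold).
LOG_TIME, USER, DB, PID = 0, 1, 2, 3
SEVERITY, MESSAGE = 11, 13

DURATION_RE = re.compile(r"^duration: ([\d.]+) ms")
TEMPFILE_RE = re.compile(r"temporary file: .* size (\d+)")

def scan_csvlog(lines, slow_ms=1000.0):
    """Yield (kind, value, row) for slow statements and temp-file events."""
    for row in csv.reader(lines):
        if len(row) < 14:  # skip truncated/continuation garbage
            continue
        msg = row[MESSAGE]
        m = DURATION_RE.match(msg)
        if m and float(m.group(1)) >= slow_ms:
            yield ("slow_query", float(m.group(1)), row)
        m = TEMPFILE_RE.search(msg)
        if m:
            yield ("temp_file", int(m.group(1)), row)

# Hypothetical sample lines, truncated to 14 columns for illustration.
sample = io.StringIO(
    '2024-05-01 10:00:00 UTC,app,sales,1234,,,,,,,,LOG,00000,'
    '"duration: 2500.123 ms  statement: SELECT * FROM orders"\n'
    '2024-05-01 10:00:01 UTC,app,sales,1234,,,,,,,,LOG,00000,'
    '"temporary file: path ""base/pgsql_tmp/pgsql_tmp1234.0"", size 52428800"\n'
)
events = list(scan_csvlog(sample))
```

For a zero-code alternative, pgBadger parses csvlog directly and produces slow-query, temp-file, and lock reports; the script above is for when you want the raw events in your own pipeline.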
We currently don’t have third-party monitoring tools like Datadog, so I’m looking for recommendations on free or lightweight tools and best practices to:
- Parse and analyze PostgreSQL logs (especially CSV logs)
- Identify top resource-consuming queries and patterns
- Correlate temp usage, memory pressure, and query behavior
- Generate actionable insights for the engineering team
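For the "top resource consumers" part, pg_stat_statements (which you already have) covers it without touching the logs at all. A sketch of the post-processing, with rows shown as plain dicts as they might come back from psycopg; the column names (total_exec_time, temp_blks_written) match pg_stat_statements in PostgreSQL 13+, and the sample rows are made up:

```python
# Rank pg_stat_statements rows by total execution time and by temp
# spill volume. Rows would normally come from:
#   SELECT query, calls, total_exec_time, temp_blks_written
#   FROM pg_stat_statements;
def top_offenders(rows, n=5):
    by_time = sorted(rows, key=lambda r: r["total_exec_time"], reverse=True)[:n]
    by_temp = sorted(rows, key=lambda r: r["temp_blks_written"], reverse=True)[:n]
    return by_time, by_temp

# Hypothetical sample rows for illustration only.
rows = [
    {"query": "SELECT * FROM orders WHERE ...", "calls": 10,
     "total_exec_time": 90000.0, "temp_blks_written": 0},
    {"query": "SELECT ... ORDER BY huge_sort", "calls": 2,
     "total_exec_time": 5000.0, "temp_blks_written": 120000},
]
by_time, by_temp = top_offenders(rows)
```

Snapshotting pg_stat_statements periodically (and diffing snapshots) turns this into a trend report the engineering team can act on, since the view itself only holds cumulative counters.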
Any suggestions on tools, scripts, or approaches that have worked well in similar large-scale environments would be greatly appreciated.
Have you set log_min_duration_statement to some number of milliseconds? Once it's set, every statement that runs longer than that threshold is written to the log along with its duration and parameters. Grep for "duration:" to find them.
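A minimal stand-in for that grep, in Python, for when you want the matches as data rather than text; it assumes the plain-text log format where slow statements appear as "duration: &lt;ms&gt; ms  statement: ..." (the sample lines and 500 ms cutoff are illustrative):

```python
import re

# Matches the line PostgreSQL emits when log_min_duration_statement fires.
PAT = re.compile(r"duration: ([\d.]+) ms\s+statement: (.*)")

def slow_statements(lines, threshold_ms=500.0):
    """Yield (duration_ms, statement) for lines over the threshold."""
    for line in lines:
        m = PAT.search(line)
        if m and float(m.group(1)) >= threshold_ms:
            yield float(m.group(1)), m.group(2)

# Hypothetical log lines for illustration.
log = [
    "2024-05-01 10:00:00 UTC LOG:  duration: 732.418 ms"
    "  statement: SELECT count(*) FROM orders",
    "2024-05-01 10:00:01 UTC LOG:  connection received: host=10.0.0.5",
]
hits = list(slow_statements(log))
```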
Does it require some manual effort? Sure. But it's free.