Hi Team,
We are currently working on a migration project from DB2 to PostgreSQL. Post-migration, we’re observing several performance issues such as long-running queries and occasional instance crashes. It also appears that some application-side workloads may not be optimized for PostgreSQL.
From a DBA perspective, I’m looking to proactively identify problem areas, such as:
- Long-running queries
- Jobs/stored procedures consuming high temp space
- Queries resulting in sequential scans due to missing indexes
- Lock waits, deadlocks, and memory-heavy operations
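For the long-running-query and lock-wait items above, one lightweight approach we are considering is periodically polling pg_stat_activity (whose real columns include pid, state, query_start, wait_event_type, and query). As a rough sketch, assuming the rows have already been fetched with whatever driver is in use, and with an arbitrary 5-minute threshold of my own choosing:

```python
from datetime import datetime, timedelta, timezone

# Arbitrary cutoff for "long-running"; tune per workload.
LONG_QUERY_THRESHOLD = timedelta(minutes=5)

def flag_sessions(rows, now=None):
    """Classify rows fetched from pg_stat_activity.

    Each row is expected to be a dict with at least:
    pid, state, query_start (tz-aware datetime or None),
    wait_event_type, query.
    Returns a list of (pid, reason, query) tuples.
    """
    now = now or datetime.now(timezone.utc)
    flagged = []
    for r in rows:
        # Active sessions whose current statement has run too long.
        if r.get("state") == "active" and r.get("query_start"):
            if now - r["query_start"] > LONG_QUERY_THRESHOLD:
                flagged.append((r["pid"], "long-running", r["query"]))
        # wait_event_type = 'Lock' means the backend is blocked
        # waiting on a heavyweight lock held by another session.
        if r.get("wait_event_type") == "Lock":
            flagged.append((r["pid"], "lock-wait", r["query"]))
    return flagged
```

This is only a polling sketch; deadlocks themselves are resolved by the server and show up in the logs rather than in pg_stat_activity.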
We already have the key extensions enabled (pg_stat_statements, pg_buffercache, etc.), and PostgreSQL is writing logs in CSV format. However, the main challenge is analyzing these logs efficiently and identifying performance bottlenecks at scale (databases ranging from ~1 TB to 15 TB).
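As a starting point for the CSV logs, we have been experimenting with a small parser along these lines. In the standard csvlog layout, log_time is column 0 and message is column 13 (newer server versions append extra columns at the end, so those indexes stay stable), and with log_min_duration_statement set the message looks like "duration: 12.345 ms  statement: ...". A minimal sketch:

```python
import csv
import re
from io import StringIO

# Matches "duration: 12.345 ms  statement: ..." messages; the
# statement part is optional (it is absent when only log_duration
# is enabled rather than log_min_duration_statement).
DURATION_RE = re.compile(
    r"duration: ([\d.]+) ms(?:\s+(?:statement|execute .*?): (.*))?", re.S
)

def parse_csvlog(fileobj, min_ms=0.0):
    """Yield (log_time, duration_ms, statement) for slow-query lines.

    Assumes the standard csvlog column order: log_time at index 0,
    message at index 13. csv.reader copes with quoted multi-line
    statements inside the message field.
    """
    for row in csv.reader(fileobj):
        if len(row) < 14:
            continue
        m = DURATION_RE.match(row[13])
        if m:
            ms = float(m.group(1))
            if ms >= min_ms:
                yield row[0], ms, (m.group(2) or "").strip()
```

This only skims duration lines; temp-file usage (log_temp_files) and deadlock reports land in the same message column and could be pulled out with similar patterns.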
We currently don’t have third-party monitoring tools like Datadog, so I’m looking for recommendations on free or lightweight tools and best practices to:
- Parse and analyze PostgreSQL logs (especially CSV logs)
- Identify top resource-consuming queries and patterns
- Correlate temp usage, memory pressure, and query behavior
- Generate actionable insights for the engineering team
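For the "top resource-consuming queries and patterns" goal, one idea we are weighing is normalizing statements extracted from the logs (stripping literals) and aggregating by query shape, which is roughly what pg_stat_statements does server-side. A crude sketch, with the normalization rules being my own simplification:

```python
import re
from collections import defaultdict

def normalize(sql):
    """Crudely normalize a statement so variants group together:
    replace quoted and numeric literals with placeholders."""
    sql = re.sub(r"'(?:[^']|'')*'", "?", sql)      # string literals
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)   # numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()

def top_queries(entries, n=10):
    """entries: iterable of (duration_ms, statement) pairs.

    Returns the n query shapes with the highest total time, as
    (normalized_sql, calls, total_ms, mean_ms) tuples.
    """
    agg = defaultdict(lambda: [0, 0.0])  # shape -> [calls, total_ms]
    for ms, stmt in entries:
        bucket = agg[normalize(stmt)]
        bucket[0] += 1
        bucket[1] += ms
    ranked = sorted(agg.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(q, calls, total, total / calls)
            for q, (calls, total) in ranked[:n]]
```

The resulting top-N report (total time, call count, mean time per shape) is the kind of summary we would want to hand to the engineering team, so pointers to tools that already do this well would save us from maintaining scripts like the above.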
Any suggestions on tools, scripts, or approaches that have worked well in similar large-scale environments would be greatly appreciated.