On Thu, 2025-12-18 at 13:48 -0500, Matthew Planchard wrote:
> In a table with high insert frequency (~1.5k rows/s) and high query
> frequency (~1k queries/s), partitioned by record creation time, we have
> observed the following behavior:
>
> * When the current time crosses a partition boundary, all new records
> are written to the new partition, which was previously empty, as
> expected
>
> * Because the planner's latest knowledge of the partition was based on
> its state prior to the cutover, it assumes the partition is empty and
> creates plans that use sequential scans
>
> * The table accumulates tens to hundreds of thousands of rows, and the
> sequentail scans start to use nearly 100% of available database CPU
>
> * Eventually the planner updates thee stats and all is well, but the
> cycle repeats the next time the partitions cut over.
>
> We have tried setting up a cron job that runs ANALYZE on the most recent
> partition of the table every 15 seconds at the start of the hour, and
> while this does help in reducing the magnitude and duration of the
> problem, it is insufficient to fully resolve it (our engineers are still
> getting daily pages for high DB CPU utilization).
>
> We have considered maintaining a separate connection pool with
> connections that have `enable_seqscan` set to `off`, and updating the
> application to use that pool for these queries, but I was hoping the
> community might have some better suggestions.
I would try to tune autovacuum to check more often:
autovacuum_naptime = 5s # perhaps even less
Then hopefully the new partitions get analyzed early enough.
Yours,
Laurenz Albe