Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H - Mailing list pgsql-hackers

From	Jim Nasby
Subject	Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H
Date	June 22, 2015 05:16:12
Msg-id	55875BA4.1060706@BlueTreble.com
In response to	Re: pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List	pgsql-hackers

On 6/20/15 12:55 PM, Tomas Vondra wrote:
> Well, actually I think it would be even more appropriate for very large
> tables. With a 2.5TB table, you don't really care whether analyze
> collects 5GB or 8GB sample, the difference is rather minor compared to
> I/O generated by the other queries etc. The current sample is already
> random enough not to work well with read-ahead, and it scans only a
> slightly lower number of blocks.

Have we ever looked at generating new stats as part of a seqscan? I
don't know how expensive the math is, but if it's too much to push to a
backend, perhaps a bgworker could follow behind the seqscan.
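
To give a rough feel for the per-row cost, here is a minimal sketch of
collecting a uniform row sample while a scan streams past, using plain
reservoir sampling (Algorithm R). It is not how ANALYZE or any backend code
works; ScanSampler and sampled_scan are hypothetical names made up for
illustration, and "rows" is just any Python iterable standing in for the
tuples a seqscan would return.

```python
import random


class ScanSampler:
    """Hypothetical helper: keeps a uniform random sample of rows seen so far."""

    def __init__(self, sample_size, seed=None):
        self.sample_size = sample_size
        self.rows_seen = 0
        self.sample = []
        self._rng = random.Random(seed)

    def observe(self, row):
        """Account for one more row; O(1) work per row."""
        self.rows_seen += 1
        if len(self.sample) < self.sample_size:
            # Reservoir not full yet: keep every row.
            self.sample.append(row)
            return
        # Algorithm R: the i-th row replaces a random slot with
        # probability sample_size / i.
        j = self._rng.randrange(self.rows_seen)
        if j < self.sample_size:
            self.sample[j] = row


def sampled_scan(rows, sampler):
    """Yield rows unchanged (as the scan would) while feeding the sampler."""
    for row in rows:
        sampler.observe(row)
        yield row
```

The work per row here is one counter increment and one random number, so
the sampling step itself is cheap; the cost of whatever statistics are then
computed from the sample is a separate question that this sketch does not
address.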
-- 
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com
