Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Improving N-Distinct estimation by ANALYZE
Date
Msg-id 87fynx1ifd.fsf@stark.xeocode.com
In response to Re: Improving N-Distinct estimation by ANALYZE  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: Improving N-Distinct estimation by ANALYZE  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
> > These numbers don't make much sense to me. It seems like 5% is about as slow
> > as reading the whole file which is even worse than I expected. I thought I was
> > being a bit pessimistic to think reading 5% would be as slow as reading 20% of
> > the table.

I have a theory. My test program, like Postgres, reads in 8k chunks. Perhaps
that's fooling Linux into thinking it's a sequential read, so that it is
reading in 32k chunks internally. Each 8k read would then cost four blocks'
worth of I/O, which would effectively make a 25% scan a full table scan. A 5%
scan would then be a 20% scan, which is about where I would have expected the
breakeven point to be.
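
To make the arithmetic concrete, here is a rough sketch of that model. The function name and parameters are made up for illustration, and it assumes every sampled 8k block drags in a whole 32k chunk with no two samples sharing a chunk, so it gives an upper bound on the I/O rather than a measurement.

```python
def effective_scan_fraction(sample_fraction, block_kb=8, chunk_kb=32):
    """Upper bound on the fraction of the table actually read from disk.

    Hypothetical model: each sampled block of block_kb causes the kernel
    to read a full chunk of chunk_kb, and sampled blocks never share a
    chunk. The result is capped at 1.0 (a full table scan).
    """
    blocks_per_chunk = chunk_kb / block_kb  # 32k / 8k = 4 blocks per read
    return min(1.0, sample_fraction * blocks_per_chunk)


for pct in (5, 25):
    eff = effective_scan_fraction(pct / 100)
    print(f"{pct}% sample -> up to {eff:.0%} of the table read")
# 5% sample -> up to 20% of the table read
# 25% sample -> up to 100% of the table read
```

Since randomly chosen blocks will sometimes fall in the same 32k chunk, the real amount read would be somewhat lower than this bound, more so at the higher sampling rates.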

-- 
greg


