Re: pg_upgrade and statistics - Mailing list pgsql-hackers

From Greg Stark
Subject Re: pg_upgrade and statistics
Date
Msg-id CAM-w4HPU2Pf2BB08kRkS+Sp91nYvgrGoYVh5TH38JuucS=L99Q@mail.gmail.com
In response to Re: pg_upgrade and statistics  (Bruce Momjian <bruce@momjian.us>)
Responses Re: pg_upgrade and statistics
List pgsql-hackers
On Tue, Mar 13, 2012 at 7:30 PM, Bruce Momjian <bruce@momjian.us> wrote:
> OK, so a single 44GB tables took 2.5 minutes to analyze;  that is not
> good.  It would require 11 such tables to reach 500GB (0.5 TB), and
> would take 27 minutes.  The report I had was twice as long, but still in
> the ballpark of "too long".  :-(

Well my claim was that it shouldn't scale that way.

Assuming you have the same statistics target, the sample size is the
same regardless of the size of the table. When the table is small a
significant number of the sampled tuples can land on the same pages, so
the number of pages to read can be much smaller than the sample size.
As the table grows it eventually reaches the point where each page
holds about one sampled tuple, but from that point on increasing the
size of the table won't increase the amount of I/O Postgres does.
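
To make that concrete, here's a back-of-envelope sketch -- plain C,
nothing from the tree; 8kB pages and the 30,000-row default sample
(worked out below) are the only inputs -- of how many pages analyze
has to touch as the table grows:

/*
 * Back-of-envelope sketch: with a fixed sample size, the number of
 * distinct pages ANALYZE reads is roughly min(sample rows, pages in
 * the table), so the I/O stops growing once the table has more pages
 * than the sample has rows.  Page size and sample size are the
 * defaults; the min() is a simplification that ignores overlap.
 */
#include <stdio.h>

#define SAMPLE_ROWS 30000       /* 300 * default statistics target of 100 */
#define PAGE_SIZE   8192.0      /* bytes per heap page */

int
main(void)
{
    double gb[] = {0.1, 1.0, 10.0, 44.0, 500.0};
    int    i;

    for (i = 0; i < 5; i++)
    {
        double pages = gb[i] * 1024 * 1024 * 1024 / PAGE_SIZE;
        double pages_read = pages < SAMPLE_ROWS ? pages : SAMPLE_ROWS;

        printf("%7.1f GB table: %10.0f pages, ~%.0f pages read\n",
               gb[i], pages, pages_read);
    }
    return 0;
}

With default settings it levels off at 30,000 pages read somewhere
well below 1GB of table size.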

However, on thinking about it further, hard drives don't really work
that way. Each page that's read in causes a seek -- if the head is on
a different track. Reading two pages on the same track is about the
same speed as reading a single page on that track. So even though
Postgres is doing the same amount of I/O, the hard drive is actually
doing more and more seeks, up to the point where every sampled page is
a separate seek.

analyze.c samples 300 * statistics_target rows -- with the default
target of 100 that comes to 30,000 rows. So the point where each
sampled row is on a separate page would be around 8kB * 30,000, or
240MB. And if each row causes a seek, and each seek takes 5ms, then
that should be about 150s per table. Once your table is large enough
to take 150s to analyze it shouldn't really take any longer as it
grows. I'm not sure at exactly what size that happens, but I'm
guessing it would be at about 8MB * 30,000, or about 240GB per table,
plus or minus an order of magnitude -- i.e. the size at which the
sampled pages end up roughly 8MB apart.
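
Spelling that arithmetic out so the assumptions are easy to tweak (the
5ms seek and the 8MB spacing are guesses, not measurements):

/*
 * The numbers above as arithmetic.  Sample size is 300 * the default
 * statistics target of 100; seek time and page spacing are assumed.
 */
#include <stdio.h>

int
main(void)
{
    double sample_rows = 300.0 * 100;   /* 30,000 sampled rows */
    double page_kb     = 8.0;           /* 8kB heap pages */
    double seek_ms     = 5.0;           /* assumed seek time */
    double spacing_mb  = 8.0;           /* assumed spacing for a full seek */

    printf("one sampled row per page above ~%.0f MB\n",
           sample_rows * page_kb / 1024);
    printf("worst-case analyze time        ~%.0f s\n",
           sample_rows * seek_ms / 1000);
    printf("every page a full seek above   ~%.0f GB\n",
           sample_rows * spacing_mb / 1024);
    return 0;
}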

So in short, I guess the time to analyze does increase as the table
grows larger -- not because the number of rows being sampled grows,
but because the rows are farther apart and each page read turns into a
separate seek.

Hmph. One thing that could speed up analyze on RAID arrays would be
prefetching, so that more than one spindle can be busy at a time.
Reading a less random sample from contiguous blocks of rows would also
be faster, at the cost of statistical accuracy -- we've gone
around on that topic at least once in the past.
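
For concreteness, the sort of prefetching I have in mind is roughly
this -- a sketch only, not what analyze.c does today, and the names
are made up; it assumes the sampled block numbers are known up front
and the relation is a single file:

#define _XOPEN_SOURCE 600       /* for posix_fadvise */
#include <fcntl.h>

#define BLCKSZ         8192
#define PREFETCH_DEPTH 8        /* sampled blocks to keep in flight */

/*
 * Hint the kernel about the next few sampled blocks before reading the
 * current one, so a RAID array can have several spindles seeking at
 * once.  "blocks" is the sorted list of block numbers the sampler will
 * visit; "fd" is the heap file's descriptor.
 */
static void
prefetch_ahead(int fd, const long *blocks, int nblocks, int cur)
{
    int i;

    for (i = cur + 1; i < nblocks && i <= cur + PREFETCH_DEPTH; i++)
        (void) posix_fadvise(fd, (off_t) blocks[i] * BLCKSZ, BLCKSZ,
                             POSIX_FADV_WILLNEED);
}

(IIRC the bitmap heap scan prefetching driven by
effective_io_concurrency uses posix_fadvise the same way, so the
pieces are mostly there already.)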

+1 to the idea that pg_upgrade could run analyze with a
statistics_target of 10 and then let autovacuum analyze it again later
with the normal targets at its leisure.

--
greg

