Re: FYI: Load times for a largish DB in 8.2 vs. 8.3 vs. 8.4 - Mailing list pgsql-general

From Tom Lane
Subject Re: FYI: Load times for a largish DB in 8.2 vs. 8.3 vs. 8.4
Date
Msg-id 18470.1245363496@sss.pgh.pa.us
Whole thread Raw
In response to FYI: Load times for a largish DB in 8.2 vs. 8.3 vs. 8.4  ("Todd A. Cook" <tcook@blackducksoftware.com>)
Responses Re: FYI: Load times for a largish DB in 8.2 vs. 8.3 vs. 8.4
Re: FYI: Load times for a largish DB in 8.2 vs. 8.3 vs. 8.4
List pgsql-general
"Todd A. Cook" <tcook@blackducksoftware.com> writes:
> First, the numbers:

>   PG Version    Load time     pg_database_size     autovac
> ----------------------------------------------------------
>    8.2.13        179 min       92,807,992,820        on
>    8.3.7         180 min       84,048,744,044        on (defaults)
>    8.4b2         206 min       84,028,995,344        on (defaults)
>    8.4b2         183 min       84,028,839,696        off

> The bulk of the data is in 16 tables, each having about 55 million rows of
> the form (int, int, smallint, smallint, int, int, int).  Each table has a
> single partial index on one of the integer columns.

Given that it's multiple tables, it'd be possible for autovacuum to
kick in and ANALYZE the data inserted into earlier tables while the
later ones were still being loaded.  If so, the discrepancy might be
explained by 8.4's more-aggressive statistics target, which means that
a background ANALYZE will take about 10x more work than before.

If you have time to repeat the experiments, it would be interesting to
see what happens with consistent default_statistics_target across 8.3
and 8.4.

            regards, tom lane

pgsql-general by date:

Previous
From: Erik Jones
Date:
Subject: Re: running pg_dump from python
Next
From: Andrew Maclean
Date:
Subject: postgresql-8.3.7 unexpected connection closures