Thread: BUG #14054: "create index using gist ..." on large table never completes.

BUG #14054: "create index using gist ..." on large table never completes.

From
tsingle@muddypaddock.com
Date:
The following bug has been logged on the website:

Bug reference:      14054
Logged by:          Tim Singletary
Email address:      tsingle@muddypaddock.com
PostgreSQL version: 9.5.1
Operating system:   osx 10.11.4
Description:

"create index using gist ..." ran for over three days on 100 million row
table; completes within a couple hours on a 10 million row table.

Had previously tried same experiment with 9.2.3 on a Linux box where the 100
million row table's index built within 5 hours.

The table looks like

        CREATE TABLE mock (
            handle INT NOT NULL
            ,lowest INT NOT NULL
            ,highest INT NOT NULL
            ,stuff TEXT
        );

Data was randomly generated.

The command that never completes is:

create index on mock using gist (handle,(int4range(lowest,highest,'[]')));

Re: BUG #14054: "create index using gist ..." on large table never completes.

From
Emre Hasegeli
Date:
> Had previously tried same experiment with 9.2.3 on a Linux box where the 100
> million row table's index built within 5 hours.

It finished in an hour on my laptop for serially generated 100 million
rows.  I guess the regression is caused by too many overlapping
values.  GiST doesn't tend to perform well in this case.  I would give
SP-GiST a try.
On Wed, Mar 30, 2016 at 1:38 AM,  <tsingle@muddypaddock.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      14054
> Logged by:          Tim Singletary
> Email address:      tsingle@muddypaddock.com
> PostgreSQL version: 9.5.1
> Operating system:   osx 10.11.4
> Description:
>
> "create index using gist ..." ran for over three days on 100 million row
> table; completes within a couple hours on a 10 million row table.
>
> Had previously tried same experiment with 9.2.3 on a Linux box where the 100
> million row table's index built within 5 hours.
>
> The table looks like
>
>         CREATE TABLE mock (
>             handle INT NOT NULL
>             ,lowest INT NOT NULL
>             ,highest INT NOT NULL
>             ,stuff TEXT
>         );
>
> Data was randomly generated.

"Randomly generated" covers a lot of territory and is not sufficient
to be reproducible.  Please provide the random generator.

Thanks,

Jeff

Re: BUG #14054: "create index using gist ..." on large table never completes.

From
Tim Singletary
Date:
My data generator is at https://github.com/flightlesson/mockdata.git

Here=E2=80=99s how I load 100 million rows:

      createdb mockdb
      mockdatagenerator --create-table --nrows 100000000 | psql mockdb

After loading the data

      CREATE INDEX ON mock USING gist =
(handle,(int4range(range_low,range_high,=E2=80=99[]')));

seems to take forever.

Thanks,

tim

> On Apr 6, 2016, at 10:21 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>=20
> On Wed, Mar 30, 2016 at 1:38 AM,  <tsingle@muddypaddock.com> wrote:
>> The following bug has been logged on the website:
>>=20
>> Bug reference:      14054
>> Logged by:          Tim Singletary
>> Email address:      tsingle@muddypaddock.com
>> PostgreSQL version: 9.5.1
>> Operating system:   osx 10.11.4
>> Description:
>>=20
>> "create index using gist ..." ran for over three days on 100 million =
row
>> table; completes within a couple hours on a 10 million row table.
>>=20
>> Had previously tried same experiment with 9.2.3 on a Linux box where =
the 100
>> million row table's index built within 5 hours.
>>=20
>> The table looks like
>>=20
>>        CREATE TABLE mock (
>>            handle INT NOT NULL
>>            ,lowest INT NOT NULL
>>            ,highest INT NOT NULL
>>            ,stuff TEXT
>>        );
>>=20
>> Data was randomly generated.
>=20
> "Randomly generated" covers a lot of territory and is not sufficient
> to be reproducible.  Please provide the random generator.
>=20
> Thanks,
>=20
> Jeff