Re: temporary tables, indexes, and query plans - Mailing list pgsql-performance

From Jon Nelson
Subject Re: temporary tables, indexes, and query plans
Date
Msg-id AANLkTing+x8XCwGfzkoWUtoqtBBk0gv9eOVPqQ=O36Dn@mail.gmail.com
Whole thread Raw
In response to Re: temporary tables, indexes, and query plans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: temporary tables, indexes, and query plans  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
On Wed, Oct 27, 2010 at 2:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Jon Nelson <jnelson+pgsql@jamponi.net> writes:
>> The most recent experiment shows me that, unless I create whatever
>> indexes I would like to see used *before* the large (first) update,
>> then they just don't get used. At all.
>
> You're making a whole lot of assertions here that don't square with
> usual experience.  I think there is some detail about what you're
> doing that affects the outcome, but since you haven't shown a concrete
> example, it's pretty hard to guess what the critical detail is.

First, let me supply all of the changed (from the default) params:

default_statistics_target = 500
maintenance_work_mem = 240MB
work_mem = 256MB
effective_cache_size = 1GB
checkpoint_segments = 128
shared_buffers = 1GB
max_connections = 30
wal_buffers = 64MB
shared_preload_libraries = 'auto_explain'

The machine is a laptop with 4GB of RAM running my desktop. Kernel is
2.6.36, filesystem is ext4 (for data) and ext2 (for WAL logs). The
disk is a really real disk, not an SSD.

The sequence goes exactly like this:

BEGIN;
CREATE TEMPORARY TABLE (20 columns, mostly text, a few int).
COPY (approx 8 million rows, ~900 MB)[1]
UPDATE (2.8 million of the rows)
UPDATE (7 rows)
UPDATE (250 rows)
UPDATE (3500 rows)
UPDATE (3100 rows)
a bunch of UPDATE (1 row)
...

Experimentally, I noticed that performance was not especially great.
So, I added some indexes (three indexes on one column each). One index
is UNIQUE.
The first UPDATE can't use any of the indexes. The rest should be able to.

In my experiments, I found that:

If I place the index creation *before* the copy, the indexes are used.
If I place the index creation *after* the copy but before first
UPDATE, the indexes are used.
If I place the index creation at any point after the first UPDATE,
regardless of whether ANALYZE is run, the indexes are not used (at
least, according to auto_analyze).

Does that help?


[1] I've been saying 10 million. It's really more like 8 million.
--
Jon

pgsql-performance by date:

Previous
From: "Kevin Grittner"
Date:
Subject: Re: Select count(*), the sequel
Next
From: Kenneth Marshall
Date:
Subject: Re: Select count(*), the sequel