Re: Updates are slow.. - Mailing list pgsql-general

From Tom Lane
Subject Re: Updates are slow..
Date
Msg-id 9762.1023894047@sss.pgh.pa.us
Whole thread Raw
In response to Re: Updates are slow..  ("Tom Burke" <lists@spamex.com>)
List pgsql-general
"Tom Burke" <lists@spamex.com> writes:
> After creating this index, the update time jumped up again to 21 mins!
> real    21m3.536s
> user    0m0.010s
> sys    0m0.000s

> This makes no sense to me because the new index is not related to
> the column being updated at all.

Doesn't matter: we're entering a new tuple (new version of the row) so
new entries must be made in all indexes for the table.

> Actually, it is not unique.
> This is not really an emp table, and in fact ~600K of the email
> addresses are duplicates.

Ah so.  The btree code doesn't really like huge numbers of duplicate
keys --- it still works, but the algorithms degenerate to O(N) instead
of O(log N).  I imagine the previous two-column incarnation with
(email, dept_id) was also quite non-unique?

So the bottom line is not to bother with making indexes on highly
redundant columns; they aren't fast and they don't do you any good
anyway.

If the situation is that you've got, say, 600K empty email addresses and
another 600K that are actually useful, you might consider making a
partial index that excludes the empty addresses; then that could be used
to look up real addresses without paying the overhead to index the empty
ones.  See
http://www.postgresql.org/users-lounge/docs/7.2/postgres/indexes-partial.html

            regards, tom lane

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: dynamic querys
Next
From: Manfred Koizar
Date:
Subject: Re: Updates are slow..