Re: problems maintaining boolean columns in a large table - Mailing list pgsql-general

From Timo Klecker
Subject Re: problems maintaining boolean columns in a large table
Date
Msg-id 000301caaaf3$8fb79b30$af26d190$@de
Whole thread Raw
In response to problems maintaining boolean columns in a large table  (Ben Campbell <ben@scumways.com>)
Responses Re: problems maintaining boolean columns in a large table
List pgsql-general
Hi Ben,

could you post your trigger function? When you need to rebuild the index,
you could disable the trigger setting the flag if the article is modified.
This could speed up your UPDATE.



Mit freundlichen Grüßen
Timo Klecker




-----Ursprüngliche Nachricht-----
Von: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] Im Auftrag von Ben Campbell
Gesendet: Dienstag, 9. Februar 2010 12:26
An: pgsql-general@postgresql.org
Betreff: [GENERAL] problems maintaining boolean columns in a large table

I've got a database that holds a bunch of articles in a table called
'article'. It has a bunch of columns, and each row might hold a few KB
of data, say.
I'm maintaining a separate fulltext database, and so I added a boolean
flag, 'needs_indexing' to my 'article' table to keep track of which
articles have been indexed (and I have some trigger functions on
'article' to automatically set the flag if the article is modified).

It all works fine.
Except when I want to rebuild my index from scratch. I need to set all
those flags, but it takes _ages_  to do "UPDATE article SET
needs_indexing=true;" (many hours at least - I've never let it run to
completion)

I _think_ the reason it takes so long is that postgresql doesn't modify
rows in place - it creates an entry for the modified row and zaps the
old one. So by touching _every_ row I'm basically forcing it to rebuild
my whole database... I've got about 2 million rows in 'articles'.
There are a few indexes on columns in 'articles' which obviously will
slow things down too.

I've had a minor attempt at tuning (increased checkpoint_segments) an d
I'm sure there are a bunch of other tricks I could use to bulk-set that
flag in much less time...

But my gut feeling is that the flag would be better off in it's own
table anyway, eg:

CREATE TABLE needs_indexing (
   article_id integer references article(id)
);

So, if an article is listed in this table, it needs indexing.
(maybe with a constraint to ensure uniqueness - I only need articles
entered once in this table)

Does this sound like a reasonable way to go?
Any advice or insight welcome!

Thanks,
Ben.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


pgsql-general by date:

Previous
From: Marc Lustig
Date:
Subject: recovering fs-data from previous installation
Next
From: "Khin, Gerald"
Date:
Subject: trouble with unique constraint