Fwd: (Re)-indexing on updates - Mailing list pgsql-performance
From | Yves Vindevogel |
---|---|
Subject | Fwd: (Re)-indexing on updates |
Date | |
Msg-id | e0ca8bdefa68e65f3b2a77374a77396d@implements.be Whole thread Raw |
List | pgsql-performance |
My DB is quite simple. It holds data about printjobs that come from the windows eventlog. The data is shown on a website. I have one main table: tblPrintjobs. We add some extra data to it. Like the applicationtype, based on rules we define in other tables. When a rule changes, the updates take place (and take so long). Also, when new records are added, this takes place. For instance, rule 1 and rule 2 are changing positions in importance. (1 was before 2, now 2 before 1) The records that hold reference to rule 1 are reset to null (one field) Rule 2 is assigned, then rule 1 is assigned. What I could do is also: delete all from tblRefRules where rule is 1 insert all from tblPrintjobs that are not yet in RefRules for Rule2, then insert all for rule2 That would be a workaround for the MVCC. Not ? BTW: The good rule is: drop index, update, vacuum, create index ? I think I mistook the purpose of vacuum. If I index before the vacuum, my marked records will still be in the index ? Even if all transactions are finished ? Begin forwarded message: <excerpt><bold><color><param>0000,0000,0000</param>From: </color></bold>"Jeffrey W. Baker" <<jwbaker@acm.org> <bold><color><param>0000,0000,0000</param>Date: </color></bold>Sun 21 Aug 2005 21:36:16 CEST <bold><color><param>0000,0000,0000</param>To: </color></bold>Yves Vindevogel <<yves.vindevogel@implements.be> <bold><color><param>0000,0000,0000</param>Subject: </color>Re: [PERFORM] (Re)-indexing on updates </bold> On Sun, 2005-08-21 at 21:18 +0200, Yves Vindevogel wrote: <excerpt> ______________________________________________________________________ Ok, this is a major setback in some of my procedures. From time to time, I must update one field in about 10% of the records. So this will take time. How can I work around that ? Some personal opinions ... 1) Drop indexes, run update, create indexes, vacuum </excerpt> Drop index, update, vacuum, create index -or- update, vacuum, reindex <excerpt>2) Move the field to another table and use joins ? I could delete the records when needed and add them again </excerpt> I'm not familiar with your application, but you could try it and tell us if this works :) <excerpt> This mechanism, of inserting a new record and marking the old one, is that data kept somewhere where I can "see" it ? </excerpt> This is MVCC: multi-version cuncurrency. The old record is kept because there could be an old transaction that can still see it, and cannot yet see the updated record. And no other transaction can see your record until you commit. The old row isn't removed until you vacuum. <excerpt>I need for one app a trace of all my changes in the database. I have a set of triggers to do that for the moment on each table. Could I use that mechanism somehow to avoid my triggers ? Any documentation on that mechanism (hacker stuff like what tables are used) ? </excerpt> You could search the postgresql documentation (or the web) for MVCC. Regards, jwb </excerpt>Met vriendelijke groeten, Bien à vous, Kind regards, <bold>Yves Vindevogel</bold> <bold>Implements</bold> <smaller> </smaller>My DB is quite simple. It holds data about printjobs that come from the windows eventlog. The data is shown on a website. I have one main table: tblPrintjobs. We add some extra data to it. Like the applicationtype, based on rules we define in other tables. When a rule changes, the updates take place (and take so long). Also, when new records are added, this takes place. For instance, rule 1 and rule 2 are changing positions in importance. (1 was before 2, now 2 before 1) The records that hold reference to rule 1 are reset to null (one field) Rule 2 is assigned, then rule 1 is assigned. What I could do is also: delete all from tblRefRules where rule is 1 insert all from tblPrintjobs that are not yet in RefRules for Rule2, then insert all for rule2 That would be a workaround for the MVCC. Not ? BTW: The good rule is: drop index, update, vacuum, create index ? I think I mistook the purpose of vacuum. If I index before the vacuum, my marked records will still be in the index ? Even if all transactions are finished ? Begin forwarded message: > From: "Jeffrey W. Baker" <jwbaker@acm.org> > Date: Sun 21 Aug 2005 21:36:16 CEST > To: Yves Vindevogel <yves.vindevogel@implements.be> > Subject: Re: [PERFORM] (Re)-indexing on updates > > On Sun, 2005-08-21 at 21:18 +0200, Yves Vindevogel wrote: >> >> >> ______________________________________________________________________ >> >> Ok, this is a major setback in some of my procedures. >> From time to time, I must update one field in about 10% of the >> records. >> So this will take time. >> >> How can I work around that ? >> >> Some personal opinions ... >> 1) Drop indexes, run update, create indexes, vacuum > > Drop index, update, vacuum, create index > > -or- > > update, vacuum, reindex > >> 2) Move the field to another table and use joins ? I could delete the >> records when needed and add them again > > I'm not familiar with your application, but you could try it and tell > us > if this works :) > >> >> This mechanism, of inserting a new record and marking the old one, is >> that data kept somewhere where I can "see" it ? > > This is MVCC: multi-version cuncurrency. The old record is kept > because > there could be an old transaction that can still see it, and cannot yet > see the updated record. And no other transaction can see your record > until you commit. The old row isn't removed until you vacuum. > >> I need for one app a trace of all my changes in the database. I have >> a set of triggers to do that for the moment on each table. >> Could I use that mechanism somehow to avoid my triggers ? >> Any documentation on that mechanism (hacker stuff like what tables are >> used) ? > > You could search the postgresql documentation (or the web) for MVCC. > > Regards, > jwb > > > Met vriendelijke groeten, Bien à vous, Kind regards, Yves Vindevogel Implements <smaller> Mail: yves.vindevogel@implements.be - Mobile: +32 (478) 80 82 91 Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76 Web: http://www.implements.be <italic><x-tad-smaller> First they ignore you. Then they laugh at you. Then they fight you. Then you win. Mahatma Ghandi.</x-tad-smaller></italic></smaller> Mail: yves.vindevogel@implements.be - Mobile: +32 (478) 80 82 91 Kempische Steenweg 206 - 3500 Hasselt - Tel-Fax: +32 (11) 43 55 76 Web: http://www.implements.be First they ignore you. Then they laugh at you. Then they fight you. Then you win. Mahatma Ghandi.
Attachment
pgsql-performance by date: