Thread: Enhancement to pg_dump
Hi,<br /><br />I'm very new to hacking postgresql but am using on a very big site (<a href="http://ojp.nationalrail.co.uk">http://ojp.nationalrail.co.uk</a>).One of the issues that we have is moving data froma live database to a reports one. I've hacked an extra option to pg_dump to delete from tables rather than dropping them.<br/><br />Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?<br /><br />Thanks<br /><br />Rob<br/><br />
On Tue, Nov 25, 2008 at 8:39 PM, Rob Kirkbride <rob.kirkbride@gmail.com> wrote: > Hi, > > I'm very new to hacking postgresql but am using on a very big site > (http://ojp.nationalrail.co.uk). One of the issues that we have is moving > data from a live database to a reports one. I've hacked an extra option to > pg_dump to delete from tables rather than dropping them. National Rail use Postgres for their journey planner? Cool :-) > Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here? Yes (and please add details to http://wiki.postgresql.org/wiki/CommitFestOpen so it doesn't get lost), but please note that we're in the middle of the final phase of the development cycle at the moment, so new patches are unlikely to be looked at for at least a couple of months. -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com
Dave,<br /><br />Ok thanks. Yes, we've got over 1/2 billion rows in one of our tables which is interesting!<br /><br />Willpost back soon.<br /><br />Rob<br /><br /><div class="gmail_quote">2008/11/25 Dave Page <span dir="ltr"><<a href="mailto:dpage@pgadmin.org">dpage@pgadmin.org</a>></span><br/><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">On Tue,Nov 25, 2008 at 8:39 PM, Rob Kirkbride <<a href="mailto:rob.kirkbride@gmail.com">rob.kirkbride@gmail.com</a>> wrote:<br/> > Hi,<br /> ><br /> > I'm very new to hacking postgresql but am using on a very big site<br /> >(<a href="http://ojp.nationalrail.co.uk" target="_blank">http://ojp.nationalrail.co.uk</a>). One of the issues that wehave is moving<br /> > data from a live database to a reports one. I've hacked an extra option to<br /> > pg_dumpto delete from tables rather than dropping them.<br /><br /></div>National Rail use Postgres for their journey planner?Cool :-)<br /><div class="Ih2E3d"><br /> > Once I'm happy with it (I'm a bit rusty at C!), do I post the patchhere?<br /><br /></div>Yes (and please add details to<br /><a href="http://wiki.postgresql.org/wiki/CommitFestOpen"target="_blank">http://wiki.postgresql.org/wiki/CommitFestOpen</a> soit doesn't get<br /> lost), but please note that we're in the middle of the final phase of<br /> the development cycleat the moment, so new patches are unlikely to be<br /> looked at for at least a couple of months.<br /><font color="#888888"><br/><br /> --<br /> Dave Page<br /> EnterpriseDB UK: <a href="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br/></font></blockquote></div><br />
"Rob Kirkbride" <rob.kirkbride@gmail.com> writes: > Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here? I would say you should post *before* you have a patch you're happy with. As soon as you have a specific plan of what you want to do it's best to post an outline of it. That way you at least have a chance of avoiding wasting work in the wrong direction. Sometimes things don't really work out that way -- sometimes the plan sounds good and it only becomes apparent there's a better way later -- but you're best off getting the best chance you can. Incidentally, I don't know exactly what the use case you're trying to cover here is but you should consider looking at TRUNCATE instead of DELETE if you're really deleting all the records in the table and can accept locking the table. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
OK thanks for the advice. <br /><br />What I'm trying to overcome is where we've got a long report running and the processthat is taking data from the main database cannot complete because of the drop table. I believe a DELETE (and possiblyTRUNCATE?) doesn't need an exclusive lock on the table and therefore can continue.<br /><br />I've introduced a --delete-not-dropoption which simply does a DELETE FROM % rather than 'DROP and then CREATE'. <br /><br />I hope this soundssensible and I haven't missed something - I'm still learning!<br /><br />Rob<br /><br /><br /><div class="gmail_quote">2008/11/25Gregory Stark <span dir="ltr"><<a href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>></span><br/><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">"RobKirkbride" <<a href="mailto:rob.kirkbride@gmail.com">rob.kirkbride@gmail.com</a>> writes:<br /><br/> > Once I'm happy with it (I'm a bit rusty at C!), do I post the patch here?<br /><br /></div>I would say you shouldpost *before* you have a patch you're happy with. As<br /> soon as you have a specific plan of what you want to doit's best to post an<br /> outline of it. That way you at least have a chance of avoiding wasting work in<br /> the wrongdirection.<br /><br /> Sometimes things don't really work out that way -- sometimes the plan sounds<br /> good and itonly becomes apparent there's a better way later -- but you're<br /> best off getting the best chance you can.<br /><br/> Incidentally, I don't know exactly what the use case you're trying to cover<br /> here is but you should considerlooking at TRUNCATE instead of DELETE if<br /> you're really deleting all the records in the table and can acceptlocking the<br /> table.<br /><font color="#888888"><br /> --<br /> Gregory Stark<br /> EnterpriseDB <ahref="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br /> Ask me about EnterpriseDB'sSlony Replication support!<br /></font></blockquote></div><br />
Rob Kirkbride wrote: > I've introduced a --delete-not-drop option which simply does a DELETE FROM % > rather than 'DROP and then CREATE'. Beware foreign-keys slowing you - TRUNCATE all relevant tables should be the fastest method if possible. > I hope this sounds sensible and I haven't missed something - I'm still > learning! Have you considered restoring to a completely different database (report1/report2) and just switching between them? -- Richard Huxton Archonet Ltd
Richard, Yes, I've changed it use TRUNCATE rather than DELETE and it's working well for us now. The switching of the database is a good idea - thanks. Unfortunately, we've not got enough disk space currently to do that, but if we get problems in the future that will definitely be something we'll consider. Rob Richard Huxton wrote: > Rob Kirkbride wrote: > >> I've introduced a --delete-not-drop option which simply does a DELETE FROM % >> rather than 'DROP and then CREATE'. >> > > Beware foreign-keys slowing you - TRUNCATE all relevant tables should be > the fastest method if possible. > > >> I hope this sounds sensible and I haven't missed something - I'm still >> learning! >> > > Have you considered restoring to a completely different database > (report1/report2) and just switching between them? > >
Rob Kirkbride <rob.kirkbride@gmail.com> writes: > Richard, > > Yes, I've changed it use TRUNCATE rather than DELETE and it's working well for > us now. I'm a bit surprised actually as it sounded like you were aiming to avoid the table lock. A TRUNCATE does require an exclusive lock on the table. It still has advantages over DROP in that there is no window when the table does not exist and any existing references to the table from views or functions will continue to function. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
I must admit I've not read up on the various locks that are set so that's a good point. Is there a good reference for meto read and understand these?<br /><br />I'm guessing though that a delete from and then an insert never requires an exclusivelock, what about adding/deleting constraints?<br /><br />Rob<br /><br /><br /><br /><div class="gmail_quote">2008/11/26Gregory Stark <span dir="ltr"><<a href="mailto:stark@enterprisedb.com">stark@enterprisedb.com</a>></span><br/><blockquote class="gmail_quote" style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">Rob Kirkbride<<a href="mailto:rob.kirkbride@gmail.com">rob.kirkbride@gmail.com</a>> writes:<br /><br /></div><div class="Ih2E3d">>Richard,<br /> ><br /> > Yes, I've changed it use TRUNCATE rather than DELETE and it's working wellfor<br /> > us now.<br /><br /></div>I'm a bit surprised actually as it sounded like you were aiming to avoid the<br/> table lock. A TRUNCATE does require an exclusive lock on the table. It still<br /> has advantages over DROP in thatthere is no window when the table does not<br /> exist and any existing references to the table from views or functionswill<br /> continue to function.<br /><br /><br /> --<br /><div class="Ih2E3d"> Gregory Stark<br /> EnterpriseDB <a href="http://www.enterprisedb.com" target="_blank">http://www.enterprisedb.com</a><br /></div> Askme about EnterpriseDB's RemoteDBA services!<br /></blockquote></div><br />
"Rob Kirkbride" <rob.kirkbride@gmail.com> writes: > I must admit I've not read up on the various locks that are set so that's a > good point. Is there a good reference for me to read and understand these? > > I'm guessing though that a delete from and then an insert never requires an > exclusive lock, what about adding/deleting constraints? There is documentation http://www.postgresql.org/docs/8.3/static/explicit-locking.html However I found it very confusing when I was first learning. It's not really the documentation's fault either, there are just a lot of different lock levels with a lot of different combinations possible. All DML, even selects, take a table-level shared lock on the tables involved which blocks the tables from being dropped or truncated while the query is running. DELETE and UPDATE (and SELECT FOR UPDATE) take exclusive row-level locks. A SELECT can read the old version of the record but another UPDATE will block until your transaction finishes so it can update the most recent version. But an update which doesn't need to look at that record won't be affected at all. TRUNCATE and DROP take exclusive table-level locks which blocks anyone else from even selecting from the table. It also means they can't proceed until all queries which have already started reading the table finish. DROP is still a lot heavier than TRUNCATE because it also has to drop (or search for and throw an error) anything else dependent on the table. triggers, views, etc. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Gregory Stark wrote: > There is documentation > > http://www.postgresql.org/docs/8.3/static/explicit-locking.html > > However I found it very confusing when I was first learning. It's not really > the documentation's fault either, there are just a lot of different lock > levels with a lot of different combinations possible. > > All DML, even selects, take a table-level shared lock on the tables involved > which blocks the tables from being dropped or truncated while the query is > running. > > DELETE and UPDATE (and SELECT FOR UPDATE) take exclusive row-level locks. A > SELECT can read the old version of the record but another UPDATE will block > until your transaction finishes so it can update the most recent version. But > an update which doesn't need to look at that record won't be affected at all. > > TRUNCATE and DROP take exclusive table-level locks which blocks anyone else > from even selecting from the table. It also means they can't proceed until all > queries which have already started reading the table finish. > > DROP is still a lot heavier than TRUNCATE because it also has to drop (or > search for and throw an error) anything else dependent on the table. triggers, > views, etc. > > Thanks for that - it's very useful. As you say I believe the documentation is pretty good, it's just that we're not dealing in simple issues here. I definitely think I should do a delete rather than a truncate (or drop) in my case. Regards Rob