Thread: Vacuum Failure w/7.1beta4 on Linux/Sparc
While testing some existing database applications on 7.1beta4 on my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on attempting to do a vacuum of a table:

NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2

The first line is the error message from pgsql, while the second line is the error message from my application (using the perl Pg module) reporting the error message returned. It appears that this should only be a warning (i.e. NOTICE, not FATAL or ERROR), but it caused the Pg module to throw an error anyway. My application of course checks for errors, sees the error thrown by Pg, and dies, assuming the error was fatal.

This error occurred after a load of about 50k records into the referenced table, a load of 50k records total into a few other tables, and then a few cleanup queries. The part of the application I was testing is a database load from another (old, closed source) database. The vacuum was at the end of the database load, as part of the final cleanup routines.

So, is this a problem with pgsql in general, specific to Linux/Sparc, or a bug in Pg causing it to be too paranoid? Thanks.

---------------------------------------------------------------------------
|   "For to me to live is Christ, and to die is gain."                    |
|                                            --- Philippians 1:21 (KJV)   |
---------------------------------------------------------------------------
|  Ryan Kirkpatrick  |  Boulder, Colorado  |  http://www.rkirkpat.net/    |
---------------------------------------------------------------------------

Ryan Kirkpatrick <pgsql@rkirkpat.net> writes:
> While testing some existing database applications on 7.1beta4 on
> my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on
> attempting to do a vacuum of a table:

> NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
> ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2

This is undoubtedly a backend bug. Can you generate a reproducible test
case?

> So, is this a problem with pgsql in general, specific to
> Linux/Sparc, or a bug in Pg causing it to be too paranoid? Thanks.

Pg did get an ERROR from the vacuum command (note second line). Yes,
there is paranoia right up the line here, but I think that's a good
thing. Somewhere someone is failing to release a buffer refcount,
and we don't know what other consequences that bug might have. Better
to err on the side of caution.

regards, tom lane

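For readers following the NOTICE versus ERROR distinction drawn above, here is a minimal Python sketch of how a client might sort backend message lines by their severity prefix. It is only an illustration: the names `classify` and `command_failed` are hypothetical, the grouping of ERROR and FATAL as "failed" is an assumption, and none of this reflects how the perl Pg module works internally. The sample lines are the backend portions of the messages quoted in this thread.

```python
import re

# Severity prefixes as they appear in the messages quoted in this thread.
# Treating ERROR and FATAL as "the command failed" is an assumption made
# for illustration only.
FAILURE_SEVERITIES = {"ERROR", "FATAL"}

_PREFIX = re.compile(r"^(NOTICE|ERROR|FATAL):\s*(.*)$")


def classify(line):
    """Return (severity, text) for a message line, or (None, line) if no prefix matches."""
    stripped = line.strip()
    match = _PREFIX.match(stripped)
    if match is None:
        return None, stripped
    return match.group(1), match.group(2)


def command_failed(lines):
    """Return True if any line carries an ERROR or FATAL severity prefix."""
    return any(classify(line)[0] in FAILURE_SEVERITIES for line in lines)


messages = [
    "NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)",
    "ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2",
]

for message in messages:
    severity, text = classify(message)
    print(severity, "|", text)

print("command failed:", command_failed(messages))
```

Under these assumptions the NOTICE line alone would not count as a failure, but the ERROR line that follows it does, which matches the point that the vacuum command itself returned an error.
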
On Mon, 12 Mar 2001, Tom Lane wrote:

> Ryan Kirkpatrick <pgsql@rkirkpat.net> writes:
> > While testing some existing database applications on 7.1beta4 on
> > my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on
> > attempting to do a vacuum of a table:
>
> > NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
> > ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2
>
> This is undoubtedly a backend bug. Can you generate a reproducible test
> case?

I will work on it... The code that eventually caused it does a lot of different things, so it will take me a little while to pare it down to a small, self-contained test case. I should have it by this weekend.

Also, two other details I forgot to put in my first email:

a) Running 'vacuumdb -t Jobs {dbname}' about 24 hours after the error (the backend had been completely idle during this time) completed successfully, without error.

b) The disk space where the pgsql database is located is NFS mounted from my Alpha (running Linux of course :). [0] Might this cause the error?

[0] Yes, I know running pgsql on an NFS mount is probably not the greatest idea, but the system only has 1GB of local disk space (almost all used for the system) and is running as a development server only. No valuable data is entrusted to it. Hopefully I will have more local disk space in the near future.

> Pg did get an ERROR from the vacuum command (note second line). Yes,
> there is paranoia right up the line here, but I think that's a good
> thing. Somewhere someone is failing to release a buffer refcount,
> and we don't know what other consequences that bug might have. Better
> to err on the side of caution.

A reasonable amount of paranoia is indeed always healthy. :) Just wanted to know if this might have been a known and harmless warning. I guess not. I will work on a test case and get back, hopefully by the weekend. Thanks for your help.

Ryan Kirkpatrick

On Mon, 12 Mar 2001, Ryan Kirkpatrick wrote:

> While testing some existing database applications on 7.1beta4 on
> my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on
> attempting to do a vacuum of a table:
>
> NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
> ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2

I moved the data directory to a local partition (from the NFS-mounted one it was on) and reran my application. It worked fine this time, vacuuming tables without errors, and the above error was never seen. Looks like pgsql is not NFS safe, or at least not with Linux's implementation. This is good news in that it is not a serious issue, but bad news in that now I really do have to hurry up and get more local space for this box to do anything useful with it. :)

Thanks for everyone's help. TTYL.

Ryan Kirkpatrick

Ryan Kirkpatrick <pgsql@rkirkpat.net> writes:
> On Mon, 12 Mar 2001, Ryan Kirkpatrick wrote:
>> While testing some existing database applications on 7.1beta4 on
>> my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on
>> attempting to do a vacuum of a table:
>>
>> NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
>> ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2

This is probably explained by the problem we found a few days ago with
BufferSync acquiring locks it shouldn't.

regards, tom lane

On Mon, 26 Mar 2001, Tom Lane wrote:

> Ryan Kirkpatrick <pgsql@rkirkpat.net> writes:
> > On Mon, 12 Mar 2001, Ryan Kirkpatrick wrote:
> >> While testing some existing database applications on 7.1beta4 on
> >> my Sparc 20 running Debian GNU/Linux 2.2, I got the following error on
> >> attempting to do a vacuum of a table:
> >>
> >> NOTICE: FlushRelationBuffers(jobs, 1399): block 953 is referenced (private 0, global 1)
> >> ERROR! Can't vacuum table Jobs! ERROR: VACUUM (repair_frag): FlushRelationBuffers returned -2
>
> This is probably explained by the problem we found a few days ago with
> BufferSync acquiring locks it shouldn't.

Yea, it was. I just tried RC1 on the Sparc with my application, with the data directory NFS mounted, and it now ran without errors. Thanks. :)

Ryan Kirkpatrick