Thread: BUG #5439: Table crash after CLUSTER command

BUG #5439: Table crash after CLUSTER command

From
"Stefan Kirchev"
Date:
The following bug has been logged online:

Bug reference:      5439
Logged by:          Stefan Kirchev
Email address:      stefan.kirchev@gmail.com
PostgreSQL version: 8.3.3
Operating system:   Linux
Description:        Table crash after CLUSTER command
Details:

Hello,
I order to keep good performance on tables CLUSTER is done regularly on each
table every Sunday. Almost every time we loose a table which must be
recreated afterward. The error yield is:
pnp=# select * from  alcatel_bss_kpi_tmp.cs_hourly_kpi limit 1;
ERROR:  could not open relation 1663/16404/2426042: No such file or
directory
It is obvious the engine fails to replace the old relfilenode with the new
one.
Is it possible to recover the data from the pg_toast?
Is it mandatory to lock manually table with ACCESS EXCLUSIVE option or it is
done by the engine along with the CLUSTER command?
Thank you!

Re: BUG #5439: Table crash after CLUSTER command

From
"Kevin Grittner"
Date:
"Stefan Kirchev" <stefan.kirchev@gmail.com> wrote:

> PostgreSQL version: 8.3.3

> Description:        Table crash after CLUSTER command

> I order to keep good performance on tables CLUSTER is done
> regularly on each table every Sunday. Almost every time we loose a
> table which must be recreated afterward. The error yield is:
> pnp=# select * from  alcatel_bss_kpi_tmp.cs_hourly_kpi limit 1;
> ERROR:  could not open relation 1663/16404/2426042: No such file
> or directory

My first recommendation would be to apply the fixes for the bugs
found during the last two years by upgrading your executable to
8.3.10.  This does not require a dump and load, but if you have any
GiST indexes, or if you have hash indexes on intervals, you will
need to rebuild those indexes.  To get more details, see:

http://www.postgresql.org/docs/8.3/static/release

FWIW, we CLUSTER a few very small, very frequently updated tables
daily in about 100 databases to ensure that we recover from bloat
from the occasional long-running transaction, and we've *never*
seen this.

If you actually need to cluster *every* table *every* week, you
should review your vacuum policy.

-Kevin

Re: BUG #5439: Table crash after CLUSTER command

From
Tom Lane
Date:
"Stefan Kirchev" <stefan.kirchev@gmail.com> writes:
> Bug reference:      5439
> Logged by:          Stefan Kirchev
> Email address:      stefan.kirchev@gmail.com
> PostgreSQL version: 8.3.3
> Operating system:   Linux
> Description:        Table crash after CLUSTER command
> Details:

> Hello,
> I order to keep good performance on tables CLUSTER is done regularly on each
> table every Sunday. Almost every time we loose a table which must be
> recreated afterward. The error yield is:
> pnp=# select * from  alcatel_bss_kpi_tmp.cs_hourly_kpi limit 1;
> ERROR:  could not open relation 1663/16404/2426042: No such file or
> directory

Can you generate a self-contained test case for this?  Given the lack of
other reports, it seems like there must be something rather odd about
either your table definition or your clustering procedure.

Also, it would be good to update to a less obsolete 8.3.x release.
I don't recognize this offhand as a previously-fixed bug, but 8.3.3
was almost two years ago.

            regards, tom lane

Re: BUG #5439: Table crash after CLUSTER command

From
Stefan Kirchev
Date:
Thank you for the replay.
I will follow your advice. First will upgrade to 8.4 (or at least to the
latest 8.3 release), than will try to reproduce the error.
For now I will stick to using VACUUM FULL and REINDEX.
Thanks again.
Best Regards
Stefan Kirchev

On Mon, Apr 26, 2010 at 5:06 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov
> wrote:

> "Stefan Kirchev" <stefan.kirchev@gmail.com> wrote:
>
> > PostgreSQL version: 8.3.3
>
> > Description:        Table crash after CLUSTER command
>
> > I order to keep good performance on tables CLUSTER is done
> > regularly on each table every Sunday. Almost every time we loose a
> > table which must be recreated afterward. The error yield is:
> > pnp=# select * from  alcatel_bss_kpi_tmp.cs_hourly_kpi limit 1;
> > ERROR:  could not open relation 1663/16404/2426042: No such file
> > or directory
>
> My first recommendation would be to apply the fixes for the bugs
> found during the last two years by upgrading your executable to
> 8.3.10.  This does not require a dump and load, but if you have any
> GiST indexes, or if you have hash indexes on intervals, you will
> need to rebuild those indexes.  To get more details, see:
>
> http://www.postgresql.org/docs/8.3/static/release
>
> FWIW, we CLUSTER a few very small, very frequently updated tables
> daily in about 100 databases to ensure that we recover from bloat
> from the occasional long-running transaction, and we've *never*
> seen this.
>
> If you actually need to cluster *every* table *every* week, you
> should review your vacuum policy.
>
> -Kevin
>