Thread: Known issue with Reindex-based corruption?

Known issue with Reindex-based corruption?

From: Josh Berkus

Folks,

Are there any known issues with index file corruption in the event of a
power-out during REINDEX with 7.2.4?

I *think* the problem is this client's peculiar hardware, but wanted to
eliminate any potential known issues.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     josh@agliodbs.com
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 651-9224
    and non-profit organizations.     San Francisco

Re: Known issue with Reindex-based corruption?

From: Tom Lane

Josh Berkus <josh@agliodbs.com> writes:
> Are there any known issues with index file corruption in the event of a
> power-out during REINDEX with 7.2.4?

There's been a lot of water over the dam since 7.2.4.  I'd suggest
offering more details if you want an informed comment.

One question that comes to mind is were you reindexing a system or user
table?  Another is whether you were using disks that lie about write
completion (SCSI vs IDE)?

            regards, tom lane

Re: Known issue with Reindex-based corruption?

From: Josh Berkus

Tom,

> One question that comes to mind is were you reindexing a system or user
> table?

User.

> Another is whether you were using disks that lie about write
> completion (SCSI vs IDE)?

First thing I thought of.   Haven't been able to verify, yet.

The basic symptoms are:
1) Machine started scheduled REINDEX.
2) Unexpected power-out
3) On reboot, we have 2 different versions of the index file on disk,
one with 0 bytes.   Attempts to use the index (via SELECT) result in
statement-fatal errors.
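
One way to look for the symptom in item 3 is to scan the data directory for zero-length files. The following is only a minimal sketch: the function name and the idea of pointing it at the PostgreSQL data directory are assumptions for illustration, not something from this thread. A zero-byte file is not proof of corruption by itself (an empty table also has an empty file), so any hit would still have to be matched against the relfilenode value in the index's pg_class row.

```python
import os


def find_zero_byte_files(root):
    """Return a sorted list of zero-length regular files found under root."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Ignore symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)
```

Calling find_zero_byte_files() on a copy of the data directory, rather than on the live one, avoids touching a cluster that may already be damaged.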

I'm waiting on more data.   For now, I was wondering whether there was a known
issue with WAL recovery on indexes in 7.2.4.   Neil thought there was.

--
-Josh Berkus

Re: Known issue with Reindex-based corruption?

From: Tom Lane

Josh Berkus <josh@agliodbs.com> writes:
> The basic symptoms are:
> 1) Machine started scheduled REINDEX.
> 2) Unexpected power-out
> 3) On reboot, we have 2 different versions of the index file on disk,
> one with 0 bytes.   Attempts to use the index (via SELECT) result in
> statement-fatal errors.

Hm.  Unless the REINDEX actually *completed* before the power-out, it
should not have had any effect other than creation of an unreferenced
file.  My guess is that the reindex did complete, and updated the
index's pg_class row to point at the new file, but for some reason only
the pg_class update got down to disk.

> I'm waiting on more data.  For now, I was wondering whether there was
> a known issue with WAL recovery on indexes in 7.2.4.  Neil thought
> there was.

That's a definite possibility.  Before 7.4 we did not emit WAL records
for data written during index build.  What we could have here is that
the transaction completed and synced to WAL, but none of the data-file
writes were sent to disk before power-out.  On restart, WAL replay would
faithfully update the pg_class row, but the index file would still be
empty :-(

            regards, tom lane
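
The failure sequence described in the message above can be illustrated with a toy model. This is a sketch under heavy assumptions, not PostgreSQL code: the dictionaries standing in for the disk, the tuple-based WAL, and every name below are hypothetical. It models a REINDEX that commits, a power loss that throws away unflushed index data, and a WAL replay that restores only what was logged.

```python
def reindex(disk, wal, entries, log_index_data):
    """Model a REINDEX that commits; return the index data still held in cache."""
    disk["files"]["new"] = []             # the new file exists on disk but is empty
    pending = {"new": list(entries)}      # index contents written, not yet flushed
    if log_index_data:
        # Models a server that emits WAL records for index-build data.
        wal.append(("index_data", "new", list(entries)))
    # Commit: the catalog update is synced to WAL either way.
    wal.append(("catalog", "idx", "new"))
    return pending


def recover(disk, wal):
    """Replay the WAL onto what survived on disk; return the index contents."""
    for kind, key, value in wal:
        if kind == "catalog":
            disk["catalog"][key] = value
        elif kind == "index_data":
            disk["files"][key] = list(value)
    return disk["files"][disk["catalog"]["idx"]]


def simulate(log_index_data):
    """Run reindex, lose the cache to a power failure, then recover."""
    disk = {"catalog": {"idx": "old"}, "files": {"old": ["a", "b"]}}
    wal = []
    _lost = reindex(disk, wal, ["a", "b"], log_index_data)  # cache is discarded
    return recover(disk, wal)
```

In this model simulate(False) returns an empty list: the catalog points at the new file, but nothing ever filled it. simulate(True) returns the rebuilt entries, because the index data can be replayed from the log.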

Re: Known issue with Reindex-based corruption?

From: Josh Berkus

Tom,

> That's a definite possibility.  Before 7.4 we did not emit WAL records
> for data written during index build.  What we could have here is that
> the transaction completed and synced to WAL, but none of the data-file
> writes were sent to disk before power-out.  On restart, WAL replay would
> faithfully update the pg_class row, but the index file would still be
> empty :-(

Would this be back-patchable by a good PG hacker?   The client has $$$.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

Re: Known issue with Reindex-based corruption?

From: Tom Lane

Josh Berkus <josh@agliodbs.com> writes:
> Would this be back-patchable by a good PG hacker?   The client has $$$.

It'd be more productive for them to update to 7.4 ...

            regards, tom lane

Re: Known issue with Reindex-based corruption?

From: Josh Berkus

Tom,

> It'd be more productive for them to update to 7.4 ...

It's a distributed app, meaning that they have boxes in the field which
cannot practically be updated remotely.

They'll be using 7.4 for *new* boxes, sometime around November.   Their
requirements include 6 months of testing before release.

--
-Josh Berkus