Thread: Database possible corruption, unsolvable mystery
Greetings,

We have an issue with a database containing many tables. The database is laid out as three sets of look-alike tables with different names. Each set of tables has referential integrity constraints pointing back to the main control table.

On two of the sets we can delete the referencing and main records efficiently and without problems, but on the third set the control table is clogged and deletes seem to take forever. For example, on the first two sets a delete of 60K records takes about 4 to 10 seconds, but on the third set it can take as long as 3 hours.

This seems to affect only one database; the schema and procedures are replicated elsewhere and perform efficiently.

The problem still arises even after dropping the third set's referential integrity, dumping the table data, dropping the table, doing a manual checkpoint, and recreating the table from the dumped data, with or without referential integrity.

If we copy the data from the live table with "create table aaa as select * from problematic_table;", operations on table aaa are normal and efficient.

This is why our investigation brought us to the following questions:

1. Are PostgreSQL data file names hashed references to the table name (as in Oracle)? [~path to data, e.g. /var/log/pgsql/data/[arbitrary numbers]/[datafile]]

2. If the data files are corrupted and we re-create the table, is it possible it uses the same files, thus creating the same issue?

3. Since we know which tables have the problem, is there an internal table with undisclosed references to the tables' data files?

I hope the questions were clear.

Have a good day, and thank you in advance.

Eric Lauzon
[Recherche & Développement]
Above Sécurité / Above Security
Tél : (450) 430-8166
Fax : (450) 430-1858
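A rough sketch of the rebuild steps described above, with entirely hypothetical table, constraint and column names (none of the real names appear in this thread), assuming the control table is "events" and "event_details" references it:

    -- drop the third set's referential integrity (constraint name is hypothetical)
    ALTER TABLE event_details DROP CONSTRAINT event_details_event_id_fkey;

    -- dump the table data, drop the table, force a manual checkpoint
    COPY events TO '/tmp/events.dump';
    DROP TABLE events;
    CHECKPOINT;

    -- recreate the table (placeholder definition) and reload the dumped data
    CREATE TABLE events (event_id integer PRIMARY KEY, payload text);
    COPY events FROM '/tmp/events.dump';

    -- the copy that behaves normally in the report above
    CREATE TABLE aaa AS SELECT * FROM events;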
Eric Lauzon wrote:
> This is why our investigation brought us to the following questions:
>
> 1. Are PostgreSQL data file names hashed references to the table
> name (as in Oracle)? [~path to data, e.g. /var/log/pgsql/data/[arbitrary
> numbers]/[datafile]]

OID numbers - look in the contrib directory/package for the oid2name utility.

> 2. If the data files are corrupted and we re-create the table, is it
> possible it uses the same files, thus creating the same issue?

No.

> 3. Since we know which tables have the problem, is there an
> internal table with undisclosed references to the tables' data files?

You mean a system table that could account for your problems since it refers to some of your tables but not others? No.

The obvious places to start are:
1. vacuum analyse verbose on the tables in question
   This should show whether there are a lot of "dead" rows
2. explain analyse on problem queries
   To see if the query plans are correct
3. SELECT * FROM pg_stat_???
   Assuming you have statistics gathering turned on, this might show unusual table accesses.

HTH
--
  Richard Huxton
  Archonet Ltd
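For what it's worth, a minimal sketch of those three checks, with "events" as a placeholder name for the problem table and assuming the statistics collector is enabled (pg_stat_user_tables is one of the standard statistics views):

    -- 1. dead-row check (VERBOSE comes before ANALYZE in the 7.4 syntax)
    VACUUM VERBOSE ANALYZE events;

    -- 2. plan check on a problem query (placeholder query)
    EXPLAIN ANALYZE SELECT count(*) FROM events;

    -- 3. table access statistics
    SELECT relname, seq_scan, idx_scan, n_tup_ins, n_tup_del
      FROM pg_stat_user_tables
     WHERE relname = 'events';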
Can you post an explain analyze for the delete query? That will at least tell you if it is the delete itself which is slow, or a trigger / referential integrity constraint check.

Which version of PG is this?

-- Mark Lewis

On Wed, 2006-03-29 at 12:58 -0500, Eric Lauzon wrote:
> [original problem report quoted in full; see above]
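For example, something along these lines (table and column names are placeholders); wrapping it in a transaction and rolling back lets you capture the timing without actually removing the rows. Note that in 7.4 the time spent in foreign-key triggers does not show up as a separate line, only in the gap between the top plan node's time and the total runtime:

    BEGIN;
    EXPLAIN ANALYZE DELETE FROM events WHERE batch_id = 1234;
    ROLLBACK;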
> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
> Sent: 29 mars 2006 17:10
> To: Eric Lauzon
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Database possible corruption, unsolvable mystery
>
> Eric Lauzon wrote:
> > 1. Are PostgreSQL data file names hashed references to the table
> > name (as in Oracle)? [~path to data, e.g. /var/log/pgsql/data/[arbitrary
> > numbers]/[datafile]]
>
> OID numbers - look in the contrib directory/package for the
> oid2name utility.

This will give me the location of the database files for a specific table or index?

> > 2. If the data files are corrupted and we re-create the table, is it
> > possible it uses the same files, thus creating the same issue?
>
> No.

Hmm, then why would it affect only the original table, and a copy of that table renamed back to the original table name, but not the copy itself?

Example:
original table name:   table_problem        <issue>
copy name:             table_problem_copy   <no issue>
renamed copied table:  table_problem        <issue>

> > 3. Since we know which tables have the problem, is there an
> > internal table with undisclosed references to the tables' data files?
>
> You mean a system table that could account for your problems
> since it refers to some of your tables but not others? No.

Well, actually it is affecting only one table in a set of 5 tables (referential integrity), and the affected table is the [referenced table], so it might be system related. But, as stated, if all the data is copied with "create table copy_of_problematic_table as select * from problematic_table" there is no issue; as soon as copy_of_problematic_table is renamed to problematic_table the problem is back.

We also have two other sets of 5 tables in the same database, built exactly the same way, and they don't seem to be affected by the same problem. This is why I am wondering why the problem recurs if the internal PostgreSQL data files are name-bound... and I am not talking about the OID.

> The obvious places to start are:
> 1. vacuum analyse verbose on the tables in question
>    This should show whether there are a lot of "dead" rows
> 2. explain analyse on problem queries
>    To see if the query plans are correct
> 3. SELECT * FROM pg_stat_???
>    Assuming you have statistics gathering turned on, this
>    might show unusual table accesses.

By the way, I can't give vacuum info right now because the source database is being dumped for a complete re-insertion. Maybe later, if this doesn't fix the problem.

For information, it's 7.4.6 [I know it's not the most recent], but that is the way it is right now. We suspect the problem might have come from a power outage during a full vacuum, and the reason only one table has been affected is probably that it was the table being vacuumed. But this is only an assumption right now, and more info will follow if the problem persists after a full restore.

Thank you :)

-elz
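If it helps settle the name-versus-file question: the file under the data directory is determined by the table's relfilenode, not its name. A dropped and recreated table gets a new relfilenode (and a new file), while ALTER TABLE ... RENAME keeps the same one. A quick way to look, with placeholder names:

    SELECT relname, oid, relfilenode
      FROM pg_class
     WHERE relname IN ('problematic_table', 'copy_of_problematic_table');

    -- from the shell, contrib's oid2name maps those numbers back to names;
    -- exact options vary by version, roughly:
    --   oid2name -d yourdb -t problematic_table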
Eric Lauzon wrote:
> Maybe later, if this doesn't fix the problem. For information, it's
> 7.4.6 [I know it's not the most recent], but that is the way it is right
> now. We suspect the problem might have come from a power outage during a
> full vacuum, and the reason only one table has been affected is probably
> that it was the table being vacuumed. But this is only an assumption
> right now, and more info will follow if the problem persists after a
> full restore.

Hrm, you know that you -should- upgrade to at least the latest 7.4 (7.4.13 I think is the most recent). Looking at the changelogs, there are a few bugs that you could be hitting:

7.4.10
 * Fix race condition in transaction log management
   There was a narrow window in which an I/O operation could be initiated
   for the wrong page, leading to an Assert failure or data corruption.

7.4.9
 * Improve checking for partially-written WAL pages
 * Fix error that allowed VACUUM to remove ctid chains too soon, and add
   more checking in code that follows ctid links. This fixes a
   long-standing problem that could cause crashes in very rare
   circumstances.

7.4.8
 * Repair race condition between relation extension and VACUUM
   This could theoretically have caused loss of a page's worth of
   freshly-inserted data, although the scenario seems of very low
   probability. There are no known cases of it having caused more than an
   Assert failure.

And these are only the ones that appear 'notably' in the changelog. In short, I -really- -would- -strongly- -advise- you to upgrade to 7.4.13. Personally, I would have made this my first step, especially if your data is important.

There is no need for a dump/reload between minor point releases, although there is a security fix in 7.4.8.

Since the db is in a state of 'down' or repair, why not do it now? Two birds, one stone.

Regards

Stef
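A quick sanity check of what is actually running, before and after a minor update (no dump/reload needed between 7.4.x releases, as noted above):

    SELECT version();    -- should report 7.4.13 after the update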
> In short, I -really- -would- -strongly- -advise- you to upgrade to
> 7.4.13. Personally, I would have made this my first step, especially
> if your data is important.
>
> There is no need for a dump/reload between minor point releases,
> although there is a security fix in 7.4.8.
>
> Since the db is in a state of 'down' or repair, why not do it now?
> Two birds, one stone.

Thank you, this might be a good solution, but we have a bigger upgrade coming to 8.1.x later on. Considering that other things out of our hands might occur, we might seriously look into it after fixing the current problem :) [because we don't think that upgrading right now will magically fix the problem we are having.]

And across about 10 databases [all 7.4.6] this is the first time this has occurred, and the symptom is really on one table. Considering a few terabytes of data spread across a few DBs, we might have been lucky so far, but as of now it's the first time we can see a performance hit only on "delete".

But thanks a lot for the hint. [Even though we never had an unexpected data failure/crash] beyond this out-of-our-control power failure that might have been the root of this [the database is still dumping... a few gigs :)].

Thanks a lot to all for the help, and if we find the root cause we will give feedback.

-elz
Eric,

> Thank you, this might be a good solution, but we have a bigger upgrade
> coming to 8.1.x later on. Considering that other things out of our
> hands might occur, we might seriously look into it after fixing the
> current problem :) [because we don't think that upgrading right now
> will magically fix the problem we are having.]

It probably won't, but it will prevent a recurrence before you get around to the 8.1 upgrade. How much time have you wasted on this issue already, an issue which might not have occurred if you'd kept up with patch releases? A patch upgrade is what, 5 minutes of downtime?

> And across about 10 databases [all 7.4.6] this is the first time this
> has occurred, and the symptom is really on one table. Considering a few
> terabytes of data spread across a few DBs, we might have been lucky so
> far, but as of now it's the first time we can see a performance hit
> only on "delete".

Well, that would be in line with the issues fixed in 7.4.7-7.4.12. All of them require millisecond timing to hit the bug. You're not likely to see it more than once.

--
Josh Berkus
Aglio Database Solutions
San Francisco