Thread: BUG #16540: Possible corrupted file?

BUG #16540: Possible corrupted file?

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      16540
Logged by:          Paul Hatcher
Email address:      paul.hatcher@sentinelpartners.co.uk
PostgreSQL version: 11.8
Operating system:   Windows Server 2012
Description:

We had a situation where a query that normally ran in 10 seconds instead
never completed.  No conflicts or locks could be seen.  All indexes were
valid and in place.  A full shutdown and restart of the server was done with
no effect.
To test we took a full backup of this system and restored to an identical
alternate system.  Having done this, the alternate system ran the query in
10 seconds.
Final test was to backup and restore back into the production system -
having done this the query returned to running in 10 seconds.  Do you have
any idea what could have caused this?

Thanks
Paul


Re: BUG #16540: Possible corrupted file?

From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes:
> We had a situation where a query that normally ran in 10 seconds instead
> never completed.  No conflicts or locks could be seen.  All indexes were
> valid and in place.  A full shutdown and restart of the server was done with
> no effect.
> To test we took a full backup of this system and restored to an identical
> alternate system.  Having done this, the alternate system ran the query in
> 10 seconds.
> Final test was to backup and restore back into the production system -
> having done this the query returned to running in 10 seconds.  Do you have
> any idea what could have caused this?

The most obvious theory is a change of query plan, perhaps due to having
up-to-date ANALYZE statistics in one case and not the other.  I don't
suppose you captured EXPLAIN output for the non-working state?

            regards, tom lane



Re: BUG #16540: Possible corrupted file?

From
Paul Hatcher
Date:
Hi Tom

Many thanks for the feedback.

I did do an explain on both the 10second and the never ending.  The actual structure of the plan was identical - there
werea couple of very minor differences in costs and widths but we're talking cost of 200,000 vs 199,500 kind of
differences.

Thanks

Paul Hatcher
Senior Consultant
Sentinel Partners Limited
W: sentinelpartners.co.uk
M: 07788 673898
T: 0800 612 2116
 
 

On 14/07/2020, 15:01, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

    PG Bug reporting form <noreply@postgresql.org> writes:
    > We had a situation where a query that normally ran in 10 seconds instead
    > never completed.  No conflicts or locks could be seen.  All indexes were
    > valid and in place.  A full shutdown and restart of the server was done with
    > no effect.
    > To test we took a full backup of this system and restored to an identical
    > alternate system.  Having done this, the alternate system ran the query in
    > 10 seconds.
    > Final test was to backup and restore back into the production system -
    > having done this the query returned to running in 10 seconds.  Do you have
    > any idea what could have caused this?
    
    The most obvious theory is a change of query plan, perhaps due to having
    up-to-date ANALYZE statistics in one case and not the other.  I don't
    suppose you captured EXPLAIN output for the non-working state?
    
                regards, tom lane