Re: BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes - Mailing list pgsql-bugs

From Alvaro Herrera
Subject Re: BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes
Date
Msg-id 20131128025653.GD5513@eldon.alvh.no-ip.org
In response to BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes  (strahinjak@nordeus.com)
Responses Re: BUG #8632: file "pg_subtrans/CEC0" doesn't exist, reading as zeroes  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-bugs
strahinjak@nordeus.com wrote:

Uh, this is a bit funny.  Redo failed to find the file for a long time
and didn't think to error out, instead choosing to read the requested
page as zeroes:

> 2013-11-26 12:24:54 CET [6393]: [7-1]LOG:  file "pg_subtrans/CEC0" doesn't
> exist, reading as zeroes
> 2013-11-26 12:24:54 CET [6393]: [8-1]CONTEXT:  xlog redo xid assignment xtop
> 3468737450: subxacts: 3468738448 3468738450 3468738453 3468738455 3468738457
> 3468738459 3468738461 3468738463 3468738465 3468738467 3468738469 3468738471
> 3468738473 3468738475 3468738477 3468738479 3468738481 3468738483 3468738485
> 3468738487 3468738489 3468738491 3468738493 3468738495 3468738497 3468738499
> 3468738501 3468738503 3468738505 3468738507 3468738530 3468738532 3468738534
> 3468738536 3468738538 3468738540 3468738542 3468738544 3468738546 3468738548
> 3468738550 3468738552 3468738554 3468738556 3468738558 3468738560 3468738562
> 3468738564 3468738566 3468738568 3468738570 3468738572 3468738574 3468738576
> 3468738578 3468738580 3468738582 3468738584 3468738586 3468738588 3468738590
> 3468738592 3468738595 3468738597

But as soon as the pg_subtrans file exists, any other error (seek or
read failure) is fatal:

> 2013-11-26 12:24:57 CET [6393]: [103-1]FATAL:  could not access status of
> transaction 3468818432
> 2013-11-26 12:24:57 CET [6393]: [104-1]DETAIL:  Could not read from file
> "pg_subtrans/CEC1" at offset 253952: Success.

Both of these things happen in SlruPhysicalReadPage().  I think those
hard failures are a mistake.  In other words, we should do something
like this, which matches what already happens on ENOENT:

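    if (lseek(fd, (off_t) offset, SEEK_SET) < 0)
    {
        if (InRecovery)
        {
            /* During recovery, treat an unreachable page as all zeroes. */
            ereport(LOG,
                (errmsg("file \"%s\" doesn't contain page %u, reading as zeroes",
                    path, page_number)));
            MemSet(shared->page_buffer[slotno], 0, BLCKSZ);
            close(fd);      /* the open succeeded, so don't leak the descriptor */
            return true;
        }

        /* Outside recovery, keep reporting the seek failure to the caller. */
        slru_errcause = SLRU_SEEK_FAILED;
        slru_errno = errno;
        close(fd);
        return false;
    }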

And equivalently for the read() failure.
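
To make the read() side concrete, here is a stand-alone sketch of the same
idea.  It is not PostgreSQL code: read_page_or_zeroes(), PAGE_BYTES (a
stand-in for BLCKSZ) and the in_recovery flag are hypothetical names used
only for illustration, and the caller is assumed to own and close the file
descriptor.  It also shows why the DETAIL line above ends in "Success": a
short read is not an error as far as read() is concerned, so there is no
errno to report.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_BYTES 8192         /* hypothetical stand-in for BLCKSZ */

/*
 * Hypothetical helper, not part of PostgreSQL.  Reads one page at the
 * given offset into buf.  If the seek fails, the read fails, or the read
 * comes up short, then: in recovery the page is logged and returned as
 * zeroes; otherwise the function fails and stores the error number (0 for
 * a plain short read) in *saved_errno.  The caller keeps ownership of fd.
 */
static bool
read_page_or_zeroes(int fd, off_t offset, char *buf,
                    bool in_recovery, int *saved_errno)
{
    ssize_t nread;

    if (lseek(fd, offset, SEEK_SET) < 0)
        nread = -1;
    else
        nread = read(fd, buf, PAGE_BYTES);

    if (nread == PAGE_BYTES)
        return true;            /* got the whole page */

    if (in_recovery)
    {
        fprintf(stderr,
                "page at offset %lld is unreadable, reading as zeroes\n",
                (long long) offset);
        memset(buf, 0, PAGE_BYTES);
        return true;
    }

    /* Only a -1 return carries an errno; a short read reports 0. */
    *saved_errno = (nread < 0) ? errno : 0;
    return false;
}
```

Folding the seek and read failures into one check is a choice made for the
sketch; the real function reports them as separate error causes.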

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
