Thread: pageinspect get_raw_page() option to return invalid pages

pageinspect get_raw_page() option to return invalid pages

From

Andres Freund

Date:

05 May 2018, 00:41:20

Hi,

Currently there's no convenient way to get a corrupted page (be it a
checksum failure, corrupted pd_upper/lower or whatnot) via pageinspect's
get_raw_page(). Given that pageinspect is kind of our tool to do inspect
issues on a data level that's not so great.

I therefore propose that we add an *option* that bypasses shared buffers
and reads the underlying data directly. And then skips over the
validation.  I'm not 100% sure if this should be done unconditionally,
or only if the item wasn't found in cache. There's some coherency
differences, obviously.

We could also just make it two functions, instead of a parameter.

It's unfortunately not entirely trivial to access specific blocks with
pg_read_binary_file() - one needs a query that deals with block sizes,
segment numbers, segment sizes etc...

Greetings,

Andres Freund

Re: pageinspect get_raw_page() option to return invalid pages

From

Peter Geoghegan

Date:

05 May 2018, 00:42:35

On Fri, May 4, 2018 at 11:41 AM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> Currently there's no convenient way to get a corrupted page (be it a
> checksum failure, corrupted pd_upper/lower or whatnot) via pageinspect's
> get_raw_page(). Given that pageinspect is kind of our tool to do inspect
> issues on a data level that's not so great.
>
> I therefore propose that we add an *option* that bypasses shared buffers
> and reads the underlying data directly. And then skips over the
> validation.  I'm not 100% sure if this should be done unconditionally,
> or only if the item wasn't found in cache. There's some coherency
> differences, obviously.

+1. This would be a good option for amcheck, too.


-- 
Peter Geoghegan

Re: pageinspect get_raw_page() option to return invalid pages

From

Andres Freund

Date:

05 May 2018, 00:46:56

On 2018-05-04 11:42:35 -0700, Peter Geoghegan wrote:
> On Fri, May 4, 2018 at 11:41 AM, Andres Freund <andres@anarazel.de> wrote:
> > Hi,
> >
> > Currently there's no convenient way to get a corrupted page (be it a
> > checksum failure, corrupted pd_upper/lower or whatnot) via pageinspect's
> > get_raw_page(). Given that pageinspect is kind of our tool to do inspect
> > issues on a data level that's not so great.
> >
> > I therefore propose that we add an *option* that bypasses shared buffers
> > and reads the underlying data directly. And then skips over the
> > validation.  I'm not 100% sure if this should be done unconditionally,
> > or only if the item wasn't found in cache. There's some coherency
> > differences, obviously.
> 
> +1. This would be a good option for amcheck, too.

Could you expand on that? Are you envisioning an option to
ReadBufferExtended()? Because that's certainly not what I'm thinking of
- it seems dangerous to populate shared buffers with an invalid
page. Therefore I was more thinking to falling back to smgrread() or
such.

Greetings,

Andres Freund

Re: pageinspect get_raw_page() option to return invalid pages

From

Peter Geoghegan

Date:

05 May 2018, 00:53:25

On Fri, May 4, 2018 at 11:46 AM, Andres Freund <andres@anarazel.de> wrote:
> Could you expand on that? Are you envisioning an option to
> ReadBufferExtended()? Because that's certainly not what I'm thinking of
> - it seems dangerous to populate shared buffers with an invalid
> page. Therefore I was more thinking to falling back to smgrread() or
> such.

I'm not envisaging anything specific just yet. It would be nice if
amcheck had an option that bypassed shared_buffers, because users want
that. That's all.

-- 
Peter Geoghegan

Re: pageinspect get_raw_page() option to return invalid pages

From

Andres Freund

Date:

05 May 2018, 00:56:31

On 2018-05-04 11:53:25 -0700, Peter Geoghegan wrote:
> On Fri, May 4, 2018 at 11:46 AM, Andres Freund <andres@anarazel.de> wrote:
> > Could you expand on that? Are you envisioning an option to
> > ReadBufferExtended()? Because that's certainly not what I'm thinking of
> > - it seems dangerous to populate shared buffers with an invalid
> > page. Therefore I was more thinking to falling back to smgrread() or
> > such.
> 
> I'm not envisaging anything specific just yet. It would be nice if
> amcheck had an option that bypassed shared_buffers, because users want
> that. That's all.

Can you expand on what they want?

- Avoid polluting caches? Why's the ringbuffer logic not good enough?
- Continue after a checksum or similar failure? That seems a bit useless
  for amcheck imo? You know there's corruption at that point after all.
- Read on disk data, bypassing shared buffers? That'd present a lot of
  coherency issues, no?

Greetings,

Andres Freund

Re: pageinspect get_raw_page() option to return invalid pages

From

Peter Geoghegan

Date:

05 May 2018, 04:16:57

On Fri, May 4, 2018 at 11:56 AM, Andres Freund <andres@anarazel.de> wrote:
> Can you expand on what they want?
>
> - Avoid polluting caches? Why's the ringbuffer logic not good enough?
> - Continue after a checksum or similar failure? That seems a bit useless
>   for amcheck imo? You know there's corruption at that point after all.
> - Read on disk data, bypassing shared buffers? That'd present a lot of
>   coherency issues, no?

I think that "Read on-disk data" would be a compelling feature in
certain environments. It would present some concurrency issues, but
those seem solvable. The only thing that the "!readonly &&
!heapallindexed" checks need that complicates things is a snapshot
that prevents concurrent recycling by VACUUM. If we gave up on the
cross-page invariant check, then we wouldn't even need to worry about
concurrent recycling by VACUUM, while still offering almost the same
coverage as "!readonly && !heapallindexed" (I suppose we'd also have
to give up on the level cross-check that you suggested when v1 of
amcheck went in, too, but I think that that's it).

Maybe it would a better use of my time to focus on making this
accessible to backup tools, that should ideally work without needing
to acquire any MVCC snapshot. Probably from a front-end utility. We'd
need to export at least some operator class functionality to make that
work, though.

-- 
Peter Geoghegan