Re: AIO v2.5 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: AIO v2.5
Date
Msg-id o2dtdrlc2xqpjycsfuvv7u6zyrqj7jfxqu32et3lyelkwce2jy@3k3fwdiiu73v
Whole thread Raw
In response to Re: AIO v2.5  (Andres Freund <andres@anarazel.de>)
Responses Re: AIO v2.5
List pgsql-hackers
Hi,

Attached v2.13, with the following changes:
Changes:

- Pushed a fair number of commits

  A lot of thanks goes to Noah's detailed reviews!


- As Noah pointed out, the zero_damaged_pages warning could be emitted in an
  io worker or another backend, but omitted in the backend that started the IO

  To address that:

  1) I added a new commit "aio: Add WARNING result status"
     (itself trivial)

  2) I changed buffer_readv_complete() to encode the warning/error in a more
     detailed way than before (was_zeroed, first_invalid_off, count_invalid)

     As part of that I put the encoding/decoding into a static inline

  3) Tracking the number of invalid buffers was awkward with
     buffer_readv_complete_one() returning a PgAioResult. Now it just
     reports whether it found an invalid page with an out argument.

  4) As discussed there now is a different error messages for the case of
     multiple invalid pages

     The code is a bit awkward to avoid code duplication, curious whether
     that's seen as acceptable?  I could just duplicate the entire ereport()
     instead.

  5) The WARNING in the callback is now a LOG, as it will be sent to the
     client as a WARNING explicitly when the IO's results are processed

     I actually chose LOG_SERVER_ONLY - that seemed slightly better than just
     LOG? But not at all sure.

     There's a comment explaining this now too.


  Noah, I think this set of changes would benefit from another round of
  review. I left these changes in "squash-later: " commits, to make it easier
  to see / think about.


- Added a comment about the pgaio_result_report() in md_readv_complete(). I
  changed it to LOG_SERVER_ONLY as well , but I'm not at all sure about that.


- Previously the buffer completion callback checked zero_damaged_pages - but
  that's not right, the GUC hopefully is only set on a per-session basis

  I solved that by having AsyncReadBuffers() add ZERO_ON_ERROR to the flags if
  zero_damaged_pages is configured.

  Also added a comment explaining that we probably should eventually use a
  separate flag, so we can adjust the errcode etc differently.


- Explicit test for zero_damaged_pages and ZERO_ON_ERROR

  As part of that I made read_rel_block_ll() support reading multiple
  blocks. That makes it a lot easier to verify that we handle cases like a
  4-block read where 2,3 are invalid correctly.


- I removed the code that "localbuf: Track pincount in BufferDesc as well"
  added to ConditionalLockBufferForCleanup() and IsBufferCleanupOK() as discussed

  Right now the situations that the code was worried don't exist yet, as we
  only support reads.

  I added a comment about not needing to worry about that yet to "bufmgr:
  Implement AIO read support". And then changed that comment to a FIXME in the
  write patches.


- Squashed Thomas' change to make io_concurrency=0 really not use AIO


- Lots of other review comments by Noah addressed


- Merged typo fixes by Thom Brown



TODO:


- There are more tests in test_aio that should be expanded to run for temp
  tables as well, not just normal tables


- Add an explicit test for the checksum verification in the completion callback

  There is an existing test for testing an invalid page due to page header
  verification in test_aio, but not for checksum failures.

  I think it's indirectly covered (e.g. in amcheck), but seems better to test
  it explicitly.


- Add error context callbacks for io worker and "foreign" IO completion

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: Partition pruning on parameters grouped into an array does not prune properly
Next
From: Michael Paquier
Date:
Subject: Re: A small correction to doc and comment of FSM for indexes