Re: BUG #14999: pg_rewind corrupts control file global/pg_control - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #14999: pg_rewind corrupts control file global/pg_control
Date
Msg-id 15971.1523116292@sss.pgh.pa.us
Whole thread Raw
In response to Re: BUG #14999: pg_rewind corrupts control file global/pg_control  (Michael Paquier <michael@paquier.xyz>)
Responses Re: BUG #14999: pg_rewind corrupts control file global/pg_control
List pgsql-bugs
Thinking about this some more, I wondered if useful behavior with
read-only config files would be to check whether they have the same
contents on both servers, and give a warning or error if not.

Now, that would only be useful if you suppose that they actually should be
the same on both.  For the particular case of server private keys, maybe
they typically would be different.  So I'm not real sure about this idea,
but wanted to toss it out for discussion.

In any case, after another look at the code, it seems to me that it'd be
pretty easy to get pg_rewind to notice at the start whether a target file
it plans to overwrite is read-only: process_source_file is already doing
an lstat per file, so I think a permissions check there wouldn't add much
overhead.  So at that point we could either bail out without damage, or
decide to skip the file.  We could still lose if permissions change midway
through the run, but that's the sort of situation I think is OK to fail
in.

Also, I'm having second thoughts about the usefulness of adding an ORDER
BY to the fetch query just on (untested) performance grounds.  It looks
to me like the chunks within a file already will be inserted into the
fetchchunks table in order, and I think within-file order is all that's
worth worrying about.  In principle the remote server might read out the
rows in some other order than we stored them, but since fetchchunks is a
temp table and not subject to synchronize_seqscans, I don't think we
really need to worry about that.

Furthermore, the patch as given has got a serious performance gotcha:

     sql =
         "SELECT path, begin,\n"
         "  pg_read_binary_file(path, begin, len, true) AS chunk\n"
-        "FROM fetchchunks\n";
+        "FROM fetchchunks\n"
+        "ORDER BY path, begin\n";

While 9.6 and later will do this in a sane way, older versions will
execute pg_read_binary_file() ahead of the sort step, like so:

# explain verbose SELECT path, begin, pg_read_binary_file(path, begin, len, true) AS chunk FROM fetchchunks ORDER BY
path,begin; 
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Sort  (cost=74639.34..75889.41 rows=500025 width=44)
   Output: path, begin, (pg_read_binary_file(path, begin, (len)::bigint, true))
   Sort Key: fetchchunks.path, fetchchunks.begin
   ->  Seq Scan on pg_temp_2.fetchchunks  (cost=0.00..11925.38 rows=500025 width=44)
         Output: path, begin, pg_read_binary_file(path, begin, (len)::bigint, true)
(5 rows)

with the effect that we're passing the entire contents of the source data
directory (or at least, as much of it as pg_rewind needs to read) through
the sort.  Yipes.  So it'd be safer to do it like this:

# explain verbose SELECT path, begin, pg_read_binary_file(path, begin, len, true) AS chunk FROM (select * from
fetchchunksORDER BY path, begin) ss; 
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Subquery Scan on ss  (cost=72139.22..80889.66 rows=500025 width=44)
   Output: ss.path, ss.begin, pg_read_binary_file(ss.path, ss.begin, (ss.len)::bigint, true)
   ->  Sort  (cost=72139.22..73389.28 rows=500025 width=44)
         Output: fetchchunks.path, fetchchunks.begin, fetchchunks.len
         Sort Key: fetchchunks.path, fetchchunks.begin
         ->  Seq Scan on pg_temp_2.fetchchunks  (cost=0.00..9425.25 rows=500025 width=44)
               Output: fetchchunks.path, fetchchunks.begin, fetchchunks.len
(7 rows)

But the real bottom line here is I don't feel a need to touch the fetch
query at all without some actual performance measurements proving there's
a benefit.

            regards, tom lane


pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #15145: date time default value issues
Next
From: Michael Paquier
Date:
Subject: Re: BUG #14999: pg_rewind corrupts control file global/pg_control