Re: BUG #14999: pg_rewind corrupts control file global/pg_control - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #14999: pg_rewind corrupts control file global/pg_control |
Date | |
Msg-id | 15971.1523116292@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #14999: pg_rewind corrupts control file global/pg_control (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: BUG #14999: pg_rewind corrupts control file global/pg_control
|
List | pgsql-bugs |
Thinking about this some more, I wondered if useful behavior with read-only config files would be to check whether they have the same contents on both servers, and give a warning or error if not. Now, that would only be useful if you suppose that they actually should be the same on both. For the particular case of server private keys, maybe they typically would be different. So I'm not real sure about this idea, but wanted to toss it out for discussion. In any case, after another look at the code, it seems to me that it'd be pretty easy to get pg_rewind to notice at the start whether a target file it plans to overwrite is read-only: process_source_file is already doing an lstat per file, so I think a permissions check there wouldn't add much overhead. So at that point we could either bail out without damage, or decide to skip the file. We could still lose if permissions change midway through the run, but that's the sort of situation I think is OK to fail in. Also, I'm having second thoughts about the usefulness of adding an ORDER BY to the fetch query just on (untested) performance grounds. It looks to me like the chunks within a file already will be inserted into the fetchchunks table in order, and I think within-file order is all that's worth worrying about. In principle the remote server might read out the rows in some other order than we stored them, but since fetchchunks is a temp table and not subject to synchronize_seqscans, I don't think we really need to worry about that. Furthermore, the patch as given has got a serious performance gotcha: sql = "SELECT path, begin,\n" " pg_read_binary_file(path, begin, len, true) AS chunk\n" - "FROM fetchchunks\n"; + "FROM fetchchunks\n" + "ORDER BY path, begin\n"; While 9.6 and later will do this in a sane way, older versions will execute pg_read_binary_file() ahead of the sort step, like so: # explain verbose SELECT path, begin, pg_read_binary_file(path, begin, len, true) AS chunk FROM fetchchunks ORDER BY path,begin; QUERY PLAN ------------------------------------------------------------------------------------- Sort (cost=74639.34..75889.41 rows=500025 width=44) Output: path, begin, (pg_read_binary_file(path, begin, (len)::bigint, true)) Sort Key: fetchchunks.path, fetchchunks.begin -> Seq Scan on pg_temp_2.fetchchunks (cost=0.00..11925.38 rows=500025 width=44) Output: path, begin, pg_read_binary_file(path, begin, (len)::bigint, true) (5 rows) with the effect that we're passing the entire contents of the source data directory (or at least, as much of it as pg_rewind needs to read) through the sort. Yipes. So it'd be safer to do it like this: # explain verbose SELECT path, begin, pg_read_binary_file(path, begin, len, true) AS chunk FROM (select * from fetchchunksORDER BY path, begin) ss; QUERY PLAN --------------------------------------------------------------------------------------------- Subquery Scan on ss (cost=72139.22..80889.66 rows=500025 width=44) Output: ss.path, ss.begin, pg_read_binary_file(ss.path, ss.begin, (ss.len)::bigint, true) -> Sort (cost=72139.22..73389.28 rows=500025 width=44) Output: fetchchunks.path, fetchchunks.begin, fetchchunks.len Sort Key: fetchchunks.path, fetchchunks.begin -> Seq Scan on pg_temp_2.fetchchunks (cost=0.00..9425.25 rows=500025 width=44) Output: fetchchunks.path, fetchchunks.begin, fetchchunks.len (7 rows) But the real bottom line here is I don't feel a need to touch the fetch query at all without some actual performance measurements proving there's a benefit. regards, tom lane
pgsql-bugs by date: