Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL - Mailing list pgsql-hackers
From | David Christensen |
---|---|
Subject | Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL |
Date | |
Msg-id | CAOxo6X+wXEu1QfTFTf-WiDs3jr8ERvgEOddHfUhAH_5eje-G2w@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
|
List | pgsql-hackers |
On Fri, Nov 11, 2022 at 4:57 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Well if it's not the same output then I guess you're right and there's > > not a use for the `--fixup` mode. By the same token, I'd say > > calculating/setting the checksum also wouldn't need to be done, we > > should just include the page as included in the WAL stream. > > Let's hear from others, we may be missing something here. I recommend > keeping the --fixup patch as 0002, in case if we decide to discard > it's easier, however I'll leave that to you. I've whacked in `git` for now; I can resurrect if people consider it useful. > > > > > 5. > > > > > + if (!RestoreBlockImage(record, block_id, page)) > > > > > + continue; > > > > > + > > > > > + /* we have our extracted FPI, let's save it now */ > > > > > After extracting the page from the WAL record, do we need to perform a > > > > > checksum on it? > > > > > > I think you just need to do the following, this will ensure the > > > authenticity of the page that pg_waldump returns. > > > if ((PageHeader) page)->pd_checksum != pg_checksum_page((char *) page, blk)) > > > { > > > pg_fatal("page checksum failed"); > > > } > > > > The WAL already has a checksum, so not certain this makes sense on its > > own. Also I'm inclined to make it a warning if it doesn't match rather > > than a fatal. (I'd also have to verify that the checksum is properly > > set on the page prior to copying the FPI into WAL, which I'm pretty > > sure it is but not certain.) > > How about having it as an Assert()? Based on empirical testing, the checksums don't match, so asserting/alerting on each block extracted seems next to useless, so going to just remove that. > > > 5. > > > + fsync(fileno(OPF)); > > > + fclose(OPF); > > > I think just the fsync() isn't enough, you still need fsync_fname() > > > and/or fsync_parent_path(), perhaps after for (block_id = 0; block_id > > > <= XLogRecMaxBlockId(record); block_id++) loop. > > > > I'm not sure I get the value of the fsyncs here; if you are using this > > tool at this capacity you're by definition doing some sort of > > transient investigative steps. Since the WAL was fsync'd, you could > > always rerun/recreate as needed in the unlikely event of an OS crash > > in the middle of this investigation. Since this is outside the > > purview of the database operations proper (unlike, say, initdb) seems > > like it's unnecessary (or definitely shouldn't need to be selectable). > > My thoughts are that if we're going to fsync, just do the fsyncs > > unconditionally rather than complicate the interface further. > > -1 for fysnc() per file created as it can create a lot of sync load on > production servers impacting performance. How about just syncing the > directory at the end assuming it doesn't cost as much as fsync() per > FPI file created would? I can fsync the dir if that's a useful compromise. > > > 4. > > > +$primary->append_conf('postgresql.conf', "wal_level='replica'"); > > > +$primary->append_conf('postgresql.conf', 'archive_mode=on'); > > > +$primary->append_conf('postgresql.conf', "archive_command='/bin/false'"); > > > Why do you need to set wal_level to replica, out of the box your > > > cluster comes with replica only no? > > > And why do you need archive_mode on and set the command to do nothing? > > > Why archiving is needed for testing your feature firstly? > > > > I think it had shown "minimal" in my testing; I was purposefully > > failing archives so the WAL would stick around. Maybe a custom > > archive command that just copied a single WAL file into a known > > location so I could use that instead of the current approach would > > work, though not sure how Windows support would work with that. Open > > to other ideas to more cleanly get a single WAL file that isn't the > > last one. (Earlier versions of this test were using /all/ of the > > generated WAL files rather than a single one, so maybe I am > > overcomplicating things for a single WAL file case.) > > Typically we create a physical replication slot at the beginning so > that the server keeps the WAL required for you in pg_wal itself, for > instance, please see pg_walinspect: > > -- Make sure checkpoints don't interfere with the test. > SELECT 'init' FROM > pg_create_physical_replication_slot('regress_pg_walinspect_slot', > true, false); Will see if I can get something like this to work; I'm currently stopping the server before running the file-based tests, but I suppose there's no reason to do so, so a temporary slot that holds it around until the test is complete is probably fine. David
pgsql-hackers by date: