Home > mailing lists

concerns around pg_lsn - Mailing list pgsql-hackers

From	Jeevan Ladhe
Subject	concerns around pg_lsn
Date	July 29, 2019 17:25:29
Msg-id	CAOgcT0NOM9oR0Hag_3VpyW0uF3iCU=BDUFSPfk9JrWXRcWQHqw@mail.gmail.com Whole thread Raw
Responses	Re: concerns around pg_lsn
List	pgsql-hackers

Tree view

Hi all,

While reviewing some code around pg_lsn_in() I came across a couple of

(potential?) issues:

Commit 21f428eb moves lsn conversion functionality from pg_lsn_in() to a new

function pg_lsn_in_internal(). It takes two parameters the lsn string and a

pointer to a boolean (*have_error) to indicate if there was an error while

converting string format to XLogRecPtr.

pg_lsn_in_internal() only sets the *have_error to 'true' if there is an error,

but leaves it for the caller to make sure it was passed by initializing as

'false'. Currently it is only getting called from pg_lsn_in() and timestamptz_in()

where it has been taken care that the flag is set to false before making the

call. But I think in general it opens the door for unpredictable bugs if

pg_lsn_in_internal() gets called from other locations in future (if need

maybe) and by mistake, it just checks the return value of the flag without

setting it to false before making a call.

I am attaching a patch that makes sure that *have_error is set to false in

pg_lsn_in_internal() itself, rather than being caller dependent.

Also, I think there might be callers who may not care if there had been an error

while converting and just ok to use InvalidXLogRecPtr against return value, and

may pass just a NULL boolean pointer to this function, but for now, I have left

that untouched. Maybe just adding an Assert would improve the situation for

time being.

I have attached a patch (fix_have_error_flag.patch) to take care of above.

I happened to peep in test case pg_lsn.sql, and I started exploring the macros

around lsn.

Following macros:

{code}

* Zero is used indicate an invalid pointer. Bootstrap skips the first possible

* WAL segment, initializing the first WAL page at WAL segment size, so no XLOG

* record can begin at zero.

#define InvalidXLogRecPtr 0

#define XLogRecPtrIsInvalid(r) ((r) == InvalidXLogRecPtr)

{code}

IIUC, in the comment above we clearly want to mark 0 as an invalid lsn (also

further IIUC the comment states - lsn would start from (walSegSize + 1)). Given

this, should not it be invalid to allow "0/0" as the value of type pg_lsn, or

for that matter any number < walSegSize?

There is a test scenario in test case pg_lsn.sql which tests insertion of "0/0"

in a table having a pg_lsn column. I think this is contradictory to the comment.

I am not sure of thought behind this and might be wrong while making the above

assumption. But, I tried to look around a bit in hackers emails and could not

locate any related discussion.

I have attached a patch (mark_lsn_0_invalid.patch) that makes above changes.

Thoughts?

Regards,

Jeevan Ladhe

Attachment

pgsql-hackers by date:

From: Tom Lane
Date: 29 July 2019, 16:58:17
Subject: Re: SimpleLruTruncate() mutual exclusion

From: Peter Geoghegan
Date: 29 July 2019, 18:24:02
Subject: Re: should there be a hard-limit on the number of transactionspending undo?

concerns around pg_lsn - Mailing list pgsql-hackers

Attachment

Previous

Next