Thread: bug in pageinspect's "tuple data" feature
If you have a sufficiently broken data page, pageinspect throws an error when trying to examine the page: ERROR: invalid memory alloc request size 18446744073709551451 This is pretty unhelpful; it would be better not to try to print the data instead of dying. With that, at least you can know where the problem is. This was introduced in d6061f83a166 (2015). Proposed patch to fix it (by having the code print a null "data" instead of dying) is attached. -- Álvaro Herrera
Attachment
On 21/11/2020 21:32, Alvaro Herrera wrote: > If you have a sufficiently broken data page, pageinspect throws an > error when trying to examine the page: > > ERROR: invalid memory alloc request size 18446744073709551451 > > This is pretty unhelpful; it would be better not to try to print the > data instead of dying. With that, at least you can know where the > problem is. > > This was introduced in d6061f83a166 (2015). Proposed patch to fix it > (by having the code print a null "data" instead of dying) is attached. Null seems misleading. Maybe something like "invalid", or print a warning? - Heikki
On Mon, Nov 23, 2020 at 09:11:26AM +0200, Heikki Linnakangas wrote: > On 21/11/2020 21:32, Alvaro Herrera wrote: >> This is pretty unhelpful; it would be better not to try to print the >> data instead of dying. With that, at least you can know where the >> problem is. >> >> This was introduced in d6061f83a166 (2015). Proposed patch to fix it >> (by having the code print a null "data" instead of dying) is attached. > > Null seems misleading. Maybe something like "invalid", or print a warning? How did you get into this state to begin with? get_raw_page() uses ReadBufferExtended() which gives some level of protection already, so shouldn't it be better to return an ERROR with ERRCODE_DATA_CORRUPTED and the block involved? -- Michael
Attachment
On 2020-Nov-24, Michael Paquier wrote: > On Mon, Nov 23, 2020 at 09:11:26AM +0200, Heikki Linnakangas wrote: > > On 21/11/2020 21:32, Alvaro Herrera wrote: > >> This is pretty unhelpful; it would be better not to try to print the > >> data instead of dying. With that, at least you can know where the > >> problem is. > >> > >> This was introduced in d6061f83a166 (2015). Proposed patch to fix it > >> (by having the code print a null "data" instead of dying) is attached. > > > > Null seems misleading. Maybe something like "invalid", or print a warning? Good idea, thanks. > How did you get into this state to begin with? The data was corrupted for whatever reason. I don't care why or how, I just need to fix it. If the data isn't corrupted, then I don't use pageinspect in the first place. > get_raw_page() uses ReadBufferExtended() which gives some level of > protection already, so shouldn't it be better to return an ERROR with > ERRCODE_DATA_CORRUPTED and the block involved? What would I gain from doing that? It's even more unhelpful, because it is intentional rather than accidental.
On Sat, Nov 21, 2020 at 2:32 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > If you have a sufficiently broken data page, pageinspect throws an > error when trying to examine the page: > > ERROR: invalid memory alloc request size 18446744073709551451 > > This is pretty unhelpful; it would be better not to try to print the > data instead of dying. With that, at least you can know where the > problem is. > > This was introduced in d6061f83a166 (2015). Proposed patch to fix it > (by having the code print a null "data" instead of dying) is attached. So I agree with the problem statement and I don't have any particular objection to the patch as proposed, but I think it's just the tip of the iceberg. These functions are tremendously careless about validating the input data and will crash if you breath on them too hard; we've just band-aided around that problem by making them super-user only (e.g. see 3e1338475ffc2eac25de60a9de9ce689b763aced). While that does address the security concern since superusers can do tons of bad things anyway, it's not much help if you want to actually be able to run them on damaged pages without having the world end, and it's no help at all if you'd like to let them be run by a non-superuser, since a constructed page can easily blow up the world. The patch as proposed fixes one of many problems in this area and may well be useful enough to commit without doing anything else, but I'd actually really like to see us do the same sort of hardening here that is present in the recently-committed amcheck-for-heap support, which is robust against a wide variety of things of this sort rather than just this one particularly. Again, this is not to say that you should be on the hook for that; it's a general statement. -- Robert Haas EDB: http://www.enterprisedb.com