A modest proposal: make parser/rewriter/planner inputs read-only - Mailing list pgsql-hackers

From Tom Lane
Subject A modest proposal: make parser/rewriter/planner inputs read-only
Date
Msg-id 2531459.1743871597@sss.pgh.pa.us
Whole thread Raw
Responses Re: A modest proposal: make parser/rewriter/planner inputs read-only
List pgsql-hackers
The parser, rewriter, and planner all have a bad habit of scribbling
on their input parsetrees.  At this point I've lost count of how many
bugs that's caused (but e33f2335a and 0f43083d1 are the two latest
examples), and of how many places there are that are brute-forcing a
solution to the problem by copying a whole parsetree before letting
one of these subsystems see it.

Years ago we fixed the executor to treat its input Plan trees as
read-only.  It seems like we really ought to do the same for these
upstream subsystems.  Surely, whatever benefit we get from changing
Node contents in-place is lost due to all these other hacks.

While I've had this thought in the back of my head for a long while,
I haven't pursued it because I couldn't think of a practical way
to find all the places that are violating such a policy, much less
prevent future violations.  However, today I thought of this sketch 
of a solution:

1. Invent a way to make a particular memory context read-only
after putting some data into it.

2. In debug builds, after we've built a tree that should be considered
read-only, copy it into such a context and make it read-only.  Or
perhaps build it there in the first place.

3. Fix the resulting crashes.

4. Profit!  (In particular, nuke a lot of no-longer-needed copyObject
calls.)

My first thought about implementing #1 was to seek Valgrind's help,
but so far as I can find out there's no VALGRIND_MAKE_MEM_READ_ONLY.
Step #3 would be pretty tedious anyway if it required running under
Valgrind.  However, all modern hardware has the ability to mark
memory read-only at the page level, and most platforms expose that
in some way or other.  So it doesn't seem unreasonable to invent
a memory context option (or whole new context type, if that seems
easier) that is careful to align its allocation blocks on page
boundaries and then can set or clear the hardware R/O flag on
demand.  It'd be enough if the R/O enforcement worked on popular
development platforms, we don't have to make it work absolutely
everywhere.

I'm not planning to pursue this idea Right Now, but it seems like
something that could happen for v19 or so.  In the meantime I wanted
to get the ideas down on electrons.

Thoughts?

            regards, tom lane



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Back-patch of: avoid multiple hard links to same WAL file after a crash
Next
From: Mahendra Singh Thalor
Date:
Subject: Re: getting "shell command argument contains a newline or carriage return:" error with pg_dumpall when db name have new line in double quote