Thread: sloppy back-patching of column-privilege leak

sloppy back-patching of column-privilege leak

From
Robert Haas
Date:
Branch: master [804b6b6db] 2015-01-28 12:31:30 -0500
Branch: REL9_4_STABLE Release: REL9_4_1 [3cc74a3d6] 2015-01-28 12:32:06 -0500
Branch: REL9_3_STABLE Release: REL9_3_6 [4b9874216] 2015-01-28 12:32:39 -0500
Branch: REL9_2_STABLE Release: REL9_2_10 [d49f84b08] 2015-01-28 12:32:56 -0500
Branch: REL9_1_STABLE Release: REL9_1_15 [9406884af] 2015-01-28 12:33:15 -0500
Branch: REL9_0_STABLE Release: REL9_0_19 [3a2063369] 2015-01-28 12:33:29 -0500

In 9.2 and newer branches, this commit makes substantial changes to
execMain.c.  In 9.1 and 9.0, though, the change to that file is just
this:

--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -95,6 +95,15 @@ static void intorel_receive(TupleTableSlot *slot,
DestReceiver *self);static void intorel_shutdown(DestReceiver *self);static void intorel_destroy(DestReceiver *self);

+/*
+ * Note that this macro also exists in commands/trigger.c.  There does not
+ * appear to be any good header to put it into, given the structures that
+ * it uses, so we let them be duplicated.  Be sure to update both if one needs
+ * to be changed, however.
+ */
+#define GetModifiedColumns(relinfo, estate) \
+       (rt_fetch((relinfo)->ri_RangeTableIndex,
(estate)->es_range_table)->modifiedCols)
+/* end of local decls */

We shouldn't be adding a macro to that file that isn't used anywhere.
Either that file needs more substantial changes, or it doesn't need
this, either.  What gives?  Besides the sloppiness of back-patching in
that one macro and nothing else, this is a huge fraction of the patch
that's just *gone* in the 9.0 and 9.1 branches, and there's not so
much as a hint about it in the commit message, which is pretty
astonishing to me.

What's even more troubling is the new regression tests.  In master,
this commit added this:

+INSERT INTO t1 (c1, c2) VALUES (1, 1); -- fail, but row not shown
+ERROR:  duplicate key value violates unique constraint "t1_pkey"
+UPDATE t1 SET c2 = 1; -- fail, but row not shown
+ERROR:  duplicate key value violates unique constraint "t1_pkey"
+INSERT INTO t1 (c1, c2) VALUES (null, null); -- fail, but see columns
being inserted
+ERROR:  null value in column "c1" violates not-null constraint
+DETAIL:  Failing row contains (c1, c2) = (null, null).
+INSERT INTO t1 (c3) VALUES (null); -- fail, but see columns being
inserted or have SELECT
+ERROR:  null value in column "c1" violates not-null constraint
+DETAIL:  Failing row contains (c1, c3) = (null, null).
+INSERT INTO t1 (c1) VALUES (5); -- fail, but see columns being
inserted or have SELECT
+ERROR:  null value in column "c2" violates not-null constraint
+DETAIL:  Failing row contains (c1) = (5).
+UPDATE t1 SET c3 = 10; -- fail, but see columns with SELECT rights,
or being modified
+ERROR:  new row for relation "t1" violates check constraint "t1_c3_check"
+DETAIL:  Failing row contains (c1, c3) = (1, 10).

Notice how the comments match the actual behavior.  But in 9.0, you've
got the SAME COMMENTS with DIFFERENT BEHAVIOR:

+INSERT INTO t1 (c1, c2) VALUES (1, 1); -- fail, but row not shown
+ERROR:  duplicate key value violates unique constraint "t1_pkey"
+UPDATE t1 SET c2 = 1; -- fail, but row not shown
+ERROR:  duplicate key value violates unique constraint "t1_pkey"
+INSERT INTO t1 (c1, c2) VALUES (null, null); -- fail, but see columns
being inserted
+ERROR:  null value in column "c1" violates not-null constraint
+INSERT INTO t1 (c3) VALUES (null); -- fail, but see columns being
inserted or have SELECT
+ERROR:  null value in column "c1" violates not-null constraint
+INSERT INTO t1 (c1) VALUES (5); -- fail, but see columns being
inserted or have SELECT
+ERROR:  null value in column "c2" violates not-null constraint
+UPDATE t1 SET c3 = 10; -- fail, but see columns with SELECT rights,
or being modified
+ERROR:  new row for relation "t1" violates check constraint "t1_c3_check"

The comments say that you'll see the relevant columns, but you don't.
Since when is it OK to check things into our regression tests where
the comments say one thing and the behavior is something else?

I don't even know what to say about this.  I cannot understand how you
can back-patch something like this and fail to notice that the
regression tests give different output on different branches, and that
that output is inconsistent with the comments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: sloppy back-patching of column-privilege leak

From
Stephen Frost
Date:
Robert,

* Robert Haas (robertmhaas@gmail.com) wrote:
> In 9.2 and newer branches, this commit makes substantial changes to
> execMain.c.  In 9.1 and 9.0, though, the change to that file is just
> this:
>
> --- a/src/backend/executor/execMain.c
> +++ b/src/backend/executor/execMain.c
> @@ -95,6 +95,15 @@ static void intorel_receive(TupleTableSlot *slot,
> DestReceiver *self);
>  static void intorel_shutdown(DestReceiver *self);
>  static void intorel_destroy(DestReceiver *self);
>
> +/*
> + * Note that this macro also exists in commands/trigger.c.  There does not
> + * appear to be any good header to put it into, given the structures that
> + * it uses, so we let them be duplicated.  Be sure to update both if one needs
> + * to be changed, however.
> + */
> +#define GetModifiedColumns(relinfo, estate) \
> +       (rt_fetch((relinfo)->ri_RangeTableIndex,
> (estate)->es_range_table)->modifiedCols)
> +
>  /* end of local decls */
>
> We shouldn't be adding a macro to that file that isn't used anywhere.

Ah, yes, agreed, that didn't need to be added in the older branches.

> Either that file needs more substantial changes, or it doesn't need
> this, either.  What gives?

It doesn't need it.  The older branches don't produce as detailed errors
and because of that there weren't any further changes to be made.
Including the extraneous #define in those older branches was certainly a
mistake and I'll correct it.

> Besides the sloppiness of back-patching in
> that one macro and nothing else, this is a huge fraction of the patch
> that's just *gone* in the 9.0 and 9.1 branches, and there's not so
> much as a hint about it in the commit message, which is pretty
> astonishing to me.

Right, 9.0 and 9.1 don't have as detailed messages and so there wasn't
as much to do in those older branches.  I was well aware of that and I
could have sworn that I included something in the commit messages..
Looks like I did, but not in a way which was as clear as it should have
been.  Specifically, in those older branches, the commit message only
talks about the functions which exist in those branches-
BuildIndexValueDescription and ri_ReportViolation.  The commit messages
for 9.2 and beyond also reference ExecBuildSlotValueDescription, which
is what you're getting at.

In hindsight, I agree that doing just that wasn't sufficient and
thinking on it now I realize that, while having different commit
messages for each branch may make sense if you're only looking at that
branch, it ends up being confusing for folks following the overall
project as they likely look at just the subject of each commit and
expect the contents for every branch to be the same.  To that point,
I'll try to be clearer when there's a difference in commit message for
each branch, or simply include everything for every branch in an
identical commit message across all of them.

> Notice how the comments match the actual behavior.  But in 9.0, you've
> got the SAME COMMENTS with DIFFERENT BEHAVIOR:

Good point.  The comments clearly were for the master-based behavior but
the older branches don't have the same detail and so they should have
been updated.  I knew that they were different, and understood and
expected those differences, made the specific changes which were
appropriate for each branch, but missed updating the comments.

> The comments say that you'll see the relevant columns, but you don't.
> Since when is it OK to check things into our regression tests where
> the comments say one thing and the behavior is something else?

I'm a bit confused by this- I certainly didn't intend this mistake to be
implying that our policy on comments and the code they're associated
with should be changed to allow that they don't match up, nor do I
advocate such a position now.

> I don't even know what to say about this.  I cannot understand how you
> can back-patch something like this and fail to notice that the
> regression tests give different output on different branches, and that
> that output is inconsistent with the comments.

I was quite aware that the regression tests were different on different
branches, as I knew (which the commit messages show) that only a subset
of the patch was relevant for the older branches since they didn't
include the detail where the security issue originated from.  I
certainly reviewed the regression tests but obviously missed that the
comments ended up out-dated due to the back-branch capabilities.

Thanks for the review and comments.  I'll remove the extraneous #define,
the comment change which was made to the other #define (as it's not
relevant since the #define is only located in one place) and fix the
regression test comments to match the behavior in the older branches.
Thanks again!
    Stephen

Re: sloppy back-patching of column-privilege leak

From
Alvaro Herrera
Date:
Stephen Frost wrote:

> > Besides the sloppiness of back-patching in
> > that one macro and nothing else, this is a huge fraction of the patch
> > that's just *gone* in the 9.0 and 9.1 branches, and there's not so
> > much as a hint about it in the commit message, which is pretty
> > astonishing to me.
> 
> Right, 9.0 and 9.1 don't have as detailed messages and so there wasn't
> as much to do in those older branches.  I was well aware of that and I
> could have sworn that I included something in the commit messages..
> Looks like I did, but not in a way which was as clear as it should have
> been.  Specifically, in those older branches, the commit message only
> talks about the functions which exist in those branches-
> BuildIndexValueDescription and ri_ReportViolation.  The commit messages
> for 9.2 and beyond also reference ExecBuildSlotValueDescription, which
> is what you're getting at.
> 
> In hindsight, I agree that doing just that wasn't sufficient and
> thinking on it now I realize that, while having different commit
> messages for each branch may make sense if you're only looking at that
> branch, it ends up being confusing for folks following the overall
> project as they likely look at just the subject of each commit and
> expect the contents for every branch to be the same.  To that point,
> I'll try to be clearer when there's a difference in commit message for
> each branch, or simply include everything for every branch in an
> identical commit message across all of them.

FWIW using different commit messages for different branches is a
mistake.  The implicit policy we have is that there is one message,
identical for all branches, that describe how the patch differs in each
branch whenever necessary.  Note that the git_changelog output that
Robert pasted is not verbatim from the tool; what it actually prints is
three separate entries, one for each different message you used, which
is not what is supposed to occur.

I say this policy is implicit because I don't recall it being spelled
out anywhere, but since it's embodied in git_changelog's working
principle it's important we stick to it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: sloppy back-patching of column-privilege leak

From
Stephen Frost
Date:
Alvaro,

* Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
> FWIW using different commit messages for different branches is a
> mistake.  The implicit policy we have is that there is one message,
> identical for all branches, that describe how the patch differs in each
> branch whenever necessary.  Note that the git_changelog output that
> Robert pasted is not verbatim from the tool; what it actually prints is
> three separate entries, one for each different message you used, which
> is not what is supposed to occur.

Ok, thanks.  That's certainly easy enough to do and I'll do so in the
future.  I could have sworn I'd seen cases where further clarification
was done for branch-specific commits but perhaps something else was
different there.

> I say this policy is implicit because I don't recall it being spelled
> out anywhere, but since it's embodied in git_changelog's working
> principle it's important we stick to it.

I have to admit that I've never really used git_changelog.
Thanks!
    Stephen

Re: sloppy back-patching of column-privilege leak

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Alvaro Herrera (alvherre@2ndquadrant.com) wrote:
>> FWIW using different commit messages for different branches is a
>> mistake.  The implicit policy we have is that there is one message,
>> identical for all branches, that describe how the patch differs in each
>> branch whenever necessary.  Note that the git_changelog output that
>> Robert pasted is not verbatim from the tool; what it actually prints is
>> three separate entries, one for each different message you used, which
>> is not what is supposed to occur.

> Ok, thanks.  That's certainly easy enough to do and I'll do so in the
> future.  I could have sworn I'd seen cases where further clarification
> was done for branch-specific commits but perhaps something else was
> different there.

Some people do it differently when the per-branch commits are very much
different, but what Alvaro suggests is certainly handy when it comes time
to make release notes ;-).

>> I say this policy is implicit because I don't recall it being spelled
>> out anywhere, but since it's embodied in git_changelog's working
>> principle it's important we stick to it.

> I have to admit that I've never really used git_changelog.

It's pretty handy.  The output for a couple of recent commits looks like

Author: Noah Misch <noah@leadboat.com>
Branch: master [a7a4adcf8] 2015-02-06 23:14:27 -0500
   Assert(PqCommReadingMsg) in pq_peekbyte().      Interrupting pq_recvbuf() can break protocol sync, so its callers
all  deserve this assertion.  The one pq_peekbyte() caller suffices already.
 

Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Branch: master [ff16b40f8] 2015-02-06 11:26:50 +0200
Branch: REL9_4_STABLE [3bc4c6942] 2015-02-06 11:27:12 +0200
Branch: REL9_3_STABLE [5f0ba4abb] 2015-02-06 11:32:16 +0200
Branch: REL9_2_STABLE [2af568c6b] 2015-02-06 11:32:37 +0200
Branch: REL9_1_STABLE [0d36d9f2b] 2015-02-06 11:32:42 +0200
   Report WAL flush, not insert, position in replication IDENTIFY_SYSTEM      When beginning streaming replication, the
clientusually issues the   IDENTIFY_SYSTEM command, which used to return the current WAL insert   position. That's not
suitablefor the intended purpose of that field,   however. pg_receivexlog uses it to start replication from the
reported  point, but if it hasn't been flushed to disk yet, it will fail. Change   IDENTIFY_SYSTEM to report the flush
positioninstead.      Backpatch to 9.1 and above. 9.0 doesn't report any WAL position.
 

Heikki's five commits got merged into one entry because they had identical
log-message texts and were committed on the same day.  Further back in the
output you find things like

Author: Peter Eisentraut <peter_e@gmx.net>
Branch: master [1332bbb30] 2015-02-01 22:36:44 -0500
Branch: REL9_4_STABLE Release: REL9_4_1 [0ca8cc581] 2015-02-01 22:40:13 -0500
Branch: REL9_3_STABLE Release: REL9_3_6 [6b9b705c9] 2015-02-01 22:40:25 -0500
Branch: REL9_2_STABLE Release: REL9_2_10 [040f5ef50] 2015-02-01 22:40:36 -0500
Branch: REL9_1_STABLE Release: REL9_1_15 [2b0d33599] 2015-02-01 22:40:45 -0500
Branch: REL9_0_STABLE Release: REL9_0_19 [b09ca8834] 2015-02-01 22:40:53 -0500
   doc: Improve claim about location of pg_service.conf      The previous wording claimed that the file was always in
/etc,but of   course this varies with the installation layout.  Write instead that it   can be found via `pg_config
--sysconfdir`. Even though this is still   somewhat incorrect because it doesn't account of moved installations, it
atleast conveys that the location depends on the installation.
 

which show the first actual release incorporating each patch.  So even
if you're not writing release notes, it's mighty handy for identifying
when/whether a given patch has shipped.  I tend to run this once a week
or so and keep the output around in a text file for quick searching.
        regards, tom lane



Re: sloppy back-patching of column-privilege leak

From
Stephen Frost
Date:
* Stephen Frost (sfrost@snowman.net) wrote:
> Thanks for the review and comments.  I'll remove the extraneous #define,
> the comment change which was made to the other #define (as it's not
> relevant since the #define is only located in one place) and fix the
> regression test comments to match the behavior in the older branches.

This has been done, though as noted in the commit, I simply removed the
regression tests as they weren't relevant in those branches.
Thanks!
    Stephen