Thread: logical changeset generation v5
Hi!

I am rather pleased to announce the next version of the changeset extraction patchset. Thanks to help from a large number of people I think we are slowly getting to the point where it is committable.

Since the last submitted version (20121115002746.GA7692@awork2.anarazel.de) a large number of fixes and the results of a good amount of review have been added to the tree. All bugs known to me have been fixed.

Fixes include:
* synchronous replication support
* don't peg the xmin for user tables, do it only for catalog ones
* arbitrarily large transaction support by spilling large transactions to disk
* spill snapshots to disk, so we can restart without waiting for a new snapshot to be built
* don't read all WAL from the establishment of a logical slot
* tests via SQL interface to changeset extraction

The todo list includes:
* morph the "logical slot" interface into being "replication slots" that can also be used by streaming replication
* move some more code from snapbuild.c to decode.c to remove a largely duplicated switch
* do some more header/comment cleanup & clarification
* move pg_receivellog into its own directory in src/bin or contrib/
* user/developer level documentation

The patch series currently has two interfaces to logical decoding. One, which is primarily useful for pg_regress-style tests and playing around, is SQL based; the other uses a walsender replication connection.

A quick demonstration of the SQL interface (the server needs to be started with wal_level = logical and max_logical_slots > 0):

=# CREATE EXTENSION test_logical_decoding;
=# SELECT * FROM init_logical_replication('regression_slot', 'test_decoding');
    slotname     | xlog_position
-----------------+---------------
 regression_slot | 0/17D5908
(1 row)

=# CREATE TABLE foo(id serial primary key, data text);
=# INSERT INTO foo(data) VALUES(1);
=# UPDATE foo SET id = -id, data = ':'||data;
=# DELETE FROM foo;
=# DROP TABLE foo;
=# SELECT * FROM start_logical_replication('regression_slot', 'now', 'hide-xids', '0');
 location  | xid | data
-----------+-----+--------------------------------------------------------------------------------
 0/17D59B8 | 695 | BEGIN
 0/17D59B8 | 695 | COMMIT
 0/17E8B58 | 696 | BEGIN
 0/17E8B58 | 696 | table "foo": INSERT: id[int4]:1 data[text]:1
 0/17E8B58 | 696 | COMMIT
 0/17E8CA8 | 697 | BEGIN
 0/17E8CA8 | 697 | table "foo": UPDATE: old-pkey: id[int4]:1 new-tuple: id[int4]:-1 data[text]::1
 0/17E8CA8 | 697 | COMMIT
 0/17E8E50 | 698 | BEGIN
 0/17E8E50 | 698 | table "foo": DELETE: id[int4]:-1
 0/17E8E50 | 698 | COMMIT
 0/17E9058 | 699 | BEGIN
 0/17E9058 | 699 | COMMIT
(13 rows)

=# SELECT * FROM pg_stat_logical_decoding;
    slot_name    |    plugin     | database | active | xmin | restart_decoding_lsn
-----------------+---------------+----------+--------+------+----------------------
 regression_slot | test_decoding | 12042    | f      | 695  | 0/17D58D0
(1 row)

=# SELECT * FROM stop_logical_replication('regression_slot');
 stop_logical_replication
--------------------------
                        0

The walsender interface has the same calls:

INIT_LOGICAL_REPLICATION 'slot' 'plugin';
START_LOGICAL_REPLICATION 'slot' restart_lsn [(option value)*];
STOP_LOGICAL_REPLICATION 'slot';

The only difference is that START_LOGICAL_REPLICATION can stream changes, and it can support synchronous replication.

The output seen in the 'data' column is produced by a so-called 'output plugin' which users of the facility can write to suit their needs.
They can be written by implementing 5 functions in the shared object that's passed to init_logical_replication() above:

* pg_decode_init (optional)
* pg_decode_begin_txn
* pg_decode_change
* pg_decode_commit_txn
* pg_decode_cleanup (optional)

The most interesting function, pg_decode_change, gets passed a structure containing old/new versions of the row, the 'struct Relation' belonging to it, and metainformation about the transaction. The output plugin can rely on syscache lookups et al. to decode the changed tuple in whatever fashion it wants.

I'd like to invite reviewers to first look at:
* the output plugin interface
* the walsender/SRF interface
* patch 12 which contains most of the code

When reading the code, the information flow during decoding might be interesting:

+---------------+
|  XLogReader   |
+---------------+
        |
   XLOG Records
        |
        v
+---------------+
|   decode.c    |
+---------------+
     |       |
     |       v
     |  +---------------+
     |  |  snapbuild.c  |
     |  +---------------+
     |       |
HeapTupleData  catalog snapshots
     |       |
     v       v
+---------------+
|reorderbuffer.c|
+---------------+
        |
 HeapTuple & Metadata
        |
        v
+---------------+
| Output Plugin |
+---------------+
        |
 Whatever you want
        |
        v
+---------------+
| Output Handler|
|               |
| WalSnd or SRF |
+---------------+

Overview of the attached patches:
0001: indirect toast tuples; required but submitted independently
0002: functions for testing; not required
0003: (tablespace, filenode) syscache; required
0004: RelationMapFilenodeToOid: required, simple
0005: pg_relation_by_filenode() function; not required but useful
0006: Introduce InvalidCommandId: required, simple
0007: Adjust Satisfies* interface: required, mechanical
0008: Allow walsender to attach to a database: required, needs review
0009: New GetOldestXmin() parameter; required, pretty boring
0010: Log xl_running_xact regularly in the bgwriter: required
0011: make fsync_fname() public; required, needs to be in a different file
0012: Relcache support for a Relation's primary key: required
0013: Actual changeset extraction; required
0014: Output plugin demo; not required (except for testing) but useful
0015: Add pg_receivellog program: not required but useful
0016: Add test_logical_decoding extension; not required, but contains
      the tests for the feature. Uses 0014
0017: Snapshot building docs; not required

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
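To make the plugin interface above concrete, here is a bare-bones skeleton of such an output plugin. The callback signatures are assumptions reconstructed from the description in this mail (the real declarations live in the patchset's headers), so treat every parameter list, and the SketchedChange struct in particular, as illustrative rather than as the patch's actual API:

/* Illustrative output plugin skeleton.  The callback signatures below
 * are assumptions based on the description in the announcement mail
 * (old/new row versions, the Relation, transaction metainformation);
 * they are NOT the declarations from the actual patchset. */
#include "postgres.h"
#include "access/htup.h"
#include "fmgr.h"
#include "utils/rel.h"

PG_MODULE_MAGIC;

/* hypothetical stand-in for whatever struct the patch really passes */
typedef struct SketchedChange
{
	Relation	rel;		/* the table the change belongs to */
	HeapTuple	oldtuple;	/* NULL for INSERT */
	HeapTuple	newtuple;	/* NULL for DELETE */
	TransactionId xid;		/* transaction metainformation */
} SketchedChange;

void
pg_decode_init(void)		/* optional */
{
	/* allocate per-decoding-session state here */
}

void
pg_decode_begin_txn(TransactionId xid)
{
	/* emit "BEGIN" in whatever format the consumer wants */
}

void
pg_decode_change(SketchedChange *change)
{
	/* Syscache lookups et al. are safe here because decoding runs
	 * with a catalog snapshot consistent with this change. */
	elog(LOG, "change on table \"%s\"",
		 RelationGetRelationName(change->rel));
}

void
pg_decode_commit_txn(TransactionId xid)
{
	/* emit "COMMIT" */
}

void
pg_decode_cleanup(void)		/* optional */
{
	/* release state allocated in pg_decode_init() */
}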
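Similarly, a minimal sketch of driving the walsender variant from C with libpq. This too is illustrative only: the replication=database connection keyword and the result shape of INIT_LOGICAL_REPLICATION are assumptions based on patch 0008 and the SQL demo above, and consuming the actual change stream from START_LOGICAL_REPLICATION (COPY BOTH mode) is omitted; that part is what pg_receivellog implements:

/* Sketch: issuing the walsender-variant commands with libpq.
 * Connection keyword, result columns, and error handling are
 * assumptions, not verified against this patchset. */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	/* assumed keyword: a walsender connection attached to a
	 * specific database (patch 0008) */
	PGconn	   *conn = PQconnectdb("dbname=postgres replication=database");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		exit(1);
	}

	/* assumed to return (slotname, xlog_position), mirroring the
	 * SQL-level init_logical_replication() above */
	res = PQexec(conn,
				 "INIT_LOGICAL_REPLICATION 'regression_slot' 'test_decoding'");
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
		fprintf(stderr, "INIT failed: %s", PQerrorMessage(conn));
	else
		printf("slot %s created at %s\n",
			   PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
	PQclear(res);

	/* START_LOGICAL_REPLICATION would switch the connection into
	 * COPY BOTH mode and stream changes; omitted here. */

	PQfinish(conn);
	return 0;
}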
The git tree is at:
git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4

On 2013-06-15 00:48:17 +0200, Andres Freund wrote:
> Overview of the attached patches:
> 0001: indirect toast tuples; required but submitted independently
> 0002: functions for testing; not required
> 0003: (tablespace, filenode) syscache; required
> 0004: RelationMapFilenodeToOid: required, simple
> 0005: pg_relation_by_filenode() function; not required but useful
> 0006: Introduce InvalidCommandId: required, simple
> 0007: Adjust Satisfies* interface: required, mechanical
> 0008: Allow walsender to attach to a database: required, needs review
> 0009: New GetOldestXmin() parameter; required, pretty boring
> 0010: Log xl_running_xact regularly in the bgwriter: required
> 0011: make fsync_fname() public; required, needs to be in a different file
> 0012: Relcache support for a Relation's primary key: required
> 0013: Actual changeset extraction; required
> 0014: Output plugin demo; not required (except for testing) but useful
> 0015: Add pg_receivellog program: not required but useful
> 0016: Add test_logical_decoding extension; not required, but contains
>       the tests for the feature. Uses 0014
> 0017: Snapshot building docs; not required

Version v5-01 attached

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- 0001-Add-support-for-multiple-kinds-of-external-toast-dat.patch
- 0002-wal_decoding-Add-pg_xlog_wait_remote_-apply-receive-.patch
- 0003-wal_decoding-Add-a-new-RELFILENODE-syscache-to-fetch.patch
- 0004-wal_decoding-Add-RelationMapFilenodeToOid-function-t.patch
- 0005-wal_decoding-Add-pg_relation_by_filenode-to-lookup-u.patch
- 0006-wal_decoding-Introduce-InvalidCommandId-and-declare-.patch
- 0007-wal_decoding-Adjust-all-Satisfies-routines-to-take-a.patch
- 0008-wal_decoding-Allow-walsender-s-to-connect-to-a-speci.patch
- 0009-wal_decoding-Add-alreadyLocked-parameter-to-GetOldes.patch
- 0010-wal_decoding-Log-xl_running_xact-s-at-a-higher-frequ.patch
- 0011-wal_decoding-copydir-make-fsync_fname-public.patch
- 0012-wal_decoding-Add-information-about-a-tables-primary-.patch
- 0013-wal_decoding-Introduce-wal-decoding-via-catalog-time.patch
- 0014-wal_decoding-test_decoding-Add-a-simple-decoding-mod.patch
- 0015-wal_decoding-pg_receivellog-Introduce-pg_receivexlog.patch
- 0016-wal_decoding-test_logical_decoding-Add-extension-for.patch
- 0017-wal_decoding-design-document-v2.4-and-snapshot-build.patch
Andres Freund <andres@2ndquadrant.com> wrote:

>> 0007: Adjust Satisfies* interface: required, mechanical

> Version v5-01 attached

I'm still working on a review and hope to post something more substantive by this weekend, but when applying patches in numeric order, this one did not compile cleanly.

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o allpaths.o allpaths.c -MMD -MP -MF .deps/allpaths.Po
vacuumlazy.c: In function ‘heap_page_is_all_visible’:
vacuumlazy.c:1725:3: warning: passing argument 1 of ‘HeapTupleSatisfiesVacuum’ from incompatible pointer type [enabled by default]
In file included from vacuumlazy.c:61:0:
../../../src/include/utils/tqual.h:84:20: note: expected ‘HeapTuple’ but argument is of type ‘HeapTupleHeader’

Could you post a new version of that?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi Kevin!

On 2013-06-20 15:57:07 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>>> 0007: Adjust Satisfies* interface: required, mechanical
>> Version v5-01 attached
>
> I'm still working on a review and hope to post something more
> substantive by this weekend

Cool!

> , but when applying patches in numeric
> order, this one did not compile cleanly.
>
> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o allpaths.o allpaths.c -MMD -MP -MF .deps/allpaths.Po
> vacuumlazy.c: In function ‘heap_page_is_all_visible’:
> vacuumlazy.c:1725:3: warning: passing argument 1 of ‘HeapTupleSatisfiesVacuum’ from incompatible pointer type [enabled by default]
> In file included from vacuumlazy.c:61:0:
> ../../../src/include/utils/tqual.h:84:20: note: expected ‘HeapTuple’ but argument is of type ‘HeapTupleHeader’
>
> Could you post a new version of that?

Hrmpf. There was one hunk in 0013 instead of 0007. I made sure that every commit again applies and compiles cleanly. git rebase -i --exec to the rescue.

Found two other issues:
* recptr not assigned in 0010
* unsafe use of a non-volatile variable across longjmp() in 0013

Pushed and attached.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- 0001-Add-support-for-multiple-kinds-of-external-toast-dat.patch.gz
- 0002-wal_decoding-Add-pg_xlog_wait_remote_-apply-receive-.patch.gz
- 0003-wal_decoding-Add-a-new-RELFILENODE-syscache-to-fetch.patch.gz
- 0004-wal_decoding-Add-RelationMapFilenodeToOid-function-t.patch.gz
- 0005-wal_decoding-Add-pg_relation_by_filenode-to-lookup-u.patch.gz
- 0006-wal_decoding-Introduce-InvalidCommandId-and-declare-.patch.gz
- 0007-wal_decoding-Adjust-all-Satisfies-routines-to-take-a.patch.gz
- 0008-wal_decoding-Allow-walsender-s-to-connect-to-a-speci.patch.gz
- 0009-wal_decoding-Add-alreadyLocked-parameter-to-GetOldes.patch.gz
- 0010-wal_decoding-Log-xl_running_xact-s-at-a-higher-frequ.patch.gz
- 0011-wal_decoding-copydir-make-fsync_fname-public.patch.gz
- 0012-wal_decoding-Add-information-about-a-tables-primary-.patch.gz
- 0013-wal_decoding-Introduce-wal-decoding-via-catalog-time.patch.gz
- 0014-wal_decoding-test_decoding-Add-a-simple-decoding-mod.patch.gz
- 0015-wal_decoding-pg_receivellog-Introduce-pg_receivexlog.patch.gz
- 0016-wal_decoding-test_logical_decoding-Add-extension-for.patch.gz
- 0017-wal_decoding-design-document-v2.4-and-snapshot-build.patch.gz
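The "unsafe use of a non-volatile variable across longjmp()" fix mentioned above is a classic trap in code using PG_TRY(), so a minimal illustration may be useful here. The helper names below are made up; the volatile qualifier is the whole point:

/* Illustration of the longjmp() hazard fixed in 0013: a local variable
 * modified inside PG_TRY() and read in PG_CATCH() must be qualified
 * volatile, otherwise the compiler may keep it in a register whose
 * value is indeterminate after the longjmp() back into PG_CATCH().
 * open_spill_file()/decode_records()/cleanup_spill_file() are made-up
 * helpers used only for the example. */
#include "postgres.h"

static void
decode_with_cleanup(void)
{
	volatile int spill_fd = -1;		/* volatile: survives the longjmp */

	PG_TRY();
	{
		spill_fd = open_spill_file();	/* hypothetical */
		decode_records(spill_fd);		/* hypothetical; may elog(ERROR) */
	}
	PG_CATCH();
	{
		if (spill_fd >= 0)
			cleanup_spill_file(spill_fd);	/* relies on spill_fd's value */
		PG_RE_THROW();
	}
	PG_END_TRY();
}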
Andres Freund <andres@2ndquadrant.com> wrote:

> The git tree is at:
> git://git.postgresql.org/git/users/andresfreund/postgres.git branch
> xlog-decoding-rebasing-cf4
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4
>
> On 2013-06-15 00:48:17 +0200, Andres Freund wrote:
>> Overview of the attached patches:
>> 0001: indirect toast tuples; required but submitted independently
>> 0002: functions for testing; not required
>> 0003: (tablespace, filenode) syscache; required
>> 0004: RelationMapFilenodeToOid: required, simple
>> 0005: pg_relation_by_filenode() function; not required but useful
>> 0006: Introduce InvalidCommandId: required, simple
>> 0007: Adjust Satisfies* interface: required, mechanical
>> 0008: Allow walsender to attach to a database: required, needs review
>> 0009: New GetOldestXmin() parameter; required, pretty boring
>> 0010: Log xl_running_xact regularly in the bgwriter: required
>> 0011: make fsync_fname() public; required, needs to be in a different file
>> 0012: Relcache support for a Relation's primary key: required
>> 0013: Actual changeset extraction; required
>> 0014: Output plugin demo; not required (except for testing) but useful
>> 0015: Add pg_receivellog program: not required but useful
>> 0016: Add test_logical_decoding extension; not required, but contains
>>       the tests for the feature. Uses 0014
>> 0017: Snapshot building docs; not required
>
> Version v5-01 attached

Confirmed that all 17 patch files now apply cleanly, and that `make check-world` builds cleanly after each patch in turn. Reviewing and testing the final result now. If that all looks good, I will submit a separate review of each patch.

Simon, do you want to do the final review and commit after I do each piece?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> wrote:

> Confirmed that all 17 patch files now apply cleanly, and that `make
> check-world` builds cleanly after each patch in turn.

Just to be paranoid, I did one last build with all 17 patch files applied to 7dfd5cd21c0091e467b16b31a10e20bbedd0a836 using this line:

make maintainer-clean ; ./configure --prefix=$PWD/Debug --enable-debug --enable-cassert --enable-depend --with-libxml --with-libxslt --with-openssl --with-perl --with-python && make -j4 world

and it died with this:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../src/interfaces/libpq -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o pg_receivexlog.o pg_receivexlog.c -MMD -MP -MF .deps/pg_receivexlog.Po
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I. -I. -I../../../src/interfaces/libpq -I../../../src/bin/pg_dump -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o mainloop.o mainloop.c -MMD -MP -MF .deps/mainloop.Po
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g pg_receivellog.o receivelog.o streamutil.o -L../../../src/port -lpgport -L../../../src/common -lpgcommon -L../../../src/interfaces/libpq -lpq -L../../../src/port -L../../../src/common -L/usr/lib -Wl,--as-needed -Wl,-rpath,'/home/kgrittn/pg/master/Debug/lib',--enable-new-dtags -lpgport -lpgcommon -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt -ldl -lm -o pg_receivellog
gcc: error: pg_receivellog.o: No such file or directory
make[3]: *** [pg_receivellog] Error 1
make[3]: Leaving directory `/home/kgrittn/pg/master/src/bin/pg_basebackup'
make[2]: *** [all-pg_basebackup-recurse] Error 2
make[2]: *** Waiting for unfinished jobs....

It works with this patch-on-patch:

diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index a41b73c..18d02f3 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -42,6 +42,7 @@ installdirs:
 uninstall:
 	rm -f '$(DESTDIR)$(bindir)/pg_basebackup$(X)'
 	rm -f '$(DESTDIR)$(bindir)/pg_receivexlog$(X)'
+	rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
 
 clean distclean maintainer-clean:
-	rm -f pg_basebackup$(X) pg_receivexlog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o
+	rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_receivellog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o

It appears to be an omission from file 0015.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kevin Grittner <kgrittn@ymail.com> wrote:

> uninstall:
> 	rm -f '$(DESTDIR)$(bindir)/pg_basebackup$(X)'
> 	rm -f '$(DESTDIR)$(bindir)/pg_receivexlog$(X)'
> +	rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'

Oops. That part is not needed.

> clean distclean maintainer-clean:
> -	rm -f pg_basebackup$(X) pg_receivexlog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o
> +	rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_receivellog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o

Just that part.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andres Freund <andres@2ndquadrant.com> wrote:

> Pushed and attached.

The contrib/test_logical_decoding/sql/ddl.sql script is generating unexpected results. For both table_with_pkey and table_with_unique_not_null, updates of the primary key column are showing:

old-pkey: id[int4]:0

... instead of the expected value of 2 or -2. See attached.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:
>
>> Confirmed that all 17 patch files now apply cleanly, and that `make
>> check-world` builds cleanly after each patch in turn.
>
> Just to be paranoid, I did one last build with all 17 patch files
> applied to 7dfd5cd21c0091e467b16b31a10e20bbedd0a836 using this
> line:
>
> make maintainer-clean ; ./configure --prefix=$PWD/Debug --enable-debug --enable-cassert --enable-depend --with-libxml --with-libxslt --with-openssl --with-perl --with-python && make -j4 world
>
> and it died with this:
>
> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I../../../src/interfaces/libpq -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o pg_receivexlog.o pg_receivexlog.c -MMD -MP -MF .deps/pg_receivexlog.Po
> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I. -I. -I../../../src/interfaces/libpq -I../../../src/bin/pg_dump -I../../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 -c -o mainloop.o mainloop.c -MMD -MP -MF .deps/mainloop.Po
> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g pg_receivellog.o receivelog.o streamutil.o -L../../../src/port -lpgport -L../../../src/common -lpgcommon -L../../../src/interfaces/libpq -lpq -L../../../src/port -L../../../src/common -L/usr/lib -Wl,--as-needed -Wl,-rpath,'/home/kgrittn/pg/master/Debug/lib',--enable-new-dtags -lpgport -lpgcommon -lxslt -lxml2 -lssl -lcrypto -lz -lreadline -lcrypt -ldl -lm -o pg_receivellog
> gcc: error: pg_receivellog.o: No such file or directory
> make[3]: *** [pg_receivellog] Error 1
> make[3]: Leaving directory `/home/kgrittn/pg/master/src/bin/pg_basebackup'
> make[2]: *** [all-pg_basebackup-recurse] Error 2
> make[2]: *** Waiting for unfinished jobs....

I have seen that once as well. It's really rather strange since pg_receivellog.o is a clear prerequisite for pg_receivellog. I couldn't reproduce it reliably though, even after doing some dozen rebuilds or so.

> It works with this patch-on-patch:
> diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
> index a41b73c..18d02f3 100644
> --- a/src/bin/pg_basebackup/Makefile
> +++ b/src/bin/pg_basebackup/Makefile
> @@ -42,6 +42,7 @@ installdirs:
>  uninstall:
>  	rm -f '$(DESTDIR)$(bindir)/pg_basebackup$(X)'
>  	rm -f '$(DESTDIR)$(bindir)/pg_receivexlog$(X)'
> +	rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
>
>  clean distclean maintainer-clean:
> -	rm -f pg_basebackup$(X) pg_receivexlog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o
> +	rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_receivellog$(X) $(OBJS) pg_basebackup.o pg_receivexlog.o pg_receivellog.o
>
> It appears to be an omission from file 0015.

Yes, both are missing.

>> + rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
> Oops. That part is not needed.

Hm. Why not?

I don't think either hunk has anything to do with that build failure though - can you reproduce the error without?

Thanks,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-23 10:32:05 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>
>> Pushed and attached.
>
> The contrib/test_logical_decoding/sql/ddl.sql script is generating
> unexpected results. For both table_with_pkey and
> table_with_unique_not_null, updates of the primary key column are
> showing:
>
> old-pkey: id[int4]:0
>
> ... instead of the expected value of 2 or -2.
>
> See attached.

Hm. Any chance this was an incomplete rebuild? I seem to remember having seen that once because some header dependency wasn't recognized correctly after applying some patch.

Otherwise, could you give me:
* the version you applied the patch on
* os/compiler

Because I can't reproduce it, despite some playing around...

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>> gcc: error: pg_receivellog.o: No such file or directory
>> make[3]: *** [pg_receivellog] Error 1

> I have seen that once as well. It's really rather strange since
> pg_receivellog.o is a clear prerequisite for pg_receivellog. I couldn't
> reproduce it reliably though, even after doing some dozen rebuilds or so.

What versions of gmake are you guys using? It wouldn't be the first time we've tripped over bugs in parallel make. See for instance
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1fc698cf14d17a3a8ad018cf9ec100198a339447

			regards, tom lane
On 2013-06-23 16:48:41 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>>> gcc: error: pg_receivellog.o: No such file or directory
>>> make[3]: *** [pg_receivellog] Error 1
>
>> I have seen that once as well. It's really rather strange since
>> pg_receivellog.o is a clear prerequisite for pg_receivellog. I couldn't
>> reproduce it reliably though, even after doing some dozen rebuilds or so.
>
> What versions of gmake are you guys using? It wouldn't be the first
> time we've tripped over bugs in parallel make. See for instance
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1fc698cf14d17a3a8ad018cf9ec100198a339447

3.81 here. That was supposed to be the "safe" one, right? At least with respect to the bugs seen/fixed recently.

Kevin, any chance you still have more log than in the upthread mail available?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>
>> make maintainer-clean ; ./configure --prefix=$PWD/Debug --enable-debug
>> --enable-cassert --enable-depend --with-libxml --with-libxslt --with-openssl
>> --with-perl --with-python && make -j4 world
>
>> [ build failure referencing pg_receivellog.o ]
>
> I have seen that once as well. It's really rather strange since
> pg_receivellog.o is a clear prerequisite for pg_receivellog. I couldn't
> reproduce it reliably though, even after doing some dozen rebuilds or so.
>
>> It works with this patch-on-patch:
>
>> clean distclean maintainer-clean:
>> - rm -f pg_basebackup$(X) pg_receivexlog$(X) $(OBJS) pg_basebackup.o
>> pg_receivexlog.o pg_receivellog.o
>> + rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_receivellog$(X) $(OBJS)
>> pg_basebackup.o pg_receivexlog.o pg_receivellog.o
>
>>> + rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
>> Oops. That part is not needed.
>
> Hm. Why not?

Well, I could easily be wrong on just about anything to do with make files, but on a second look that appeared to be dealing with eliminating an installed pg_receivellog binary, which is not created.

> I don't think either hunk has anything to do with that build failure
> though - can you reproduce the error without?

I tried that scenario three times and it failed three times. Then I made the above changes and it worked. Then I eliminated the one on the uninstall target and tried a couple more times, and it worked on both attempts. The scenario is to have a `make world` build in the source tree, and run the above line starting with `make maintainer-clean` and going to `make -j4 world`.

I did notice that without that change to the maintainer-clean target I did not get a pg_receivellog.Po file in src/bin/pg_basebackup/.deps/ -- and with it I do.

I admit to being at about a 1.5 on a 10 point scale of make file competence -- I just look for patterns used for things similar to what I want to do and copy without much understanding of what it all means. :-( So when I got an error on pg_receivellog which didn't happen on pg_receivexlog, I looked for differences -- my suggestion has no more basis than that and the fact that empirical testing seemed to show that it worked.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-23 16:48:41 -0400, Tom Lane wrote:
>> Andres Freund <andres@2ndquadrant.com> writes:
>>> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>>>> gcc: error: pg_receivellog.o: No such file or directory
>>>> make[3]: *** [pg_receivellog] Error 1
>>
>>> I have seen that once as well. It's really rather strange since
>>> pg_receivellog.o is a clear prerequisite for pg_receivellog. I
>>> couldn't reproduce it reliably though, even after doing some
>>> dozen rebuilds or so.
>>
>> What versions of gmake are you guys using? It wouldn't be the
>> first time we've tripped over bugs in parallel make. See for
>> instance
>> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=1fc698cf14d17a3a8ad018cf9ec100198a339447
>
> 3.81 here. That was supposed to be the "safe" one, right? At
> least with respect to the bugs seen/fixed recently.

There is no executable named gmake in my distro, but...

kgrittn@Kevin-Desktop:~/pg/master$ make --version
GNU Make 3.81

Which is what I'm using.

> Kevin, any chance you still have more log than in the upthread
> mail available?

Well, I just copied from the console, and that was gone; but reverting my change I get the same thing. All console output attached. Let me know if you need something else.

Note that the dependency file disappeared:

kgrittn@Kevin-Desktop:~/pg/master$ ll src/bin/pg_basebackup/.deps/
total 24
drwxrwxr-x 2 kgrittn kgrittn 4096 Jun 24 08:57 ./
drwxrwxr-x 4 kgrittn kgrittn 4096 Jun 24 08:57 ../
-rw-rw-r-- 1 kgrittn kgrittn 1298 Jun 24 08:57 pg_basebackup.Po
-rw-rw-r-- 1 kgrittn kgrittn 1729 Jun 24 08:57 pg_receivexlog.Po
-rw-rw-r-- 1 kgrittn kgrittn 1646 Jun 24 08:57 receivelog.Po
-rw-rw-r-- 1 kgrittn kgrittn  953 Jun 24 08:57 streamutil.Po

It was there from the build with the change I made to the maintainer-clean target, and went away when I built without it.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-23 10:32:05 -0700, Kevin Grittner wrote:
>
>> The contrib/test_logical_decoding/sql/ddl.sql script is generating
>> unexpected results. For both table_with_pkey and
>> table_with_unique_not_null, updates of the primary key column are
>> showing:
>>
>> old-pkey: id[int4]:0
>>
>> ... instead of the expected value of 2 or -2.
>>
>> See attached.
>
> Hm. Any chance this was an incomplete rebuild?

With my hack on the pg_basebackup Makefile, `make -j4 world` is finishing with no errors and:

PostgreSQL, contrib, and documentation successfully made. Ready to install.

> I seem to remember having seen that once because some header
> dependency wasn't recognized correctly after applying some patch.

I wonder whether this is related to the build problems we've been discussing on the other fork of this thread. I was surprised to see this error when I got past the maintainer-clean full build problems, because I thought I had seen clean `make check-world` regression tests after applying each incremental patch file. Until I read this I had been assuming that somehow I missed the error on the 16th and 17th iterations; but now I'm suspecting that I didn't miss anything after all -- it may just be another symptom of the build problems.

> Otherwise, could you give me:
> * the version you applied the patch on

7dfd5cd21c0091e467b16b31a10e20bbedd0a836

> * os/compiler

Linux Kevin-Desktop 3.5.0-34-generic #55-Ubuntu SMP Thu Jun 6 20:18:19 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2

> Because I can't reproduce it, despite some playing around...

Maybe if you can reproduce the build problems I'm seeing....

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-06-24 06:44:53 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>
>>> make maintainer-clean ; ./configure --prefix=$PWD/Debug --enable-debug
>>> --enable-cassert --enable-depend --with-libxml --with-libxslt --with-openssl
>>> --with-perl --with-python && make -j4 world
>
>>> [ build failure referencing pg_receivellog.o ]
>
>> I have seen that once as well. It's really rather strange since
>> pg_receivellog.o is a clear prerequisite for pg_receivellog. I couldn't
>> reproduce it reliably though, even after doing some dozen rebuilds or so.
>>
>>> It works with this patch-on-patch:
>
>>> clean distclean maintainer-clean:
>>> - rm -f pg_basebackup$(X) pg_receivexlog$(X) $(OBJS) pg_basebackup.o
>>> pg_receivexlog.o pg_receivellog.o
>>> + rm -f pg_basebackup$(X) pg_receivexlog$(X) pg_receivellog$(X) $(OBJS)
>>> pg_basebackup.o pg_receivexlog.o pg_receivellog.o
>
>>>> + rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
>>> Oops. That part is not needed.
>>
>> Hm. Why not?
>
> Well, I could easily be wrong on just about anything to do with
> make files, but on a second look that appeared to be dealing with
> eliminating an installed pg_receivellog binary, which is not
> created.

I think it actually is?

install: all installdirs
	$(INSTALL_PROGRAM) pg_basebackup$(X) '$(DESTDIR)$(bindir)/pg_basebackup$(X)'
	$(INSTALL_PROGRAM) pg_receivexlog$(X) '$(DESTDIR)$(bindir)/pg_receivexlog$(X)'
	$(INSTALL_PROGRAM) pg_receivellog$(X) '$(DESTDIR)$(bindir)/pg_receivellog$(X)'

>> I don't think either hunk has anything to do with that build failure
>> though - can you reproduce the error without?
>
> I tried that scenario three times and it failed three times. Then
> I made the above changes and it worked. Then I eliminated the one
> on the uninstall target and tried a couple more times, and it worked
> on both attempts. The scenario is to have a `make world` build in
> the source tree, and run the above line starting with `make
> maintainer-clean` and going to `make -j4 world`.

Hm. I think it might be something in make's intermediate target logic biting us. Anyway, if the patch fixes that: Great ;). Merged it locally since it's obviously missing.

> I did notice that without that change to the maintainer-clean
> target I did not get a pg_receivellog.Po file in
> src/bin/pg_basebackup/.deps/ -- and with it I do.

Yea, according to your log it's not even built before pg_receivellog is linked.

Thanks,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-06-24 06:44:53 -0700, Kevin Grittner wrote:
>> Andres Freund <andres@2ndquadrant.com> wrote:
>>> On 2013-06-23 08:27:32 -0700, Kevin Grittner wrote:
>>>>> + rm -f '$(DESTDIR)$(bindir)/pg_receivellog$(X)'
>>>> Oops. That part is not needed.
>>>
>>> Hm. Why not?
>>
>> Well, I could easily be wrong on just about anything to do with
>> make files, but on a second look that appeared to be dealing with
>> eliminating an installed pg_receivellog binary, which is not
>> created.
>
> I think it actually is?

Oh, yeah.... I see it now. I warned you I could be wrong. :-/

I just had a thought -- perhaps the dependency information is being calculated incorrectly. Attached is the dependency file from the successful build (with the adjusted Makefile), which still fails the test_logical_decoding regression test, with the same diff.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment
On 2013-06-24 07:29:43 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2013-06-23 10:32:05 -0700, Kevin Grittner wrote:
>
>>> The contrib/test_logical_decoding/sql/ddl.sql script is generating
>>> unexpected results. For both table_with_pkey and
>>> table_with_unique_not_null, updates of the primary key column are
>>> showing:
>>>
>>> old-pkey: id[int4]:0
>>>
>>> ... instead of the expected value of 2 or -2.
>>>
>>> See attached.
>>
>> Hm. Any chance this was an incomplete rebuild?

Hm. There were some issues with the test_logical_decoding Makefile not cleaning up the regression installation properly. Which might have caused the issue.

Could you try after applying the patches and executing a clean and then rebuild?

Otherwise, could you try applying my git tree so we are sure we test the same thing?

$ git remote add af git://git.postgresql.org/git/users/andresfreund/postgres.git
$ git fetch af
$ git checkout -b xlog-decoding af/xlog-decoding-rebasing-cf4
$ ./configure ...
$ make

>> Because I can't reproduce it, despite some playing around...
>
> Maybe if you can reproduce the build problems I'm seeing....

Tried your recipe but still couldn't...

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
I'm looking at the combined patches 0003-0005, which are essentially all about adding a function to obtain a relation OID from (tablespace, filenode). It takes care to look through the relation mapper, and uses a new syscache underneath for performance.

One question about this patch, originally, was about the usage of that relfilenode syscache. It is questionable because it would be the only syscache to apply on top of a non-unique index. It is said that this doesn't matter because the only non-unique values that can exist would reference entries that have relfilenode = 0; and in turn this doesn't matter because those values would be queried through the relation mapper anyway, not from the syscache. (This is implemented in the higher-level function.)

This means that there would be one syscache that is damn easy to misuse .. and we've set up things so that syscaches are very easy to use in the first place. From that perspective, this doesn't look good. However, it's an easy mistake to notice and fix, so perhaps this is not a serious problem. (I would much prefer for there to be a way to define partial indexes in BKI.)

I'm not sure about the placing of the new SQL-callable function in dbsize.c either. It is certainly not a function that has anything to do with object sizes. The insides of it would belong more in lsyscache.c, I think, except then that file does not otherwise concern itself with the relation mapper, so its scope would have to expand a bit. But this is no place for the SQL-callable portion, so that would have to find a different home as well.

The other option, of course, is to provide a separate caching layer for these objects altogether, but given how concise this implementation is, it doesn't sound too palatable.

Thoughts?

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
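For readers following along, the control flow Alvaro describes might be sketched like this. The names follow the patch titles (the RELFILENODE syscache from 0003, RelationMapFilenodeToOid() from 0004), but the body is reconstructed from the thread's description, not copied from the patches, and the RelationMapFilenodeToOid() signature is an assumption:

/* Sketch of the higher-level lookup: mapped relations (which store
 * relfilenode = 0 in pg_class) are resolved via the relation mapper;
 * everything else goes through the new syscache.  Reconstructed from
 * this thread's description, not lifted from the patches. */
#include "postgres.h"
#include "catalog/pg_tablespace.h"
#include "utils/relmapper.h"
#include "utils/syscache.h"

Oid
relation_oid_by_filenode(Oid reltablespace, Oid relfilenode)
{
	Oid			relid;

	/* shared catalogs live in pg_global; everything else is per-DB */
	relid = RelationMapFilenodeToOid(relfilenode,
									 reltablespace == GLOBALTABLESPACE_OID);
	if (OidIsValid(relid))
		return relid;

	/* Not mapped: a single syscache probe.  Note this is never called
	 * with relfilenode = 0, the only value that is non-unique
	 * underneath the cache's index, which is the misuse hazard Alvaro
	 * describes. */
	return GetSysCacheOid2(RELFILENODE,
						   ObjectIdGetDatum(reltablespace),
						   ObjectIdGetDatum(relfilenode));
}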
Hi,

On 2013-06-27 17:33:04 -0400, Alvaro Herrera wrote:
> One question about this patch, originally, was about the usage of
> that relfilenode syscache. It is questionable because it would be the
> only syscache to apply on top of a non-unique index. It is said that
> this doesn't matter because the only non-unique values that can exist
> would reference entries that have relfilenode = 0; and in turn this
> doesn't matter because those values would be queried through the
> relation mapper anyway, not from the syscache. (This is implemented in
> the higher-level function.)

Well, you can even query the syscache without hurt for mapped relations, you just won't get an answer. The only thing you may not do, because it would yield multiple results, is to query the syscache with (tablespace, InvalidOid/0). Which is still not nice, although it doesn't make much sense to query with InvalidOid.

> I'm not sure about the placing of the new SQL-callable function in
> dbsize.c either. It is certainly not a function that has anything to do
> with object sizes.

Not happy with that myself. I only placed the function there because pg_relation_filenode() already was in it. Happy to change if somebody has a good idea.

> (I would much prefer for there to be a way to define partial
> indexes in BKI.)

I don't think that's the hard part; it's that we don't use the full machinery for updating indexes but rather the relatively simplistic CatalogUpdateIndexes(). I am not sure we can guarantee that the required infrastructure is available in all the cases to support doing generic predicate evaluation. Should bki really be the problem, we probably could create the index after bki-based bootstrapping finished.

Thanks,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
> On 2013-06-27 17:33:04 -0400, Alvaro Herrera wrote:
>> One question about this patch, originally, was about the usage of
>> that relfilenode syscache. It is questionable because it would be the
>> only syscache to apply on top of a non-unique index. It is said that
>> this doesn't matter because the only non-unique values that can exist
>> would reference entries that have relfilenode = 0; and in turn this
>> doesn't matter because those values would be queried through the
>> relation mapper anyway, not from the syscache. (This is implemented in
>> the higher-level function.)
>
> Well, you can even query the syscache without hurt for mapped relations,
> you just won't get an answer. The only thing you may not do, because it
> would yield multiple results, is to query the syscache with
> (tablespace, InvalidOid/0). Which is still not nice, although it doesn't
> make much sense to query with InvalidOid.

Yeah, I agree that it doesn't make sense to query for that. The problem is that something could reasonably be developed that uses the syscache directly without checking whether the relfilenode is 0.

>> (I would much prefer for there to be a way to define partial
>> indexes in BKI.)
>
> I don't think that's the hard part; it's that we don't use the full
> machinery for updating indexes but rather the relatively simplistic
> CatalogUpdateIndexes(). I am not sure we can guarantee that the required
> infrastructure is available in all the cases to support doing generic
> predicate evaluation.

You're right, CatalogIndexInsert() doesn't allow for predicates, so fixing BKI would not help.

I still wonder about having a separate cache. Right now pg_class has two indexes; adding this new one would mean a rather large decrease in insert performance (50% more indexes to update than previously), which is not good considering that it's inserted into for each and every temp table creation -- that would become slower. This would be a net loss for every user, even those that don't want logical replication. On the other hand, table creation also has to add tuples to pg_attribute, pg_depend, pg_shdepend and maybe other catalogs, so perhaps the difference is negligible.

--
Álvaro Herrera                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> I'm looking at the combined patches 0003-0005, which are essentially all
> about adding a function to obtain relation OID from (tablespace,
> filenode). It takes care to look through the relation mapper, and uses
> a new syscache underneath for performance.

> One question about this patch, originally, was about the usage of
> that relfilenode syscache. It is questionable because it would be the
> only syscache to apply on top of a non-unique index.

... which, I assume, is on top of a pg_class index that doesn't exist today. Exactly what is the argument that says performance of this function is sufficiently critical to justify adding both the maintenance overhead of a new pg_class index, *and* a broken-by-design syscache?

Lose the cache and this probably gets a lot easier to justify. As is, I think I'd vote to reject altogether.

			regards, tom lane
On Thu, Jun 27, 2013 at 6:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I'm looking at the combined patches 0003-0005, which are essentially all
>> about adding a function to obtain relation OID from (tablespace,
>> filenode). It takes care to look through the relation mapper, and uses
>> a new syscache underneath for performance.
>
>> One question about this patch, originally, was about the usage of
>> that relfilenode syscache. It is questionable because it would be the
>> only syscache to apply on top of a non-unique index.
>
> ... which, I assume, is on top of a pg_class index that doesn't exist
> today. Exactly what is the argument that says performance of this
> function is sufficiently critical to justify adding both the maintenance
> overhead of a new pg_class index, *and* a broken-by-design syscache?
>
> Lose the cache and this probably gets a lot easier to justify. As is,
> I think I'd vote to reject altogether.

I already voted that way, and nothing's happened since to change my mind.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Andres Freund <andres@2ndquadrant.com> wrote:

> Hm. There were some issues with the test_logical_decoding
> Makefile not cleaning up the regression installation properly.
> Which might have caused the issue.
>
> Could you try after applying the patches and executing a clean
> and then rebuild?

Tried, and problem persists.

> Otherwise, could you try applying my git tree so we are sure we
> test the same thing?
>
> $ git remote add af git://git.postgresql.org/git/users/andresfreund/postgres.git
> $ git fetch af
> $ git checkout -b xlog-decoding af/xlog-decoding-rebasing-cf4
> $ ./configure ...
> $ make

Tried that, too, and problem persists. The log shows the last commit on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.

Because you mention possible problems with the regression test cleanup for test_logical_decoding I also tried:

rm -fr contrib/test_logical_decoding/
git reset --hard HEAD
make world
make check-world

I get the same failure, with the primary key or unique index column showing as 0 in the results.

I am off on vacation tomorrow and next week. Will dig into this with gdb if not solved when I get back -- unless you have a better suggestion for how to figure it out.

Once this is solved, I will be working with testing the final result of all these layers, including creating a second output plugin. I want to confirm that multiple plugins play well together. I'm glad to see other eyes also on this patch set.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-06-27 18:18:50 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> I'm looking at the combined patches 0003-0005, which are essentially all
>> about adding a function to obtain relation OID from (tablespace,
>> filenode). It takes care to look through the relation mapper, and uses
>> a new syscache underneath for performance.
>
>> One question about this patch, originally, was about the usage of
>> that relfilenode syscache. It is questionable because it would be the
>> only syscache to apply on top of a non-unique index.
>
> ... which, I assume, is on top of a pg_class index that doesn't exist
> today. Exactly what is the argument that says performance of this
> function is sufficiently critical to justify adding both the maintenance
> overhead of a new pg_class index, *and* a broken-by-design syscache?

Ok, so this requires some context. When we do the changeset extraction we build an mvcc snapshot that, for every heap wal record, is consistent with one made at the time the record was inserted.

Then, when we've built that snapshot, we can use it to turn heap wal records into the representation the user wants: For that we first need to know which table a change comes from, since otherwise we obviously cannot interpret the HeapTuple that's essentially contained in the wal record. Since we have a correct mvcc snapshot we can query pg_class for (tablespace, relfilenode) to get back the relation.

When we know the relation, the user (i.e. the output plugin) can use normal backend code to transform the HeapTuple into the target representation, e.g. SQL, since we can build a TupleDesc. Since the syscaches are synchronized with the built snapshot, normal output functions can be used.

What that means is that for every heap record in the target database in the WAL we need to query pg_class to turn the relfilenode into a pg_class.oid. So, we can easily replace syscache.c with some custom caching code, but I don't think it's realistic to get rid of that index. Otherwise we need to cache the entire pg_class in memory, which doesn't sound enticing.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
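To illustrate what "normal backend code" buys once the Relation is in hand, this is roughly how a plugin can render a tuple as text. It is a sketch of the principle, not test_decoding's actual code:

/* Sketch: once (tablespace, relfilenode) has been resolved to a
 * Relation, rendering the tuple is ordinary backend code.  This
 * illustrates the principle; it is not test_decoding's actual code. */
#include "postgres.h"
#include "access/htup_details.h"
#include "fmgr.h"
#include "lib/stringinfo.h"
#include "utils/builtins.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"

static void
tuple_to_text(StringInfo out, Relation rel, HeapTuple tuple)
{
	TupleDesc	desc = RelationGetDescr(rel);
	int			i;

	for (i = 0; i < desc->natts; i++)
	{
		Form_pg_attribute attr = desc->attrs[i];
		Datum		val;
		bool		isnull;
		Oid			outfunc;
		bool		isvarlena;

		if (attr->attisdropped)
			continue;

		/* e.g. " id[int4]:" */
		appendStringInfo(out, " %s[%s]:",
						 NameStr(attr->attname),
						 format_type_be(attr->atttypid));

		val = heap_getattr(tuple, i + 1, desc, &isnull);
		if (isnull)
			appendStringInfoString(out, "(null)");
		else
		{
			/* works because decoding installed a catalog snapshot
			 * consistent with the WAL record being decoded */
			getTypeOutputInfo(attr->atttypid, &outfunc, &isvarlena);
			appendStringInfoString(out,
								   OidOutputFunctionCall(outfunc, val));
		}
	}
}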
On Fri, Jun 28, 2013 at 3:32 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> What that means is that for every heap record in the target database in
> the WAL we need to query pg_class to turn the relfilenode into a
> pg_class.oid. So, we can easily replace syscache.c with some custom
> caching code, but I don't think it's realistic to get rid of that
> index. Otherwise we need to cache the entire pg_class in memory which
> doesn't sound enticing.

The alternative I previously proposed was to make the WAL records carry the relation OID. There are a few problems with that: one is that it's a waste of space when logical replication is turned off, and it might not be easy to only do it when logical replication is on. Also, even when logical replication is turned on, things that make WAL bigger aren't wonderful. On the other hand, it does avoid the overhead of another index on pg_class.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-06-28 08:41:46 -0400, Robert Haas wrote:
> On Fri, Jun 28, 2013 at 3:32 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> What that means is that for every heap record in the target database in
>> the WAL we need to query pg_class to turn the relfilenode into a
>> pg_class.oid. So, we can easily replace syscache.c with some custom
>> caching code, but I don't think it's realistic to get rid of that
>> index. Otherwise we need to cache the entire pg_class in memory which
>> doesn't sound enticing.
>
> The alternative I previously proposed was to make the WAL records
> carry the relation OID. There are a few problems with that: one is
> that it's a waste of space when logical replication is turned off, and
> it might not be easy to only do it when logical replication is on.
> Also, even when logical replication is turned on, things that make WAL
> bigger aren't wonderful. On the other hand, it does avoid the
> overhead of another index on pg_class.

I personally favor making catalog modifications a bit more expensive instead of increasing the WAL volume during routine operations. I don't think index maintenance itself comes close to the biggest cost for DDL we have atm. It also increases the modifications needed to important heap_* functions, which doesn't make me happy.

How do others see this tradeoff?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:
> Tried that, too, and problem persists. The log shows the last
> commit on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.

> I get the same failure, with primary key or unique index column
> showing as 0 in results.

I have run enough iterations of the test suite locally now that I am confident it's not just happenstance that I don't see this :/. I am going to clone your environment as closely as I can to see where the issue might be, as well as going over those codepaths...

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 6/28/13 8:46 AM, Andres Freund wrote:
> I personally favor making catalog modifications a bit more
> expensive instead of increasing the WAL volume during routine
> operations. I don't think index maintenance itself comes close to the
> biggest cost for DDL we have atm.

That makes sense to me in principle.
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-06-28 08:41:46 -0400, Robert Haas wrote:
>> The alternative I previously proposed was to make the WAL records
>> carry the relation OID. There are a few problems with that: one is
>> that it's a waste of space when logical replication is turned off, and
>> it might not be easy to only do it when logical replication is on.
>> Also, even when logical replication is turned on, things that make WAL
>> bigger aren't wonderful. On the other hand, it does avoid the
>> overhead of another index on pg_class.

> I personally favor making catalog modifications a bit more
> expensive instead of increasing the WAL volume during routine
> operations.

This argument is nonsense, since it conveniently ignores the added WAL entries created as a result of additional pg_class index manipulations.

Robert's idea sounds fairly reasonable to me; another 4 bytes per insert/update/delete WAL entry isn't that big a deal, and it would probably ease many debugging tasks as well as what you want to do. So I'd vote for including the rel OID all the time, not conditionally.

The real performance argument against the patch as you have it is that it saddles every PG installation with extra overhead for pg_class updates whether or not that installation ever has or ever will make use of changeset generation --- unlike including rel OIDs in WAL entries, which might be merely difficult to handle conditionally, it's flat-out impossible to turn such an index on or off. Moreover, even if one is using changeset generation, the overhead is being imposed at the wrong place, ie the master not the slave doing changeset extraction.

But that's not the only problem, nor even the worst one IMO. I said before that a syscache with a non-unique key is broken by design, and I stand by that estimate. Even assuming that this usage doesn't create bugs in the code as it stands, it might well foreclose future changes or optimizations that we'd like to make in the catcache code.

If you don't want to change WAL contents, what I think you should do is create a new cache mechanism (perhaps by extending the relmapper) that caches relfilenode to OID lookups and acts entirely inside the changeset-generating slave. Hacking up the catcache instead of doing that is an expedient kluge that will come back to bite us.

			regards, tom lane
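For concreteness, the kind of backend-private cache Tom suggests could be sketched on top of dynahash along these lines. All names here are hypothetical, and cache invalidation (flushing entries when pg_class changes) is waved away in a comment:

/* Sketch of a private relfilenode -> OID cache local to the decoding
 * backend, instead of a syscache.  Hypothetical names throughout;
 * invalidation handling is omitted. */
#include "postgres.h"
#include "utils/hsearch.h"

typedef struct RelfilenodeMapKey
{
	Oid			reltablespace;
	Oid			relfilenode;
} RelfilenodeMapKey;

typedef struct RelfilenodeMapEntry
{
	RelfilenodeMapKey key;		/* hash key; must be first */
	Oid			relid;
} RelfilenodeMapEntry;

static HTAB *RelfilenodeMapHash = NULL;

Oid
lookup_relid_by_filenode(Oid reltablespace, Oid relfilenode)
{
	RelfilenodeMapKey key;
	RelfilenodeMapEntry *entry;
	bool		found;

	if (RelfilenodeMapHash == NULL)
	{
		HASHCTL		ctl;

		MemSet(&ctl, 0, sizeof(ctl));
		ctl.keysize = sizeof(RelfilenodeMapKey);
		ctl.entrysize = sizeof(RelfilenodeMapEntry);
		ctl.hash = tag_hash;
		RelfilenodeMapHash =
			hash_create("RelfilenodeMap", 1024, &ctl,
						HASH_ELEM | HASH_FUNCTION);
	}

	MemSet(&key, 0, sizeof(key));
	key.reltablespace = reltablespace;
	key.relfilenode = relfilenode;

	entry = hash_search(RelfilenodeMapHash, &key, HASH_ENTER, &found);
	if (!found)
	{
		/* Cache miss: fall back to the relation mapper or a scan of
		 * pg_class (hypothetical helper).  A real implementation would
		 * also register invalidation callbacks to flush entries when
		 * pg_class changes. */
		entry->relid = scan_pg_class_for_filenode(reltablespace,
												  relfilenode);
	}
	return entry->relid;
}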
On 2013-06-28 10:49:26 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On 2013-06-28 08:41:46 -0400, Robert Haas wrote:
>>> The alternative I previously proposed was to make the WAL records
>>> carry the relation OID. There are a few problems with that: one is
>>> that it's a waste of space when logical replication is turned off, and
>>> it might not be easy to only do it when logical replication is on.
>>> Also, even when logical replication is turned on, things that make WAL
>>> bigger aren't wonderful. On the other hand, it does avoid the
>>> overhead of another index on pg_class.
>
>> I personally favor making catalog modifications a bit more
>> expensive instead of increasing the WAL volume during routine
>> operations.
>
> This argument is nonsense, since it conveniently ignores the added WAL
> entries created as a result of additional pg_class index manipulations.

Huh? Sure, pg_class manipulations get more expensive. But in most clusters pg_class modifications are by far the minority compared to the rest of the updates performed.

> Robert's idea sounds fairly reasonable to me; another 4 bytes per
> insert/update/delete WAL entry isn't that big a deal, and it would
> probably ease many debugging tasks as well as what you want to do.
> So I'd vote for including the rel OID all the time, not conditionally.

Ok, I can sure live with that. I don't think it's a problem to make it conditional if we want to. Making it unconditional would sure make WAL debugging in general more pleasant though.

> The real performance argument against the patch as you have it is that
> it saddles every PG installation with extra overhead for pg_class
> updates whether or not that installation ever has or ever will make use
> of changeset generation --- unlike including rel OIDs in WAL entries,
> which might be merely difficult to handle conditionally, it's flat-out
> impossible to turn such an index on or off. Moreover, even if one is
> using changeset generation, the overhead is being imposed at the wrong
> place, ie the master not the slave doing changeset extraction.

There are no required slaves for doing changeset extraction anymore. Various people opposed that pretty violently, so it's now all happening on the master. Which IMHO turned out to be the right decision. We can do it on Hot Standby nodes, but it's absolutely not required.

> But that's not the only problem, nor even the worst one IMO. I said
> before that a syscache with a non-unique key is broken by design, and
> I stand by that estimate. Even assuming that this usage doesn't create
> bugs in the code as it stands, it might well foreclose future changes or
> optimizations that we'd like to make in the catcache code.

Since the only duplicate key that can possibly occur in that cache is InvalidOid, I wondered whether we could define a 'filter' that prohibits those ending up in the cache? Then the cache would be unique.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 28, 2013 at 10:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert's idea sounds fairly reasonable to me; another 4 bytes per
> insert/update/delete WAL entry isn't that big a deal, ...

How big a deal is it? This is a serious question, because I don't know. Let's suppose that the average size of an XLOG_HEAP_INSERT record is 100 bytes. Then if we add 4 bytes, isn't that a 4% overhead? And doesn't that seem significant? I'm just talking out of my rear end here because I don't know what the real numbers are, but it's far from obvious to me that there's any free lunch here.

That having been said, just because indexing relfilenode or adding relfilenodes to WAL records is expensive doesn't mean we shouldn't do it. But I think we need to know the price tag before we can judge whether to make the purchase.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas escribió:
> On Fri, Jun 28, 2013 at 10:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Robert's idea sounds fairly reasonable to me; another 4 bytes per
> > insert/update/delete WAL entry isn't that big a deal, ...
>
> How big a deal is it? This is a serious question, because I don't
> know. Let's suppose that the average size of an XLOG_HEAP_INSERT
> record is 100 bytes. Then if we add 4 bytes, isn't that a 4%
> overhead? And doesn't that seem significant?

An INSERT wal record is:

    typedef struct xl_heap_insert
    {
        xl_heaptid target;          /* inserted tuple id */
        bool all_visible_cleared;   /* PD_ALL_VISIBLE was cleared */
        /* xl_heap_header & TUPLE DATA FOLLOWS AT END OF STRUCT */
    } xl_heap_insert;

    typedef struct xl_heap_header
    {
        uint16 t_infomask2;
        uint16 t_infomask;
        uint8  t_hoff;
    } xl_heap_header;

So the fixed part is just 7 bytes + 5 bytes; tuple data follows that. So adding four more bytes could indeed be significant (but by how much depends on the size of the tuple data).

Adding a new pg_class index would be larger in the sense that there are more WAL records, and there's the extra vacuuming traffic; but on the other hand that would only happen when tables are created. It seems safe to assume that in normal use cases the ratio of tuple insertion vs. table creation is large.

The only idea that springs to mind is to have the new pg_class index be created conditionally, i.e. only when logical replication is going to be used.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió:

> An INSERT wal record is:
>
> typedef struct xl_heap_insert
> {
>     xl_heaptid target;          /* inserted tuple id */
>     bool all_visible_cleared;   /* PD_ALL_VISIBLE was cleared */
>     /* xl_heap_header & TUPLE DATA FOLLOWS AT END OF STRUCT */
> } xl_heap_insert;

Oops. xl_heaptid is not 6 bytes, but instead:

    typedef struct xl_heaptid
    {
        RelFileNode node;
        ItemPointerData tid;
    } xl_heaptid;

    typedef struct RelFileNode
    {
        Oid spcNode;
        Oid dbNode;
        Oid relNode;
    } RelFileNode;              /* 12 bytes */

    typedef struct ItemPointerData
    {
        BlockIdData ip_blkid;
        OffsetNumber ip_posid;
    };                          /* 6 bytes */

    typedef struct BlockIdData
    {
        uint16 bi_hi;
        uint16 bi_lo;
    } BlockIdData;              /* 4 bytes */

    typedef uint16 OffsetNumber;

There's purposely no alignment padding anywhere, so xl_heaptid totals 18 bytes. Therefore,

> So the fixed part is just 19 bytes + 5 bytes; tuple data follows that.
> So adding four more bytes could indeed be significant (but by how much
> depends on the size of the tuple data).

4 extra bytes on top of 24 is 17% of added overhead (considering only the fixed part of the record, not the tuple data.)

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas <robertmhaas@gmail.com> writes:
> I'm just talking out of my rear end here because I don't know what the
> real numbers are, but it's far from obvious to me that there's any
> free lunch here. That having been said, just because indexing
> relfilenode or adding relfilenodes to WAL records is expensive doesn't
> mean we shouldn't do it. But I think we need to know the price tag
> before we can judge whether to make the purchase.

Certainly, any of these solutions are going to cost us somewhere --- either up-front cost or more expensive (and less reliable?) changeset extraction, take your choice. I will note that somehow tablespaces got put in despite having to add 4 bytes to every WAL record for that feature, which was probably of less use than logical changeset extraction will be.

But to tell the truth, I'm mostly exercised about the non-unique syscache. I think that's simply a *bad* idea.

			regards, tom lane
On Fri, Jun 28, 2013 at 11:56 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I'm just talking out of my rear end here because I don't know what the
>> real numbers are, but it's far from obvious to me that there's any
>> free lunch here. That having been said, just because indexing
>> relfilenode or adding relfilenodes to WAL records is expensive doesn't
>> mean we shouldn't do it. But I think we need to know the price tag
>> before we can judge whether to make the purchase.
>
> Certainly, any of these solutions are going to cost us somewhere ---
> either up-front cost or more expensive (and less reliable?) changeset
> extraction, take your choice. I will note that somehow tablespaces got
> put in despite having to add 4 bytes to every WAL record for that
> feature, which was probably of less use than logical changeset
> extraction will be.

Right. I actually think we booted that one. The database ID is a constant for most people. The tablespace ID is not technically redundant, but in 99.99% of cases you could figure it out from the database ID + relation ID. The relation ID is where 99% of the entropy is, but it probably only has 8-16 bits of entropy in most real-world use cases. If we were doing this over we might want to think about storing a proxy for the relfilenode rather than the relfilenode itself, but there's not much good crying over it now.

> But to tell the truth, I'm mostly exercised about the non-unique
> syscache. I think that's simply a *bad* idea.

+1.

I don't think the extra index on pg_class is going to hurt that much, even if we create it always, as long as we use a purpose-built caching mechanism for it rather than forcing it through catcache. The people who are going to suffer are the ones who create and drop a lot of temporary tables, but even there I'm not sure how visible the overhead will be on real-world workloads, and maybe the solution is to work towards not having permanent catalog entries for temporary tables in the first place. In any case, hurting people who use temporary tables heavily seems better than adding overhead to every insert/update/delete operation, which will hit all users who are not read-only.

On the other hand, I can't entirely shake the feeling that adding the information into WAL would be more reliable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On the other hand, I can't entirely shake the feeling that adding the
> information into WAL would be more reliable.

That feeling has been nagging at me too. I can't demonstrate that there's a problem when an ALTER TABLE is in process of rewriting a table into a new relfilenode number, but I don't have a warm fuzzy feeling about the reliability of reverse lookups for this. At the very least it's going to require some hard-to-verify restriction about how we can't start doing changeset reconstruction in the middle of a transaction that's doing DDL.

			regards, tom lane
On 28 June 2013 17:10, Robert Haas <robertmhaas@gmail.com> wrote:
>> But to tell the truth, I'm mostly exercised about the non-unique
>> syscache. I think that's simply a *bad* idea.
>
> +1.
> I don't think the extra index on pg_class is going to hurt that much,
> even if we create it always, as long as we use a purpose-built caching
> mechanism for it rather than forcing it through catcache.
Hmm, does seem like that would be better.
> The people
> who are going to suffer are the ones who create and drop a lot of
> temporary tables, but even there I'm not sure how visible the overhead
> will be on real-world workloads, and maybe the solution is to work
> towards not having permanent catalog entries for temporary tables in
> the first place. In any case, hurting people who use temporary tables
> heavily seems better than adding overhead to every
> insert/update/delete operation, which will hit all users who are not
> read-only.
Thinks...
If we added a trigger that fired a NOTIFY for any new row in pg_class relating to a non-temporary relation, that would optimise away any overhead for temporary tables or for when no changeset extraction is in progress.
The changeset extraction could build a private hash table to perform the lookup and then LISTEN on a specific channel for changes.
That might work better than an index-plus-syscache.
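Sketched in SQL, the idea might look roughly like this (illustration only: PostgreSQL does not currently allow triggers on system catalogs, and all names here are invented):

    -- hypothetical; a trigger on pg_class is not actually permitted today
    CREATE FUNCTION notify_new_relation() RETURNS trigger AS $$
    BEGIN
        IF NEW.relpersistence <> 't' THEN   -- skip temporary relations
            PERFORM pg_notify('pg_class_changes', NEW.relname);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;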
> On the other hand, I can't entirely shake the feeling that adding the
> information into WAL would be more reliable.
I don't really like the idea of requiring the relid on the WAL record. WAL is big enough already and we want people to turn this on, not avoid it.
This is just an index lookup. We do them all the time without any fear of reliability issues.
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-28 12:26:52 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On the other hand, I can't entirely shake the feeling that adding the
> > information into WAL would be more reliable.
>
> That feeling has been nagging at me too. I can't demonstrate that
> there's a problem when an ALTER TABLE is in process of rewriting a table
> into a new relfilenode number, but I don't have a warm fuzzy feeling
> about the reliability of reverse lookups for this.

I am pretty sure the mapping thing works, but it indeed requires some complexity. And it's harder to debug, because when you want to understand what's going on, the relfilenodes involved aren't in the catalog anymore.

> At the very least
> it's going to require some hard-to-verify restriction about how we
> can't start doing changeset reconstruction in the middle of a
> transaction that's doing DDL.

Currently changeset extraction needs to wait (and does so) until it has found a point where it has seen the start of all in-progress transactions. All transactions that *commit* after the last partially observed in-progress transaction finished can be decoded. To make that point visible for external tools to synchronize -- e.g. pg_dump -- it exports the snapshot of exactly the moment when that last in-progress transaction committed.

So, from what I gather there's a slight leaning towards *not* storing the relation's oid in the WAL. Which means the concerns about the uniqueness issues with the syscaches need to be addressed. So far I know of three solutions:

1) Develop a custom caching/mapping module.
2) Make sure InvalidOid's (the only possible duplicate) can't end up in the syscache, by adding a hook that prevents that on the catcache level.
3) Make sure that there can't be any duplicates, by storing the oid of the relation in a mapped relation's relfilenode.

Opinions?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
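For reference, an external tool could adopt the snapshot exported above using SET TRANSACTION SNAPSHOT; a minimal sketch, where the snapshot identifier is made up:

    BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    SET TRANSACTION SNAPSHOT '000003A1-1';  -- identifier as exported by the slot (invented here)
    -- copy out the initial table contents; everything committed after
    -- this snapshot arrives through the changestream instead
    COMMIT;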
Since this discussion seems to have stalled, let me do a quick summary.

The goal of this subset of patches is to allow retroactive lookup of relations starting from a WAL record. Currently, the WAL record only tracks the relfilenode that it affects, so there are two possibilities:

1. we add some way to find out the relation OID from the relfilenode, or
2. we augment the WAL record with the relation OID.

Each solution has its drawbacks. For the former,

* we need a new cache
* we need a new pg_class index
* looking up the relation OID still requires some CPU runtime, and memory to keep the caches in; running invalidations, etc.

For the latter,

* each WAL record becomes somewhat bigger. For WAL records with a payload of 25 bytes (say, an inserted tuple which is 25 bytes long) this means about 7% overhead.

There are some other issues, but these can be solved. For instance, Tom doesn't want a syscache on top of a non-unique index, and I agree on that. But if we agree on this way forward, we can simply go a different route by keeping a separate cache layer.

So the question is, do we take the overhead of the new index (which means overhead on DDL operations -- supposedly rare) or do we take the overhead of larger WAL records (which means overhead on all DML operations)? Note we can make either thing apply only to people running logical replication.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> So the question is, do we take the overhead of the new index (which
> means overhead on DDL operations -- supposedly rare) or do we take the
> overhead of larger WAL records (which means overhead on all DML
> operations)?

> Note we can make either thing apply only to people running logical
> replication.

I don't believe you can have or not have an index on pg_class as easily as all that. The choice would have to be frozen at initdb time, so people would have to pay the overhead if they thought there was even a small possibility that they'd want logical replication later.

Flipping the content of WAL records might not be a terribly simple thing to do either, but at least in principle it could be done during a postmaster restart, without initdb.

			regards, tom lane
On 2013-07-01 14:16:55 -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > So the question is, do we take the overhead of the new index (which
> > means overhead on DDL operations -- supposedly rare) or do we take the
> > overhead of larger WAL records (which means overhead on all DML
> > operations)?
>
> I don't believe you can have or not have an index on pg_class as easily
> as all that. The choice would have to be frozen at initdb time, so
> people would have to pay the overhead if they thought there was even a
> small possibility that they'd want logical replication later.

It should be possible to create the index in a single database when we start logical replication in that database? Running the index creation with a fixed oid shouldn't require too much code. The oid won't be reused by other pg_class entries since it would be a system one.

Alternatively, we could always create the index's pg_class/pg_index entries but mark it as !indislive when logical replication isn't active for that database. Then activating it would just require rebuilding that index. But then, I am not fully convinced that's worth the trouble, since I don't think pg_class index maintenance is the pain spot in DDL atm.

> Flipping the content of WAL records might not be a terribly simple thing
> to do either, but at least in principle it could be done during a
> postmaster restart, without initdb.

The main patch combines various booleans in the heap wal records into a flags variable, so there should be enough space to keep track of it without increasing size. It makes size calculations a bit more annoying, though, as we use the xlog record length to calculate the heap tuple's length, but that's not a large problem. So we could just set the XLOG_HEAP_CONTAINS_CLASSOID flag if wal_level >= WAL_LEVEL_LOGICAL. WAL decoding can then throw a tantrum if it finds a record without it, and we're done. We could even make that per database, but that seems to be something for the future.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
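In code, the conditional flag just described could look roughly like the following; a minimal sketch where the bit value and the surrounding record-assembly code are assumptions (only the flag name and the wal_level test come from the discussion above):

    #define XLOG_HEAP_CONTAINS_CLASSOID   0x80    /* bit value assumed */

        /* while assembling an insert/update/delete WAL record: */
        if (wal_level >= WAL_LEVEL_LOGICAL)
        {
            xlrec.flags |= XLOG_HEAP_CONTAINS_CLASSOID;
            /* ... and append the table's pg_class OID to the payload ... */
        }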
On 27 June 2013 23:18, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Exactly what is the argument that says performance of this function is
> sufficiently critical to justify adding both the maintenance
> overhead of a new pg_class index, *and* a broken-by-design syscache?
I think we all agree on changing the syscache.
I'm not clear why adding a new permanent index to pg_class is such a problem. It's going to be a very thin index. I'm trying to imagine a use case that has pg_class index maintenance as a major part of its workload, and I can't. An extra index on pg_attribute and I might agree with you. The pg_class index would only be a noticeable % of catalog rows for very thin temp tables, and even then it would still be small; that isn't even necessary work, since we all agree that temp table overheads could and should be optimised away somewhere. So blocking a new index because of that sounds strange.
What issues do you foresee? How can we test them?
Or perhaps we should just add the index and see if we later discover a measurable problem workload?
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>
> > Hm. There were some issues with the test_logical_decoding
> > Makefile not cleaning up the regression installation properly.
> > Which might have caused the issue.
> >
> > Could you try after applying the patches and executing a clean
> > and then rebuild?
>
> Tried, and problem persists.
>
> > Otherwise, could you try applying my git tree so we are sure we
> > test the same thing?
> >
> > $ git remote add af git://git.postgresql.org/git/users/andresfreund/postgres.git
> > $ git fetch af
> > $ git checkout -b xlog-decoding af/xlog-decoding-rebasing-cf4
> > $ ./configure ...
> > $ make
>
> Tried that, too, and problem persists. The log shows the last
> commit on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.

Ok. I think I have a slight idea what's going on. Could you check whether recompiling with -O0 "fixes" the issue?

There's something strange going on here; I am not sure whether it's just a bug that's hidden by either not doing optimizations or by adding more elog()s, or whether it's a compiler bug.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-07-05 14:03:56 +0200, Andres Freund wrote:
> On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:
> > Tried that, too, and problem persists. The log shows the last
> > commit on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.
>
> Ok. I think I have a slight idea what's going on. Could you check
> whether recompiling with -O0 "fixes" the issue?
>
> There's something strange going on here; I am not sure whether it's just a
> bug that's hidden by either not doing optimizations or by adding more
> elog()s, or whether it's a compiler bug.

Ok. It was supreme stupidity on my end. Sorry for the time you spent on it.

Some versions of gcc (and probably other compilers) were removing sections of code when optimizing, because the code was doing undefined things: parts of the rdata chain were allocated locally inside an if (needs_key) block. Which obviously is utterly bogus... A warning would have been nice, though.

Fix pushed and attached.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
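Distilled, the bug pattern was roughly the following; a sketch with invented names, and the surrounding declarations elided:

    XLogRecData rdata[1];

    rdata[0].data = (char *) &xlrec;
    rdata[0].len = sizeof(xlrec);
    rdata[0].buffer = InvalidBuffer;
    rdata[0].next = NULL;

    if (needs_key)
    {
        XLogRecData keydata;        /* WRONG: lifetime ends with this block */

        keydata.data = (char *) &key;
        keydata.len = sizeof(key);
        keydata.buffer = InvalidBuffer;
        keydata.next = NULL;
        rdata[0].next = &keydata;   /* dangles once the block closes */
    }

    /* undefined behaviour: the optimizer may assume &keydata is dead here */
    XLogInsert(RM_HEAP2_ID, info, rdata);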
On 07/05/2013 08:03 AM, Andres Freund wrote:
> On 2013-06-27 21:52:03 -0700, Kevin Grittner wrote:
>> Tried that, too, and problem persists. The log shows the last commit
>> on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.
> Ok. I think I have a slight idea what's going on. Could you check
> whether recompiling with -O0 "fixes" the issue?
>
> There's something strange going on here; I am not sure whether it's just a
> bug that's hidden by either not doing optimizations or by adding more
> elog()s, or whether it's a compiler bug.

I am getting the same test failure Kevin is seeing. This is on an x64 Debian wheezy machine with gcc (Debian 4.7.2-5) 4.7.2.

Building with -O0 results in passing tests.
On 2013-07-05 09:28:45 -0400, Steve Singer wrote:
> I am getting the same test failure Kevin is seeing.
> This is on an x64 Debian wheezy machine with
> gcc (Debian 4.7.2-5) 4.7.2
>
> Building with -O0 results in passing tests.

Does the patch from http://archives.postgresql.org/message-id/20130705132513.GB11640%40awork2.anarazel.de or the git tree (which is rebased on top of the mvcc catalog commit from Robert, which needs some changes) fix it, even with optimizations?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 07/05/2013 09:34 AM, Andres Freund wrote:
> Does the patch from
> http://archives.postgresql.org/message-id/20130705132513.GB11640%40awork2.anarazel.de
> or the git tree (which is rebased on top of the mvcc catalog commit from
> Robert, which needs some changes) fix it, even with optimizations?

Yes, with your latest git tree the tests pass with -O2.
On 06/14/2013 06:51 PM, Andres Freund wrote:
> The git tree is at:
> git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4

We discussed issues related to passing options to the plugins a number of months ago (http://www.postgresql.org/message-id/20130129015732.GA24238@awork2.anarazel.de). I'm still having issues with the syntax you describe there.

START_LOGICAL_REPLICATION "1" 0/0 ("foo","bar")
unexpected termination of replication stream: ERROR: foo requires a parameter

START_LOGICAL_REPLICATION "1" 0/0 ("foo" "bar")
"START_LOGICAL_REPLICATION "1" 0/0 ("foo" "bar")": ERROR: syntax error

START_LOGICAL_REPLICATION "1" 0/0 ("foo")
works okay

Steve
On 2013-07-05 11:33:20 -0400, Steve Singer wrote:
> We discussed issues related to passing options to the plugins a number of
> months ago (http://www.postgresql.org/message-id/20130129015732.GA24238@awork2.anarazel.de).
> I'm still having issues with the syntax you describe there.
>
> START_LOGICAL_REPLICATION "1" 0/0 ("foo","bar")
> unexpected termination of replication stream: ERROR: foo requires a parameter

I'd guess that's coming from your output plugin? You're using defGetString() on a DefElem without a value?

> START_LOGICAL_REPLICATION "1" 0/0 ("foo" "bar")

Yes, the option *names* are identifiers, together with plugin & slot names. The passed values need to be SCONSTs atm (src/backend/replication/repl_gram.y):

    plugin_opt_elem:
            IDENT plugin_opt_arg
                { $$ = makeDefElem($1, $2); }
            ;

    plugin_opt_arg:
            SCONST        { $$ = (Node *) makeString($1); }
            | /* EMPTY */ { $$ = NULL; }
            ;

So, it would have to be:

START_LOGICAL_REPLICATION "1" 0/0 ("foo" 'bar blub frob', "sup" 'star', "noarg")

Now that's not completely obvious, I admit :/. Better suggestions?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-06-28 21:47:47 +0200, Andres Freund wrote:
> So, from what I gather there's a slight leaning towards *not* storing
> the relation's oid in the WAL. Which means the concerns about the
> uniqueness issues with the syscaches need to be addressed. So far I know
> of three solutions:
> 1) Develop a custom caching/mapping module.
> 2) Make sure InvalidOid's (the only possible duplicate) can't end up in
>    the syscache, by adding a hook that prevents that on the catcache level.
> 3) Make sure that there can't be any duplicates, by storing the oid of
>    the relation in a mapped relation's relfilenode.

So, here are 4 patches:

1) add RelationMapFilenodeToOid()
2) Add a pg_class index on (reltablespace, relfilenode)
3a) Add a custom cache that maps from filenode to oid
3b) Add a catcache 'filter' that ensures the cache stays unique, and use that for the mapping
4) Add pg_relation_by_filenode() and use it in a regression test

3b) adds an optional 'filter' attribute to struct cachedesc in syscache.c, which is then passed to catcache.c. If it exists, catcache.c uses it -- after checking for a match in the cache -- to check whether the queried-for value should possibly end up in the cache. If not, it stores a whiteout entry, as is currently already done for nonexistent entries.

It also reorders some catcache.h struct attributes to make sure we're not growing them. It might make sense to apply that independently; those are rather heavily used.

I slightly prefer 3b) because it's smaller. What are your opinions?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
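The filter idea in 3b) boils down to something like the following sketch; the function name and hookup are assumptions, not the patch's actual code. pg_class tuples whose relfilenode is InvalidOid -- i.e. mapped relations -- are rejected before they can enter the cache, so every key that can actually be cached is unique:

    /* hypothetical filter callback for the (reltablespace, relfilenode) cache */
    static bool
    RelfilenodeCacheFilter(HeapTuple tuple)
    {
        Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);

        /* mapped relations store InvalidOid here; keep them out of the cache */
        return OidIsValid(classForm->relfilenode);
    }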
Andres Freund <andres@2ndquadrant.com> writes:
> 3b) Add a catcache 'filter' that ensures the cache stays unique, and use
>    that for the mapping

> I slightly prefer 3b) because it's smaller. What are your opinions?

This is just another variation on the theme of kluging the catcache to do something it shouldn't. You're still building a catcache on a non-unique index, and that is going to lead to trouble. (I'm a bit surprised that there is no Assert in catcache.c checking that the index nominated to support a catcache is unique ...)

			regards, tom lane
On 2013-07-07 15:43:17 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > 3b) Add a catcache 'filter' that ensures the cache stays unique, and use
> >    that for the mapping
>
> > I slightly prefer 3b) because it's smaller. What are your opinions?
>
> This is just another variation on the theme of kluging the catcache to
> do something it shouldn't. You're still building a catcache on a
> non-unique index, and that is going to lead to trouble.

I don't think the lurking dangers really are present. The index essentially *is* unique, since we filter away anything non-unique. The catcache code can hardly be confused by tuples it never sees. That would even work if we started preloading catcaches by doing scans of the entire underlying relation, or by caching all of a page when reading one of its tuples.

I can definitely see that there are "aesthetical" reasons against doing 3b); that's why I've also done 3a). So I'll chalk you up as voting for that...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Sorry for the delay in reviewing this. I must make sure never to take another vacation during a commitfest -- the backlog upon return is a killer....

Kevin Grittner <kgrittn@ymail.com> wrote:
> Andres Freund <andres@2ndquadrant.com> wrote:
>> Otherwise, could you try applying my git tree so we are sure we
>> test the same thing?
>>
>> $ git remote add af git://git.postgresql.org/git/users/andresfreund/postgres.git
>> $ git fetch af
>> $ git checkout -b xlog-decoding af/xlog-decoding-rebasing-cf4
>> $ ./configure ...
>> $ make
>
> Tried that, too, and problem persists. The log shows the last
> commit on your branch as 022c2da1873de2fbc93ae524819932719ca41bdb.

The good news: the regression tests now work for me, and I'm back on testing this at a high level.

The bad news:

(1) The code checked out from that branch does not merge with master. Not surprisingly, given the recent commits, xlog.c is a problem. Is there another branch I should now be using? If not, please let me know when I can test with something that applies on top of the master branch.

(2) An initial performance test didn't look very good. I will be running a more controlled test to confirm, but the logical replication of a benchmark with a lot of UPDATEs of compressed text values seemed to suffer with logical replication turned on. Any suggestions or comments on that front, before I run the more controlled benchmarks?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-07-10 12:21:23 -0700, Kevin Grittner wrote:
> Sorry for the delay in reviewing this. I must make sure never to
> take another vacation during a commitfest -- the backlog upon
> return is a killer....

Heh. Yes. Been through it before...

> The good news: the regression tests now work for me, and I'm back
> on testing this at a high level.
>
> (1) The code checked out from that branch does not merge with
> master. Not surprisingly, given the recent commits, xlog.c is a
> problem. Is there another branch I should now be using? If not,
> please let me know when I can test with something that applies on
> top of the master branch.

That one is actually relatively easy to resolve. The mvcc catalog scan patch is slightly harder. I've pushed an updated patch that fixes the latter in a slightly not-so-nice way. I am not sure yet what the final fix for that is going to look like; it depends on whether we will get rid of SnapshotNow altogether... I'll push my local tree with that fixed in a sec.

> (2) An initial performance test didn't look very good. I will be
> running a more controlled test to confirm, but the logical
> replication of a benchmark with a lot of UPDATEs of compressed text
> values seemed to suffer with logical replication turned on.
> Any suggestions or comments on that front, before I run the more
> controlled benchmarks?

Hm. There theoretically shouldn't actually be anything added in that path. Could you roughly sketch what that test is doing? Do you actually stream those changes out, or did you just turn on wal_level=logical?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote:
> Kevin Grittner <kgrittn@ymail.com> wrote:
>> (2) An initial performance test didn't look very good. I will be
>> running a more controlled test to confirm, but the logical
>> replication of a benchmark with a lot of UPDATEs of compressed text
>> values seemed to suffer with logical replication turned on.
>> Any suggestions or comments on that front, before I run the more
>> controlled benchmarks?
>
> Hm. There theoretically shouldn't actually be anything added in that
> path. Could you roughly sketch what that test is doing? Do you actually
> stream those changes out, or did you just turn on wal_level=logical?

It was an update of every row in a table of 720000 rows, with each row updated by primary key using a separate UPDATE statement, modifying a large text column with a lot of repeating characters (so it compressed well). I got a timing on a master build, and I got a timing with the patch in the environment used by test_logical_decoding. It took several times as long in the latter run, but it was very much a preliminary test in preparation for getting real numbers. (I'm sure you know how much work it is to set up for a good run of tests.) I'm not sure that (for example) the synchronous_commit setting was the same, which could matter a lot. I wouldn't put a lot of stock in it until I can re-create it under a much more controlled test.

The one thing about the whole episode that gave me pause was that the compression and decompression routines were very high in the `perf top` output in the patched run and way down the list in the run based on master. I don't have a ready explanation for that, unless your branch was missing a recent commit for speeding up compression which was present on master. It might be worth checking that you're not detoasting more often than you need to.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
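A rough reconstruction of the benchmark described above; the table name, row width, and payload are assumptions:

    -- sketch only; names and sizes invented to match the description
    CREATE TABLE t (id serial PRIMARY KEY, payload text);
    INSERT INTO t (payload)
        SELECT repeat('abc', 2000)          -- repetitive, compresses well
        FROM generate_series(1, 720000);
    -- the test then issues one statement per row, e.g.:
    UPDATE t SET payload = repeat('xyz', 2000) WHERE id = 1;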
On 2013-07-10 15:14:58 -0700, Kevin Grittner wrote:
> It was an update of every row in a table of 720000 rows, with
> each row updated by primary key using a separate UPDATE statement,
> modifying a large text column with a lot of repeating characters
> (so it compressed well). [...] I wouldn't put a lot of stock in it
> until I can re-create it under a much more controlled test.

So you didn't explicitly start anything to consume those changes? I.e. using pg_receivellog or SELECT * FROM start/init_logical_replication(...)?

Any chance there still was an old replication slot around? SELECT * FROM pg_stat_logical_decoding; should show them. But theoretically the make check in test_logical_decoding should finish without one active...

> The one thing about the whole episode that gave me pause was that
> the compression and decompression routines were very high in the
> `perf top` output in the patched run and way down the list in the
> run based on master.

That's interesting. Unless there's something consuming the changestream, and the output plugin does something that actually requests decompression of the Datums, there shouldn't be *any* added/removed calls to toast (de-)compression... While consuming the changes there could be ReorderBufferToast* calls in the profile. I haven't yet seen them in profiles, but that's not saying all that much. So:

> I don't have a ready explanation for that, unless your branch was
> missing a recent commit for speeding up compression which was present
> on master.

It didn't have 031cc55bbea6b3a6b67c700498a78fb1d4399476 -- but I can't really imagine that making *such* a big difference. But maybe you hit some sweet spot with the data?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> wrote: > Any chance there still was an old replication slot around? It is quite likely that there was. -- Kevin Grittner EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Sun, Jul 7, 2013 at 4:34 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-07-07 15:43:17 -0400, Tom Lane wrote:
>> This is just another variation on the theme of kluging the catcache to
>> do something it shouldn't. You're still building a catcache on a
>> non-unique index, and that is going to lead to trouble.
>
> I don't think the lurking dangers really are present. The index
> essentially *is* unique, since we filter away anything non-unique. The
> catcache code can hardly be confused by tuples it never sees.
>
> I can definitely see that there are "aesthetical" reasons against doing
> 3b); that's why I've also done 3a). So I'll chalk you up as voting for
> that...

I also vote for (3a). I did a quick once-over of 1, 2, and 3a and they look reasonable. Barring strenuous objections, I'd like to go ahead and commit these, or perhaps an updated version of them.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Jul 16, 2013 at 9:00 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> I also vote for (3a). I did a quick once-over of 1, 2, and 3a and
> they look reasonable. Barring strenuous objections, I'd like to go
> ahead and commit these, or perhaps an updated version of them.

Hearing no objections, I have done this. Per off-list discussion with Andres, I also included patch 4, which gives us regression test coverage for this code, and I have fixed a few bugs and a bunch of stylistic things that bugged me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, Jun 14, 2013 at 6:51 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> The git tree is at:
> git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4
>
> On 2013-06-15 00:48:17 +0200, Andres Freund wrote:
>> Overview of the attached patches:
>> 0001: indirect toast tuples; required but submitted independently
>> 0002: functions for testing; not required
>> 0003: (tablespace, filenode) syscache; required
>> 0004: RelationMapFilenodeToOid: required, simple
>> 0005: pg_relation_by_filenode() function; not required but useful
>> 0006: Introduce InvalidCommandId: required, simple
>> 0007: Adjust Satisfies* interface: required, mechanical
>> 0008: Allow walsender to attach to a database: required, needs review
>> 0009: New GetOldestXmin() parameter; required, pretty boring
>> 0010: Log xl_running_xact regularly in the bgwriter: required
>> 0011: make fsync_fname() public; required, needs to be in a different file
>> 0012: Relcache support for a Relation's primary key: required
>> 0013: Actual changeset extraction; required
>> 0014: Output plugin demo; not required (except for testing) but useful
>> 0015: Add pg_receivellog program: not required but useful
>> 0016: Add test_logical_decoding extension; not required, but contains
>>       the tests for the feature. Uses 0014
>> 0017: Snapshot building docs; not required

I've now also committed patch #7 from this series. My earlier commit fulfilled the needs of patches #3, #4, and #5; and somewhat longer ago I committed #1. I am not entirely convinced of the necessity or desirability of patch #6, but as of now I haven't studied the issues closely. Patch #2 does not seem useful in isolation; it adds new regression-testing stuff but doesn't use it anywhere.

I doubt that any of the remaining patches (#8-#17) can be applied separately without understanding the shape of the whole patch set, so I think I, or someone else, will need to set aside more time for detailed review before proceeding further with this patch set. I suggest that we close out the CommitFest entry for this patch set one way or another, as there is no way we're going to get the whole thing done under the auspices of CF1.

I'll try to find some more time to spend on this relatively soon, but I think this is about as far as I can take this today.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-07-22 13:50:08 -0400, Robert Haas wrote:
> I've now also committed patch #7 from this series. My earlier commit
> fulfilled the needs of patches #3, #4, and #5; and somewhat longer ago
> I committed #1.

Thanks!

> I am not entirely convinced of the necessity or
> desirability of patch #6, but as of now I haven't studied the issues
> closely.

Fair enough. It's certainly possible to work around not having it, but it seems cleaner to introduce the notion of an invalid CommandId like we have for transaction ids et al. Allowing 2^32-2 instead of 2^32-1 subtransactions doesn't seem like a problem to me ;)

> Patch #2 does not seem useful in isolation; it adds new
> regression-testing stuff but doesn't use it anywhere.

Yes. I found it useful for testing stuff around making replication synchronous and such, but while I think we should have a facility like it in core for both logical and physical replication, I don't think this patch is ready for prime time due to its busy looping. I've even marked it as such above ;). My first idea for properly implementing that is to reuse the syncrep infrastructure, but that doesn't look trivial.

> I doubt that any of the remaining patches (#8-#17) can be applied
> separately without understanding the shape of the whole patch set, so
> I think I, or someone else, will need to set aside more time for
> detailed review before proceeding further with this patch set. I
> suggest that we close out the CommitFest entry for this patch set one
> way or another, as there is no way we're going to get the whole thing
> done under the auspices of CF1.

Generally agreed. The biggest chunk of the code is in #13 anyway... Some may be applicable independently:

> 0010: Log xl_running_xact regularly in the bgwriter: required

Should be useful independently, since it can significantly speed up the startup of physical replicas. On many systems checkpoint_timeout will be set to an hour, which can make the time until a standby gets consistent quite high, since that will be the first time it sees an xl_running_xacts record again.

> 0011: make fsync_fname() public; required, needs to be in a different file

Isn't in shape for it atm, but could be applied as an independent infrastructure patch. And it should be easy enough to clean it up.

> 0012: Relcache support for a Relation's primary key: required

Might actually be a good idea independently as well. E.g. the materialized key patch could use the information that there's a candidate key around to avoid a good bit of useless work.

> I'll try to find some more time to spend on this relatively soon, but
> I think this is about as far as I can take this today.

Was pretty helpful already, so ... ;)

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund wrote:
> The git tree is at:
> git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4
> http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4

I gave this recently rebased branch a skim. In general, the separation between decode.c/reorderbuffer.c/snapbuild.c seems a lot nicer now than on previous iterations -- good job there.

Here are some quick notes I took while reading the patch itself. I haven't gone through it really carefully, yet.

- I wonder whether DecodeCommit and DecodeAbort should really be a single routine. Right now, the former might call the latter; and the latter is aware of this. Seems awkward.

- We skip insert/update/delete if not my database Id; however, we don't skip commit in the same case. If there are two walrecvrs on a cluster, on different databases, does this lead to us trying to remove files twice, if a xact commits which deleted some files? Is this a problem? Should we try to skip such database-specific actions in global WAL records?

- There's rmgr-specific knowledge in decode.c. I wonder if, similar to redo and desc routines, that shouldn't instead be pluggable functions for each rmgr.

- What's with ReorderBufferRestoreCleanup()? Shouldn't it be in logical.c?

- reorderbuffer.c does several different things. Can it be split? Perhaps in pieces such as
  * stuff to manage memory (slab cache thingies)
  * TXN iterator
  * other logically separate parts?
  * the rest

- Having to expose LocalExecuteInvalidationMessage() looks awkward. Is there another way?

- I think we need a better name for "treat_as_catalog_table" (and RelationIsTreatedAsCatalogTable). Maybe replication_catalog or something similar?

- Don't do this:
  + * RecentGlobal(Data)?Xmin is initialized to InvalidTransactionId, to ensure that no
  because later greps for RecentGlobalDataXmin and RecentGlobalXmin will fail to find it. It seems better to spell out both names, so "RecentGlobalDataXmin and RecentGlobalXmin are initialized to ..."

- The pg_receivellog command line is strange. Apparently I need one or more of --start, --init, --stop, but if stop, then the other two must not be present; and if startpos, then init and stop cannot be specified. (There's a typo there that says "cannot combine with --start" when it really means "cannot combine with --stop", BTW.) I think it would make more sense to have init, start, stop be commands, in pg_ctl's spirit; so there would be no double-dash. IOW: SOMEPATH/pg_receivellog --startpos=123 start, and so on. Also, we need SGML docs for this new utility.

Any particular reason for removing this line?
-/* Get a new XLogReader */
+
 extern XLogReaderState *XLogReaderAllocate(XLogPageReadCB pagereadfunc, void *private_data);

Typo here (2n*d*Quadrant):
+= Snapshot Building =
+:author: Andres Freund, 2nQuadrant Ltd

I don't see the point of XLogRecordBuffer.record_data; we already have a pointer to the XLogRecord, and the data can readily be obtained using XLogRecGetData. So why provide the same thing twice? It seems to me that if instead of passing the XLogRecordBuffer we just provide the XLogRecord, and separately the "origptr" where needed, we could avoid having to expose the XLogRecordBuffer stuff unnecessarily.

In this comment:
+ * FIXME: We need something resembling the real SnapshotNow to handle things
+ * like enum lookups from indices correctly.
what do we need to consider in light of the new comment proposed by Robert in CA+TgmobvTjRj_doXxQ0wgA1a1JLYPVYqtR3m+Cou_ousabnmXg@mail.gmail.com ?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-08-27 11:32:30 -0400, Alvaro Herrera wrote: > Andres Freund wrote: > > The git tree is at: > > git://git.postgresql.org/git/users/andresfreund/postgres.git branch xlog-decoding-rebasing-cf4 > > http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/xlog-decoding-rebasing-cf4 > > I gave this recently rebased branch a skim. In general, the separation > between decode.c/reorderbuffer.c/snapbuild.c seems a lot nicer now than > on previous iterations -- good job there. Thanks for having a look! > Here are some quick notes I took while reading the patch itself. I > haven't gone through it really carefully, yet. > > > - I wonder whether DecodeCommit and DecodeAbort should really be a single > routine. Right now, the former might call the later; and the latter is > aware of this. Seems awkward. Yes, I am not happy with that either. I'll play with combining them and check whether that looks beter. > - We skip insert/update/delete if not my database Id; however, we don't skip > commit in the same case. If there are two walrecvrs on a cluster, on > different databases, does this lead to us trying to remove files > twice, if a xact commits which deleted some files? Is this a problem? > Should we try to skip such database-specific actions in global > WAL records? Hm. We should be able to skip it for long commit records at least. I think I lost that code along the unification. There's no danger of removing anything global afaics since we're not replaying using the original replay routines and all the slot/sender specific stuff has unique names. > - There's rmgr-specific knowledge in decode.c. I wonder if, similar to > redo and desc routines, that shouldn't instead be pluggable functions > for each rmgr. I don't think that's a good idea. I've quickly played with it before and it doesn't seem to end happy. It would require opening up more semi-public interfaces and in the end, we're only interested of in-core stuff. Even if it were possible to add new indexes by plugging new rmgrs, we wouldn't care. > - What's with ReorderBufferRestoreCleanup()? Shouldn't it be in logical.c? No, that's just for removing ondisk data at the end of a transaction. I'll improve the comment. > - reorderbuffer.c does several different things. Can it be split? > Perhaps in pieces such as > * stuff to manage memory (slab cache thingies) > * TXN iterator > * other logically separate parts? > * the rest Hm. I don't really see much point in splitting it along those lines. None of those really makes sense without the other parts and the file isn't *that* huge. > - Having to expose LocalExecuteInvalidationMessage() looks awkward. Is there > another way? Hm. I don't immediately see any way. We need to execute invalidation messages just within one backend. There just is no exposed functionality for that yet since it wasn't needed so far. We could expose something like LocalExecuteInvalidationMessage*s*() instead of doing the loop in reorderbuffer.c, but that's about it. > - I think we need a better name for "treat_as_catalog_table" (and > RelationIsTreatedAsCatalogTable). Maybe replication_catalog or > something similar? I think we're going to end up needing that for more than just replication, so I'd like to keep replication out of the name. I don't like the current name either though, so any other ideas? 
> - Don't do this: > + * RecentGlobal(Data)?Xmin is initialized to InvalidTransactionId, to ensure that no > because later greps for RecentGlobalDataXmin and RecentGlobalXmin will > fail to find it. It seems better to spell both names, so > "RecentGlobalDataXmin and RecentGlobalXmin are initialized to ..." Ok. > - the pg_receivellog command line is strange. Apparently I need one or > more of --start,--init,--stop, but if stop, then the other two must > not be present; and if startpos, then init and stop cannot be > specified. (There's a typo there that says "cannot combine with > --start" when it really means "cannot combine with --stop", BTW). I > think this would make more sense to have init,start,stop be commands, > in pg_ctl's spirit; so there would be no double-dash. IOW > SOMEPATH/pg_receivellog --startpos=123 start > and so on. The reasoning here is somewhat complex and I am not happy with the status quo, so I like getting input here. The individual verbs mean: * init: create a replication slot * start: continue streaming in an existing replication slot * stop: remove replication slot The reason you cannot specify anything with --stop is that a) --start streams until you abort the utility. So there's no chance of running --stop after it. b) --init and --stop seems like a pointless combination since you cannot actually do anything with the slot. --init and --start combined, on the other hand are useful for testing, which is why I allow them so far, but I wouldn't have problems removing that capability. The reason you cannot combine --init or --init --start with --startpos is that --startpos has to refer to a location that could have actually streamed to the client. Before a replication slot is established the client doesn't know anything about such an address, so --init --start cannot know any useful --startpos, that's why it's forbidden to pass one. The idea behind startpos is that you can tell the server "I have replayed transactions up to this LSN" and the server will only give you only transactions that have commited after this. > Also, we need SGML docs for this new utility. And a lot more than only for this utility :( > Any particular reason for removing this line: > -/* Get a new XLogReader */ > + > extern XLogReaderState *XLogReaderAllocate(XLogPageReadCB pagereadfunc, > void *private_data); Hrmpf. Merge error. I've integrated too many different versions of too different xlogreaders ;) > I don't see the point of XLogRecordBuffer.record_data; we already have a > pointer to the XLogRecord, and the data can readily be obtained using > XLogRecGetData. So why provide the same thing twice? It seems to me > that if instead of passing the XLogRecordBuffer we just provide the > XLogRecord, and separately the "origptr" where needed, we could avoid > having to expose the XLogRecordBuffer stuff unnecessarily. By now we also need the end location of a wal record. So we have to pass three addresses around for everything which isn't very convenient. If you vastly prefer passing around three parameters I can do that, but I'd rather not. The original reason for doing so was, to be honest, that my own xlogreader's API was different... > In this comment: > + * FIXME: We need something resembling the real SnapshotNow to handle things > + * like enum lookups from indices correctly. 
> what do we need to consider in light of the new comment proposed by
> Robert in
> CA+TgmobvTjRj_doXxQ0wgA1a1JLYPVYqtR3m+Cou_ousabnmXg@mail.gmail.com

I did most of the code changes for this, but it made me realize that
there are quite a few more comments and even a function name that need
adapting. Will work on that.

Thanks!

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
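A minimal sketch of the plural variant floated above (the plural name
and the array-based signature are hypothetical, proposed in the reply;
LocalExecuteInvalidationMessage() and SharedInvalidationMessage are the
existing function and type):

    #include "postgres.h"
    #include "storage/sinval.h"
    #include "utils/inval.h"

    /*
     * Hypothetical wrapper: execute an array of invalidation messages
     * within the current backend only, instead of open-coding the loop
     * in reorderbuffer.c.
     */
    void
    LocalExecuteInvalidationMessages(SharedInvalidationMessage *msgs, int nmsgs)
    {
        int i;

        for (i = 0; i < nmsgs; i++)
            LocalExecuteInvalidationMessage(&msgs[i]);
    }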
Hi,

I've attached a couple of the preliminary patches to $subject which I've
recently cleaned up, in the hope that we can continue improving on those
in a piecemeal fashion. I am preparing submission of a newer version of
the major patch, but unfortunately progress on that is slower than I'd
like...

In order of their chance of being applied individually, they are:

0005 wal_decoding: Log xl_running_xact's at a higher frequency than checkpoints are done
* benefits hot standby startup

0003 wal_decoding: Allow walsender's to connect to a specific database
* biggest problem is how to specify the connection we connect to.
  Currently with the patch walsender connects to a database if it's not
  named "replication" (via dbname). Perhaps it's better to invent a
  replication_dbname parameter?

0006 wal_decoding: copydir: move fsync_fname to fd.[c.h] and make it public
* Pretty trivial and boring.

0007 wal_decoding: Add information about a tables primary key to struct RelationData
* Could be used in the matview refresh code

0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
- 0002-wal_decoding-Introduce-InvalidCommandId-and-declare-.patch
- 0003-wal_decoding-Allow-walsender-s-to-connect-to-a-speci.patch
- 0005-wal_decoding-Log-xl_running_xact-s-at-a-higher-frequ.patch
- 0006-wal_decoding-copydir-move-fsync_fname-to-fd.-c.h-and.patch
- 0007-wal_decoding-Add-information-about-a-tables-primary-.patch
On Fri, Aug 30, 2013 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> 0005 wal_decoding: Log xl_running_xact's at a higher frequency than checkpoints are done
> * benefits hot standby startup

Review:

1. I think more comments are needed here to explain why we need this.
I don't know if the comments should go into the functions modified by
this patch or in some other location, but I don't find what's here now
adequate for understanding.

2. I think the variable naming could be better. If nothing else, I'd
spell out "snapshot" rather than abbreviating it to "snap". I'd also
add comments explaining what each of those variables does. And why
isn't log_snap_interval_ms a #define rather than a variable? (Don't
even talk to me about using gdb on a running instance. If you're even
thinking about that, this needs to be a GUC.)

3. Why does LogCurrentRunningXacts() need to call
XLogSetAsyncXactLSN()? Hopefully any WAL record is going to get sync'd
in a reasonably timely fashion; I can't see off-hand why this one
should need special handling.

> 0003 wal_decoding: Allow walsender's to connect to a specific database
> * biggest problem is how to specify the connection we connect to.
>   Currently with the patch walsender connects to a database if it's
>   not named "replication" (via dbname). Perhaps it's better to invent a
>   replication_dbname parameter?

I understand why logical replication needs to connect to a database,
but I don't understand why any other walsender would need to connect to
a database. Absent a clear use case for such a thing, I don't think we
should allow it. Ignorant suggestion: perhaps the database name could
be stored in the logical replication slot.

> 0006 wal_decoding: copydir: move fsync_fname to fd.[c.h] and make it public
> * Pretty trivial and boring.

Seems fine.

> 0007 wal_decoding: Add information about a tables primary key to struct RelationData
> * Could be used in the matview refresh code

I think you and Kevin should discuss whether this is actually the right
way to do this. ISTM that if logical replication and materialized views
end up selecting different approaches to this problem, everybody loses.

> 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement

I'm still unconvinced we want this.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
> On Fri, Aug 30, 2013 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > 0005 wal_decoding: Log xl_running_xact's at a higher frequency than checkpoints are done
> > * benefits hot standby startup
>
> Review:
>
> 1. I think more comments are needed here to explain why we need this.
> I don't know if the comments should go into the functions modified by
> this patch or in some other location, but I don't find what's here now
> adequate for understanding.

Hm. What information are you actually missing? I guess the
XLogSetAsyncXactLSN() needs a bit more context based on your question;
what else? Not sure if it makes sense to explain in detail why it helps
us to get into a consistent state faster?

> 2. I think the variable naming could be better. If nothing else, I'd
> spell out "snapshot" rather than abbreviating it to "snap". I'd also
> add comments explaining what each of those variables does.

Ok.

> And why isn't log_snap_interval_ms a #define rather than a variable?
> (Don't even talk to me about using gdb on a running instance. If
> you're even thinking about that, this needs to be a GUC.)

Ugh. It certainly doesn't have anything to do with wanting to change it
on a running system using gdb. Brrr. I think I wanted it to be a
constant variable but forgot the const. I personally prefer 'static
const' to #defines if it's legal C, but I guess the project's style
differs, so I'll change that.

> 3. Why does LogCurrentRunningXacts() need to call
> XLogSetAsyncXactLSN()? Hopefully any WAL record is going to get
> sync'd in a reasonably timely fashion; I can't see off-hand why this
> one should need special handling.

No, we don't force writing out WAL records in a timely fashion if
there's no pressure in wal_buffers; basically only commits and various
XLogFlush()es do. It doesn't make much of a difference if the entire
system is busy, but if it's not, the wal writer will sleep. The
alternative would be to XLogFlush() the record, but that would actually
block, which isn't really what we want/need.

> > 0003 wal_decoding: Allow walsender's to connect to a specific database
> > * biggest problem is how to specify the connection we connect to.
> >   Currently with the patch walsender connects to a database if it's
> >   not named "replication" (via dbname). Perhaps it's better to invent a
> >   replication_dbname parameter?

> I understand why logical replication needs to connect to a database,
> but I don't understand why any other walsender would need to connect
> to a database.

Well, logical replication actually streams out data using the
walsender, so that's the reason why I want to add it there. But there
have been cases in the past where we wanted to do stuff in the
walsender that needed database access, and we couldn't do so because
you cannot connect to one.

> Absent a clear use case for such a thing, I don't
> think we should allow it. Ignorant suggestion: perhaps the database
> name could be stored in the logical replication slot.

The problem is that you need to InitPostgres() with a database. You
cannot do that again after connecting with an empty database, which we
do in a plain walsender.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
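For concreteness, the two spellings at issue (a sketch; the spelled-out
name follows Robert's naming suggestion, and the 15000 ms value matches
the 15 s interval discussed below):

    /* Interval between xl_running_xacts records, in milliseconds. */
    #define LOG_SNAPSHOT_INTERVAL_MS 15000              /* usual project style */

    static const int log_snapshot_interval_ms = 15000;  /* 'static const' variant */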
On Tue, Sep 3, 2013 at 12:05 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> 1. I think more comments are needed here to explain why we need this.
>> I don't know if the comments should go into the functions modified by
>> this patch or in some other location, but I don't find what's here now
>> adequate for understanding.
>
> Hm. What information are you actually missing? I guess the
> XLogSetAsyncXactLSN() needs a bit more context based on your question;
> what else? Not sure if it makes sense to explain in detail why it
> helps us to get into a consistent state faster?

Well, we must have had some idea in mind when the original Hot Standby
patch went in that doing this once per checkpoint was good enough. Now
we think we need it every 15 seconds, but not more or less often. So,
why the change of heart?

To my way of thinking, it seems as though we ought to always begin
replay at a checkpoint, so the standby ought always to see one of these
records immediately. Obviously that's not good enough, but why not?
And why is every 15 seconds good enough?

>> 3. Why does LogCurrentRunningXacts() need to call
>> XLogSetAsyncXactLSN()? Hopefully any WAL record is going to get
>> sync'd in a reasonably timely fashion; I can't see off-hand why this
>> one should need special handling.
>
> No, we don't force writing out WAL records in a timely fashion if
> there's no pressure in wal_buffers; basically only commits and various
> XLogFlush()es do. It doesn't make much of a difference if the entire
> system is busy, but if it's not, the wal writer will sleep. The
> alternative would be to XLogFlush() the record, but that would actually
> block, which isn't really what we want/need.

The WAL writer is supposed to call XLogBackgroundFlush() every time
WalWriterDelay expires. Yeah, it can hibernate, but if it's
hibernating, then we should respect that decision for this WAL record
type also.

>> > 0003 wal_decoding: Allow walsender's to connect to a specific database
>> > * biggest problem is how to specify the connection we connect to.
>> >   Currently with the patch walsender connects to a database if it's
>> >   not named "replication" (via dbname). Perhaps it's better to invent a
>> >   replication_dbname parameter?
>
>> I understand why logical replication needs to connect to a database,
>> but I don't understand why any other walsender would need to connect
>> to a database.
>
> Well, logical replication actually streams out data using the
> walsender, so that's the reason why I want to add it there. But there
> have been cases in the past where we wanted to do stuff in the
> walsender that needed database access, and we couldn't do so because
> you cannot connect to one.

Could you be more specific?

>> Absent a clear use case for such a thing, I don't
>> think we should allow it. Ignorant suggestion: perhaps the database
>> name could be stored in the logical replication slot.
>
> The problem is that you need to InitPostgres() with a database. You
> cannot do that again after connecting with an empty database, which we
> do in a plain walsender.

Are you saying that the logical replication slot can't be read before
calling InitPostgres()?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-09-03 12:22:22 -0400, Robert Haas wrote:
> On Tue, Sep 3, 2013 at 12:05 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Well, we must have had some idea in mind when the original Hot Standby
> patch went in that doing this once per checkpoint was good enough.
> Now we think we need it every 15 seconds, but not more or less often.
> So, why the change of heart?

I think the primary reason for that was that it was a pretty complicated
patchset and we needed to start somewhere. By now we do have reports of
standbys taking their time to get consistent.

> To my way of thinking, it seems as though we ought to always begin
> replay at a checkpoint, so the standby ought always to see one of
> these records immediately. Obviously that's not good enough, but why
> not?

We always see one after the checkpoint (well, actually before the
checkpoint record, but ...), correct. The problem is just that reading a
single xl_running_xacts record doesn't automatically make you
consistent. If there's a single suboverflowed transaction running on the
primary when the xl_running_xacts is logged, we won't be able to switch
to consistent. Check procarray.c:ProcArrayApplyRecoveryInfo() for some
fun and some optimizations.

Since the only place where we currently have the information to
potentially become consistent is ProcArrayApplyRecoveryInfo(), we will
have to wait up to checkpoint_timeout until we get consistent. Which
sucks, as there are good arguments to set that to 1h. That especially
sucks as you lose consistency every time you restart the standby...

> And why is every 15 seconds good enough?

Waiting 15s to become consistent instead of checkpoint_timeout seems to
be ok to me and to be a good tradeoff between overhead and waiting. We
can certainly discuss other values or making it configurable. The latter
seemed to be unnecessary to me, but I don't have a problem implementing
it. I just don't want to document it :P

> >> 3. Why does LogCurrentRunningXacts() need to call
> >> XLogSetAsyncXactLSN()? Hopefully any WAL record is going to get
> >> sync'd in a reasonably timely fashion; I can't see off-hand why this
> >> one should need special handling.
> >
> > No, we don't force writing out WAL records in a timely fashion if
> > there's no pressure in wal_buffers; basically only commits and various
> > XLogFlush()es do. It doesn't make much of a difference if the entire
> > system is busy, but if it's not, the wal writer will sleep. The
> > alternative would be to XLogFlush() the record, but that would actually
> > block, which isn't really what we want/need.

> The WAL writer is supposed to call XLogBackgroundFlush() every time
> WalWriterDelay expires. Yeah, it can hibernate, but if it's
> hibernating, then we should respect that decision for this WAL record
> type also.

Why should we respect it? There is work to be done, and the wal writer
has no way of knowing that without us telling it. Normally we rely on
commit records and XLogFlush()es to trigger the wal writer.
Alternatively we can start a transaction and set synchronous_commit =
off, but that seems like a complication to me.

> >> I understand why logical replication needs to connect to a database,
> >> but I don't understand why any other walsender would need to connect
> >> to a database.
> >
> > Well, logical replication actually streams out data using the
> > walsender, so that's the reason why I want to add it there. But there
> > have been cases in the past where we wanted to do stuff in the
> > walsender that needed database access, and we couldn't do so because
> > you cannot connect to one.

> Could you be more specific?

I only remember 3959.1349384333@sss.pgh.pa.us but I think it has come up
before.

> >> Absent a clear use case for such a thing, I don't
> >> think we should allow it. Ignorant suggestion: perhaps the database
> >> name could be stored in the logical replication slot.
> >
> > The problem is that you need to InitPostgres() with a database. You
> > cannot do that again after connecting with an empty database, which we
> > do in a plain walsender.
>
> Are you saying that the logical replication slot can't be read before
> calling InitPostgres()?

The slot can be read just fine, but we won't know that we should do
that. Walsender accepts commands via PostgresMain()'s mainloop, which
has done an InitPostgres(dbname) before; we need to do that because we
need the environment it sets up. The database is stored in the slots
btw (as oid, not as a name though) ;)

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
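A minimal sketch of the nudge being defended above (LogStandbySnapshot()
and XLogSetAsyncXactLSN() are the existing functions with these
signatures; the wrapper and its name are hypothetical):

    #include "postgres.h"
    #include "access/xlog.h"
    #include "storage/standby.h"

    static void
    LogRunningXactsSnapshot(void)
    {
        XLogRecPtr recptr;

        /* Emit an xl_running_xacts record describing current activity. */
        recptr = LogStandbySnapshot();

        /*
         * Don't XLogFlush() here; that can block for a measurable time.
         * Instead mark the record's LSN as an async commit target, so the
         * wal writer writes it out even though the containing page may
         * never fill up.
         */
        XLogSetAsyncXactLSN(recptr);
    }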
On Tue, Sep 3, 2013 at 12:57 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> To my way of thinking, it seems as though we ought to always begin
>> replay at a checkpoint, so the standby ought always to see one of
>> these records immediately. Obviously that's not good enough, but why
>> not?
>
> We always see one after the checkpoint (well, actually before the
> checkpoint record, but ...), correct. The problem is just that reading a
> single xl_running_xacts record doesn't automatically make you
> consistent. If there's a single suboverflowed transaction running on the
> primary when the xl_running_xacts is logged, we won't be able to switch
> to consistent. Check procarray.c:ProcArrayApplyRecoveryInfo() for some
> fun and some optimizations.
> Since the only place where we currently have the information to
> potentially become consistent is ProcArrayApplyRecoveryInfo(), we will
> have to wait up to checkpoint_timeout until we get consistent. Which
> sucks, as there are good arguments to set that to 1h.
> That especially sucks as you lose consistency every time you restart the
> standby...

Right, OK.

>> And why is every 15 seconds good enough?
>
> Waiting 15s to become consistent instead of checkpoint_timeout seems to
> be ok to me and to be a good tradeoff between overhead and waiting. We
> can certainly discuss other values or making it configurable. The latter
> seemed to be unnecessary to me, but I don't have a problem implementing
> it. I just don't want to document it :P

I don't think it particularly needs to be configurable, but I wonder if
we can't be a bit smarter about when we do it. For example, suppose we
logged it every 15 s but only until we log a non-overflowed snapshot. I
realize that the overhead of a WAL record every 15 seconds is fairly
small, but the load on some systems is all but nonexistent. It would be
nice not to wake up the HD unnecessarily.

>> The WAL writer is supposed to call XLogBackgroundFlush() every time
>> WalWriterDelay expires. Yeah, it can hibernate, but if it's
>> hibernating, then we should respect that decision for this WAL record
>> type also.
>
> Why should we respect it?

Because I don't see any reason to believe that this WAL record is any
more important or urgent than any other WAL record that might get
logged.

>> >> I understand why logical replication needs to connect to a database,
>> >> but I don't understand why any other walsender would need to connect
>> >> to a database.
>> >
>> > Well, logical replication actually streams out data using the
>> > walsender, so that's the reason why I want to add it there. But there
>> > have been cases in the past where we wanted to do stuff in the
>> > walsender that needed database access, and we couldn't do so because
>> > you cannot connect to one.
>
>> Could you be more specific?
>
> I only remember 3959.1349384333@sss.pgh.pa.us but I think it has come up
> before.

It seems we need some more design there. Perhaps entering replication
mode could be triggered by writing either dbname=replication or
replication=yes. But then, do the replication commands simply become
SQL commands? I've certainly seen hackers use them that way. And I can
imagine that being a sensible approach, but this patch seems like it's
only covering a fairly small fraction of what really ought to be a
single commit.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-09-03 15:56:15 -0400, Robert Haas wrote:
> On Tue, Sep 3, 2013 at 12:57 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> And why is every 15 seconds good enough?
> >
> > Waiting 15s to become consistent instead of checkpoint_timeout seems to
> > be ok to me and to be a good tradeoff between overhead and waiting. We
> > can certainly discuss other values or making it configurable. The latter
> > seemed to be unnecessary to me, but I don't have a problem implementing
> > it. I just don't want to document it :P
>
> I don't think it particularly needs to be configurable, but I wonder
> if we can't be a bit smarter about when we do it. For example,
> suppose we logged it every 15 s but only until we log a non-overflowed
> snapshot.

There are actually more benefits than just overflowed snapshots (pruning
of the known xids machinery, exclusive lock cleanup).

> I realize that the overhead of a WAL record every 15
> seconds is fairly small, but the load on some systems is all but
> nonexistent. It would be nice not to wake up the HD unnecessarily.

The patch as-is only writes if there has been WAL written since the last
time it logged a running_xacts. I think it's not worth building more
smarts than that?

> >> The WAL writer is supposed to call XLogBackgroundFlush() every time
> >> WalWriterDelay expires. Yeah, it can hibernate, but if it's
> >> hibernating, then we should respect that decision for this WAL record
> >> type also.
> >
> > Why should we respect it?
>
> Because I don't see any reason to believe that this WAL record is any
> more important or urgent than any other WAL record that might get
> logged.

I can't follow the logic behind that statement. Just about all WAL
records are either pretty immediately flushed afterwards or are done in
the context of a transaction where we flush (or do an
XLogSetAsyncXactLSN) at transaction commit.

XLogBackgroundFlush() won't necessarily flush the running_xacts record.
Unless you've set the async xact lsn, it will only flush complete
blocks. So what can happen (I've seen it more than once in testing;
took me a while to debug) is that a checkpoint is started in a busy
period but nothing happens after it finishes. Since the
checkpoint-triggered running_xacts record is logged *before* we do the
smgr flush, it is still overflowed. Then, after activity has died down,
the bgwriter issues the running-xacts record, but it only partially
fills a block and thus never gets flushed.

To me the alternatives are to do an XLogSetAsyncXactLSN() or an
XLogFlush(). The latter is more aggressive and can block for a
measurable amount of time, which is why I don't want to do it in the
bgwriter.

> >> >> I understand why logical replication needs to connect to a database,
> >> >> but I don't understand why any other walsender would need to connect
> >> >> to a database.
> >> >
> >> > Well, logical replication actually streams out data using the
> >> > walsender, so that's the reason why I want to add it there. But there
> >> > have been cases in the past where we wanted to do stuff in the
> >> > walsender that needed database access, and we couldn't do so because
> >> > you cannot connect to one.
> >
> >> Could you be more specific?
> >
> > I only remember 3959.1349384333@sss.pgh.pa.us but I think it has come up
> > before.
>
> It seems we need some more design there. Perhaps entering replication
> mode could be triggered by writing either dbname=replication or
> replication=yes. But then, do the replication commands simply become
> SQL commands?
> I've certainly seen hackers use them that way. And I
> can imagine that being a sensible approach, but this patch seems like
> it's only covering a fairly small fraction of what really ought to be
> a single commit.

Yes. I think you're right that we need more input/design here. I've
previously started threads about it, but nobody replied :(.

The problem with using dbname=replication as a trigger for anything is
that we actually allow databases to be created with that name. Perhaps
that was a design mistake.

I wondered about turning replication from a boolean into something like
off|0, on|1, database. dbname= gets only used in the latter variant.
That would be compatible with previous versions and would even support
using old tools (since all of them seem to do replication=1).

> But then, do the replication commands simply become
> SQL commands? I've certainly seen hackers use them that way.

I don't think that it's a good way at this point to make them plain
SQL. There is more infrastructure (signal handlers, permissions,
different timeouts) & memory required for walsenders, so using plain
SQL there seems beyond the scope of this.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
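The write-only-if-new-WAL test Andres describes above might look roughly
like this in the bgwriter loop (a sketch reusing the interval constant
shown earlier; the two static bookkeeping variables are assumed, while
the functions called are the existing ones):

    #include "postgres.h"
    #include "access/xlog.h"
    #include "storage/standby.h"
    #include "utils/timestamp.h"

    static TimestampTz last_snapshot_ts;
    static XLogRecPtr  last_snapshot_lsn = InvalidXLogRecPtr;

    /* inside the bgwriter main loop (sketch): */
    {
        TimestampTz now = GetCurrentTimestamp();

        /*
         * Only log a new xl_running_xacts if the interval has elapsed and
         * something has actually been inserted into WAL since the last
         * one; otherwise an idle system stays idle.
         */
        if (TimestampDifferenceExceeds(last_snapshot_ts, now,
                                       LOG_SNAPSHOT_INTERVAL_MS) &&
            last_snapshot_lsn != GetXLogInsertRecPtr())
        {
            last_snapshot_lsn = LogStandbySnapshot();
            last_snapshot_ts = now;
        }
    }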
On Tue, Sep 3, 2013 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I don't think it particularly needs to be configurable, but I wonder
>> if we can't be a bit smarter about when we do it. For example,
>> suppose we logged it every 15 s but only until we log a non-overflowed
>> snapshot.
>
> There are actually more benefits than just overflowed snapshots (pruning
> of the known xids machinery, exclusive lock cleanup).

I know that, but I thought the master and slave could only lose sync on
those things after a master crash and that once per checkpoint cycle was
enough for those other benefits. Am I wrong?

> The patch as-is only writes if there has been WAL written since the last
> time it logged a running_xacts. I think it's not worth building more
> smarts than that?

Hmm, maybe.

>> Because I don't see any reason to believe that this WAL record is any
>> more important or urgent than any other WAL record that might get
>> logged.
>
> I can't follow the logic behind that statement. Just about all WAL
> records are either pretty immediately flushed afterwards or are done in
> the context of a transaction where we flush (or do an
> XLogSetAsyncXactLSN) at transaction commit.
>
> XLogBackgroundFlush() won't necessarily flush the running_xacts
> record.

OK, this was the key point I was missing.

>> It seems we need some more design there. Perhaps entering replication
>> mode could be triggered by writing either dbname=replication or
>> replication=yes. But then, do the replication commands simply become
>> SQL commands? I've certainly seen hackers use them that way. And I
>> can imagine that being a sensible approach, but this patch seems like
>> it's only covering a fairly small fraction of what really ought to be
>> a single commit.
>
> Yes. I think you're right that we need more input/design here. I've
> previously started threads about it, but nobody replied :(.
>
> The problem with using dbname=replication as a trigger for anything is
> that we actually allow databases to be created with that name. Perhaps
> that was a design mistake.

It seemed like a good idea at the time, but maybe it wasn't. I'm not
sure where to go with it at this point; a forcible backward
compatibility break would probably screw things up for a lot of people.

> I wondered about turning replication from a boolean into something like
> off|0, on|1, database. dbname= gets only used in the latter variant.
> That would be compatible with previous versions and would even support
> using old tools (since all of them seem to do replication=1).

I don't love that, but I don't hate it, either. But it still doesn't
answer the following question, which I think is important: if I (or
someone else) commits this patch, how will that make things better for
users? At the moment it's just adding a knob that doesn't do anything
for you when you twist it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2013-09-04 10:02:05 -0400, Robert Haas wrote:
> On Tue, Sep 3, 2013 at 7:10 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> I don't think it particularly needs to be configurable, but I wonder
> >> if we can't be a bit smarter about when we do it. For example,
> >> suppose we logged it every 15 s but only until we log a non-overflowed
> >> snapshot.
> >
> > There are actually more benefits than just overflowed snapshots (pruning
> > of the known xids machinery, exclusive lock cleanup).

> I know that, but I thought the master and slave could only lose sync
> on those things after a master crash and that once per checkpoint
> cycle was enough for those other benefits. Am I wrong?

The xid tracking can keep track without the additional records, but it
sometimes needs a good bit more memory to do so if the primary burns
through xids quite fast. Every time we see a running-xacts record we can
do cleanup (that's the ExpireOldKnownAssignedTransactionIds() in
ProcArrayApplyRecoveryInfo()).

> > The problem with using dbname=replication as a trigger for anything is
> > that we actually allow databases to be created with that name. Perhaps
> > that was a design mistake.
>
> It seemed like a good idea at the time, but maybe it wasn't. I'm not
> sure where to go with it at this point; a forcible backward
> compatibility break would probably screw things up for a lot of
> people.

Yes, breaking things now doesn't seem like a good idea.

> > I wondered about turning replication from a boolean into something like
> > off|0, on|1, database. dbname= gets only used in the latter variant.
> > That would be compatible with previous versions and would even support
> > using old tools (since all of them seem to do replication=1).
>
> I don't love that, but I don't hate it, either.

Ok. Will update the patch that way. Seems better than its current state.

> But it still doesn't
> answer the following question, which I think is important: if I (or
> someone else) commits this patch, how will that make things better for
> users? At the moment it's just adding a knob that doesn't do anything
> for you when you twist it.

I am not sure it's a good idea to commit it before we're sure we're
going to commit the changeset extraction. It's an independently
reviewable and testable piece of code that's simple enough to understand
quickly, in contrast to the large changeset extraction patch. That's why
I kept it separate. On the other hand, as you know, it's not without
precedent to commit pieces of infrastructure that aren't really useful
for the end user (think FDW).

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
> > 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement
>
> I'm still unconvinced we want this.

Ok, so the reason for the existence of this patch is that currently
there is no way to represent an "unset" CommandId. This is a problem for
the following patches because we need to log the cmin, cmax of catalog
rows, and obviously there can be rows where cmax is unset.

The reason I chose to change the definition of CommandIds is that the
other ondisk types we use, like TransactionIds, XLogRecPtrs and such,
have an "invalid" value; CommandIds don't. Changing their definition to
have 0 - analogous to the previous examples - as their invalid value is
not a problem because CommandIds from pg_upgraded clusters may never be
used for anything. Going from 2^32 to 2^32-1 possible CommandIds doesn't
seem like a problem to me. Imo the CommandIds should have been defined
that way from the start.

Makes some sense?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
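To illustrate where the invalid value is needed: the decoding patches
have to log per-catalog-row command ids roughly of this shape (the
struct name and layout here are hypothetical, for illustration only):

    /* Hypothetical shape of the logged cid information for a catalog row. */
    typedef struct LoggedTupleCids
    {
        CommandId   cmin;   /* command that inserted the row */
        CommandId   cmax;   /* command that deleted the row, or
                             * InvalidCommandId if it never was deleted */
    } LoggedTupleCids;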
On Wed, Sep 4, 2013 at 12:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
>> > 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement
>>
>> I'm still unconvinced we want this.
>
> Ok, so the reason for the existence of this patch is that currently
> there is no way to represent an "unset" CommandId. This is a problem for
> the following patches because we need to log the cmin, cmax of catalog
> rows, and obviously there can be rows where cmax is unset.

For heap tuples, we solve this problem by using flag bits. Why not
adopt the same approach?

> The reason I chose to change the definition of CommandIds is that the
> other ondisk types we use, like TransactionIds, XLogRecPtrs and such,
> have an "invalid" value; CommandIds don't. Changing their definition to
> have 0 - analogous to the previous examples - as their invalid value is
> not a problem because CommandIds from pg_upgraded clusters may never be
> used for anything. Going from 2^32 to 2^32-1 possible CommandIds doesn't
> seem like a problem to me. Imo the CommandIds should have been defined
> that way from the start.
>
> Makes some sense?

I don't have a problem with this if other people think it's a good
idea. But I think it needs a few +1s and not too many -1s first, and so
far (AFAIK) no one else has weighed in with an opinion.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hi,

On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
> On Fri, Aug 30, 2013 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > 0005 wal_decoding: Log xl_running_xact's at a higher frequency than checkpoints are done
> > * benefits hot standby startup

I tried to update the patch to address the comments you made.

> > 0003 wal_decoding: Allow walsender's to connect to a specific database
> > * biggest problem is how to specify the connection we connect to.
> >   Currently with the patch walsender connects to a database if it's
> >   not named "replication" (via dbname). Perhaps it's better to invent a
> >   replication_dbname parameter?

I've updated the patch so that the "replication" startup parameter can
specify not only a boolean but also "database". In the latter case the
walsender will connect to the database specified in "dbname".

As discussed downthread, this patch doesn't have an immediate advantage
for users until the changeset extraction patch itself is applied.
Whether or not it should be applied separately is unclear.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
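Under the updated patch, the connection-time choice would look roughly
like this (connection-string keywords as discussed above; the database
name is a placeholder):

    replication=1                      (physical walsender, as today)
    replication=database dbname=mydb   (walsender attached to database "mydb")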
Attachment
On 2013-09-05 12:44:18 -0400, Robert Haas wrote:
> On Wed, Sep 4, 2013 at 12:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
> >> > 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement
> >>
> >> I'm still unconvinced we want this.
> >
> > Ok, so the reason for the existence of this patch is that currently
> > there is no way to represent an "unset" CommandId. This is a problem for
> > the following patches because we need to log the cmin, cmax of catalog
> > rows, and obviously there can be rows where cmax is unset.
>
> For heap tuples, we solve this problem by using flag bits. Why not
> adopt the same approach?

We can, although it makes the amount of data stored/logged slightly
larger and seems to lead to less idiomatic code to me. If there's
another -1 I'll go that way.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Hi Kevin,

On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
> On Fri, Aug 30, 2013 at 11:19 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > 0007 wal_decoding: Add information about a tables primary key to struct RelationData
> > * Could be used in the matview refresh code

> I think you and Kevin should discuss whether this is actually the
> right way to do this. ISTM that if logical replication and
> materialized views end up selecting different approaches to this
> problem, everybody loses.

The patch we're discussing here adds a new struct RelationData field
called 'rd_primary' (should possibly be renamed) which contains
information about the "best" candidate key available for a table.

From the header comments:

    /*
     * The 'best' primary or candidate key that has been found, only set
     * correctly if RelationGetIndexList has been called/rd_indexvalid > 0.
     *
     * Indexes are chosen in the following order:
     * * Primary Key
     * * oid index
     * * the first (OID order) unique, immediate, non-partial and
     *   non-expression index over one or more NOT NULL'ed columns
     */
    Oid rd_primary;

I thought we could use that in matview.c:refresh_by_match_merge() to
select a more efficient diff if rd_primary has a valid index. In that
case you'd only need to compare that index's fields, which should result
in a more efficient plan.

Maybe it's also useful in other cases for you? If it's relevant at all,
would you like to have a different priority list than the one above?

Regards,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
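A sketch of how the matview refresh code might consult the proposed
field (rd_primary is the patch's field; the helper and its name are
hypothetical, and RelationGetIndexList()/list_free() are the existing
functions):

    #include "postgres.h"
    #include "nodes/pg_list.h"
    #include "utils/rel.h"
    #include "utils/relcache.h"

    /*
     * Hypothetical helper for refresh_by_match_merge(): return the 'best'
     * candidate key's index oid, or InvalidOid to fall back to comparing
     * whole rows.
     */
    static Oid
    matview_diff_index(Relation matviewRel)
    {
        /* rd_primary is only set correctly after RelationGetIndexList(). */
        list_free(RelationGetIndexList(matviewRel));

        if (OidIsValid(matviewRel->rd_primary))
            return matviewRel->rd_primary;  /* diff on this index's columns */

        return InvalidOid;                  /* compare all columns instead */
    }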
On Thu, Sep 5, 2013 at 12:59 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-09-05 12:44:18 -0400, Robert Haas wrote:
>> On Wed, Sep 4, 2013 at 12:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > On 2013-09-03 11:40:57 -0400, Robert Haas wrote:
>> >> > 0002 wal_decoding: Introduce InvalidCommandId and declare that to be the new maximum for CommandCounterIncrement
>> >>
>> >> I'm still unconvinced we want this.
>> >
>> > Ok, so the reason for the existence of this patch is that currently
>> > there is no way to represent an "unset" CommandId. This is a problem for
>> > the following patches because we need to log the cmin, cmax of catalog
>> > rows, and obviously there can be rows where cmax is unset.
>>
>> For heap tuples, we solve this problem by using flag bits. Why not
>> adopt the same approach?
>
> We can, although it makes the amount of data stored/logged slightly
> larger and seems to lead to less idiomatic code to me. If there's
> another -1 I'll go that way.

OK. Consider me more of a -0 than a -1. Like I say, I don't really want
to block it; I just don't feel comfortable committing it unless a few
other people say something like "I don't see a problem with that". Or
maybe point me to relevant changeset extraction code that's going to get
messier.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> OK. Consider me more of a -0 than a -1. Like I say, I don't really
> want to block it; I just don't feel comfortable committing it unless a
> few other people say something like "I don't see a problem with that".

FWIW, I've always thought it was a wart that there wasn't a recognized
InvalidCommandId value. It was never pressing to fix it before, but if
LCR needs it, let's do so. I definitely *don't* find it cleaner to eat
up another flag bit to avoid that. We don't have many to spare.

Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1, but
I suppose we can't have that without an on-disk compatibility break.

regards, tom lane
Hi,

Thanks for weighing in.

On 2013-09-05 14:21:33 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > OK. Consider me more of a -0 than a -1. Like I say, I don't really
> > want to block it; I just don't feel comfortable committing it unless a
> > few other people say something like "I don't see a problem with that".
>
> FWIW, I've always thought it was a wart that there wasn't a recognized
> InvalidCommandId value. It was never pressing to fix it before, but
> if LCR needs it, let's do so.

Yes, it's a bit anomalous compared to the other types.

> I definitely *don't* find it cleaner to
> eat up another flag bit to avoid that. We don't have many to spare.

It wouldn't need to be a flag bit in any existing struct, so that's not
a problem.

> Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1,
> but I suppose we can't have that without an on-disk compatibility break.

The patch actually does change it exactly that way. My argument for that
being valid is that CommandIds don't play any role outside of their own
transaction. Now, somebody could argue that SELECT cmin, cmax can be
done outside the transaction, but those values are already pretty much
meaningless today since cmin/cmax have been merged, and they also don't
check whether the field is initialized at all.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-09-05 14:21:33 -0400, Tom Lane wrote:
>> Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1,
>> but I suppose we can't have that without an on-disk compatibility break.

> The patch actually does change it exactly that way.

Oh. I hadn't looked at the patch, but I had (mis)read what Robert said
to think that you were proposing introducing InvalidCommandId =
0xFFFFFFFF while leaving FirstCommandId alone. That would make more
sense to me as (1) it doesn't change the interpretation of anything
that's (likely to be) on disk; (2) it allows the check for overflow in
CommandCounterIncrement to not involve recovering from an *actual*
overflow. With the horsing around we've been seeing from the gcc boys
lately, I don't have a warm feeling about whether they won't break that
test someday on the grounds that "overflow is undefined behavior".

regards, tom lane
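The check shape Tom is describing, roughly as it would look in
CommandCounterIncrement() (a sketch of the eventual code; the error text
matches the message quoted later in the thread):

    /* In CommandCounterIncrement(), sketched: */
    currentCommandId += 1;
    if (currentCommandId == InvalidCommandId)
    {
        currentCommandId -= 1;      /* undo the increment */
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("cannot have more than 2^32-2 commands in a transaction")));
    }

With InvalidCommandId at the top of the range, the test fires on
reaching the invalid value itself, so no actual unsigned wraparound is
involved.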
On Thu, Sep 5, 2013 at 11:30 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1,
>> but I suppose we can't have that without an on-disk compatibility break.
>
> The patch actually does change it exactly that way. My argument for that
> being valid is that CommandIds don't play any role outside of their own
> transaction.

Right. It seems like this should probably be noted in the documentation
under "5.4. System Columns" -- I just realized that it isn't.

--
Peter Geoghegan
On 2013-09-05 14:37:01 -0400, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > On 2013-09-05 14:21:33 -0400, Tom Lane wrote:
> >> Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1,
> >> but I suppose we can't have that without an on-disk compatibility break.
>
> > The patch actually does change it exactly that way.
>
> Oh. I hadn't looked at the patch, but I had (mis)read what Robert said
> to think that you were proposing introducing InvalidCommandId = 0xFFFFFFFF
> while leaving FirstCommandId alone. That would make more sense to me as
> (1) it doesn't change the interpretation of anything that's (likely to be)
> on disk; (2) it allows the check for overflow in CommandCounterIncrement
> to not involve recovering from an *actual* overflow. With the horsing
> around we've been seeing from the gcc boys lately

Ok, I can do it that way. LCR obviously shouldn't care.

> I don't have a warm
> feeling about whether they won't break that test someday on the grounds
> that "overflow is undefined behavior".

Unsigned overflow is pretty strictly defined, so I don't see much danger
there. Also, we'd feel the pain pretty definitely with xids...

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2013-09-05 21:02:44 +0200, Andres Freund wrote:
> On 2013-09-05 14:37:01 -0400, Tom Lane wrote:
> > Andres Freund <andres@2ndquadrant.com> writes:
> > > On 2013-09-05 14:21:33 -0400, Tom Lane wrote:
> > >> Ideally I'd have made InvalidCommandId = 0 and FirstCommandId = 1,
> > >> but I suppose we can't have that without an on-disk compatibility break.
> >
> > > The patch actually does change it exactly that way.
> >
> > Oh. I hadn't looked at the patch, but I had (mis)read what Robert said
> > to think that you were proposing introducing InvalidCommandId = 0xFFFFFFFF
> > while leaving FirstCommandId alone. That would make more sense to me as
> > (1) it doesn't change the interpretation of anything that's (likely to be)
> > on disk; (2) it allows the check for overflow in CommandCounterIncrement
> > to not involve recovering from an *actual* overflow. With the horsing
> > around we've been seeing from the gcc boys lately
>
> Ok, I can do it that way. LCR obviously shouldn't care.

It doesn't care to the point that the patch already does exactly what
you propose. It's just my memory that remembered things differently.

So, a very slightly updated patch attached.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment
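For reference, the shape the thread converged on (FirstCommandId left
alone, the invalid value at the top of the range, as Tom proposed):

    typedef uint32 CommandId;

    #define FirstCommandId      ((CommandId) 0)
    #define InvalidCommandId    (~(CommandId) 0)    /* i.e. 0xFFFFFFFF */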
On Thu, Sep 5, 2013 at 3:23 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > Oh. I hadn't looked at the patch, but I had (mis)read what Robert said
>> > to think that you were proposing introducing InvalidCommandId = 0xFFFFFFFF
>> > while leaving FirstCommandId alone. That would make more sense to me as
>> > (1) it doesn't change the interpretation of anything that's (likely to be)
>> > on disk; (2) it allows the check for overflow in CommandCounterIncrement
>> > to not involve recovering from an *actual* overflow. With the horsing
>> > around we've been seeing from the gcc boys lately
>>
>> Ok, I can do it that way. LCR obviously shouldn't care.
>
> It doesn't care to the point that the patch already does exactly what
> you propose. It's just my memory that remembered things differently.
>
> So, a very slightly updated patch attached.

Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Thu, Sep 5, 2013 at 3:23 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> So, a very slightly updated patch attached.

> Committed.

Hmm ... shouldn't this patch adjust the error messages in
CommandCounterIncrement? We just took away one possible command. It's
pretty nitpicky, especially since many utility commands do more than one
CommandCounterIncrement, but still ...

regards, tom lane
On 2013-09-09 18:43:51 -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Thu, Sep 5, 2013 at 3:23 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> >> So, a very slightly updated patch attached.
>
> > Committed.
>
> Hmm ... shouldn't this patch adjust the error messages in
> CommandCounterIncrement? We just took away one possible command.
> It's pretty nitpicky, especially since many utility commands do
> more than one CommandCounterIncrement, but still ...

Hm. You're talking about "cannot have more than 2^32-2 commands in a
transaction"? If so, the patch and the commit seem to have adjusted
that?

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-09-09 18:43:51 -0400, Tom Lane wrote:
>> Hmm ... shouldn't this patch adjust the error messages in
>> CommandCounterIncrement?

> Hm. You're talking about "cannot have more than 2^32-2 commands in a
> transaction"? If so, the patch and the commit seem to have adjusted that?

Oh! That's what I get for going on memory instead of re-reading the
commit. Sorry, never mind the noise.

regards, tom lane
Andres Freund <andres@2ndquadrant.com> wrote:
> Robert Haas wrote:
>> Andres Freund <andres@2ndquadrant.com> wrote:
>>> 0007 wal_decoding: Add information about a tables primary key to
>>> struct RelationData
>>> * Could be used in the matview refresh code
>
>> I think you and Kevin should discuss whether this is actually the
>> right way to do this. ISTM that if logical replication and
>> materialized views end up selecting different approaches to this
>> problem, everybody loses.
>
> The patch we're discussing here adds a new struct RelationData field
> called 'rd_primary' (should possibly be renamed) which contains
> information about the "best" candidate key available for a table.
>
> From the header comments:
>
>     /*
>      * The 'best' primary or candidate key that has been found, only set
>      * correctly if RelationGetIndexList has been called/rd_indexvalid > 0.
>      *
>      * Indexes are chosen in the following order:
>      * * Primary Key
>      * * oid index
>      * * the first (OID order) unique, immediate, non-partial and
>      *   non-expression index over one or more NOT NULL'ed columns
>      */
>     Oid rd_primary;
>
> I thought we could use that in matview.c:refresh_by_match_merge() to
> select a more efficient diff if rd_primary has a valid index. In that
> case you'd only need to compare that index's fields, which should
> result in a more efficient plan.
>
> Maybe it's also useful in other cases for you?
>
> If it's relevant at all, would you like to have a different priority
> list than the one above?

My first thought was that it was necessary to use all unique, immediate,
non-partial, non-expression indexes to avoid getting errors on the
UPDATE phase of the concurrent refresh for transient duplicates; but
then I remembered that I had to give up on that and do it all with
DELETE followed by INSERT, which eliminates that risk. As things now
stand, the *existence* of any unique, non-partial, non-expression index
(note that immediate is not needed) is sufficient for correctness. We
could now even drop that, I think, if we added a duplicate check at the
end in the absence of such an index.

The reason I left it comparing columns from *all* such indexes is that
it gives the optimizer the chance to pick the one that looks fastest.
The upcoming patch can add some extra "equality" comparisons in addition
to the "identical" comparisons the current patch uses, so the mechanism
you propose might be a worthwhile optimization for some cases, as long
as it does a good job of picking *the fastest* such index. The above
method of choosing an index doesn't seem to necessarily ensure that.

Also, if you need to include the "immediate" test, it could not be used
for RMVC without "fallback" code for when this mechanism doesn't find an
appropriate index. Of course, that would satisfy those who would like to
relax the requirement for a unique index on the MV to be able to use
RMVC.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company