Thread: Should we automatically run duplicate_oids?

Should we automatically run duplicate_oids?

From

Peter Geoghegan

Date:

09 July 2013, 01:25:58

When rebasing a patch that I'm working on, I occasionally forget to
update the oid of any pg_proc.h entries I may have created. Of course
this isn't a real problem; when I go to initdb, I immediately
recognize what has happened. All the same, it seems like there is a
case to be made for having this run automatically at build time, and
having the build fail on the basis of there being a duplicate - this
is something that fails reliably, but only when someone has added
another pg_proc.h entry, and only when that other person happened to
choose an oid in a range of free-in-git-tip oids that I myself
fancied.

Sure, I ought to remember to check this anyway, but it seems
preferable to make this process more mechanical. I can point to commit
55c1687a as a kind of precedent, where the process of running
check_keywords.pl was made to run automatically any time gram.c is
rebuilt. Granted, that's a more subtle problem than the one I'm
proposing to solve, but I still see this as a modest improvement.

-- 
Peter Geoghegan

Re: Should we automatically run duplicate_oids?

From

Tom Lane

Date:

09 July 2013, 01:33:39

Peter Geoghegan <pg@heroku.com> writes:
> ... All the same, it seems like there is a
> case to be made for having this run automatically at build time, and
> having the build fail on the basis of there being a duplicate

This would require a rather higher standard of portability than
duplicate_oids has heretofore been held to.

> Sure, I ought to remember to check this anyway, but it seems
> preferable to make this process more mechanical. I can point to commit
> 55c1687a as a kind of precedent, where the process of running
> check_keywords.pl was made to run automatically any time gram.c is
> rebuilt.

Note that "any time gram.c is rebuilt" is not "every build".  In fact,
for non-developers (tarball consumers), it's not *any* build.

I'm not necessarily opposed to making this happen, but there's more
legwork involved than just adding a call of the existing script in some
random place in the makefiles.
        regards, tom lane

Re: Should we automatically run duplicate_oids?

From

Peter Geoghegan

Date:

09 July 2013, 02:38:19

On Mon, Jul 8, 2013 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> This would require a rather higher standard of portability than
> duplicate_oids has heretofore been held to.

Ah, yes. I suppose that making this happen would necessitate rewriting
the script in highly portable Perl. Unfortunately, I'm not a likely
candidate for that task.

-- 
Peter Geoghegan

Re: Should we automatically run duplicate_oids?

From

Peter Eisentraut

Date:

09 July 2013, 02:59:26

On Mon, 2013-07-08 at 19:38 -0700, Peter Geoghegan wrote:
> On Mon, Jul 8, 2013 at 6:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > This would require a rather higher standard of portability than
> > duplicate_oids has heretofore been held to.
> 
> Ah, yes. I suppose that making this happen would necessitate rewriting
> the script in highly portable Perl. Unfortunately, I'm not a likely
> candidate for that task.

I don't think rewriting it in Perl is necessary or even desirable.  I
don't see anything particularly unportable in that script as it is.
(Hmm, perhaps egrep should be replaced by grep.)

Re: Should we automatically run duplicate_oids?

From

Peter Geoghegan

Date:

09 July 2013, 03:03:25

On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> I don't think rewriting it in Perl is necessary or even desirable.  I
> don't see anything particularly unportable in that script as it is.

I was under the impression that the final patch ought to work on
Windows too. However, I suppose that since the number of people that
use windows as an everyday development machine is probably zero, we
could reasonably forgo doing anything on that platform.

-- 
Peter Geoghegan

Re: Should we automatically run duplicate_oids?

From

Craig Ringer

Date:

09 July 2013, 04:33:01

On 07/09/2013 11:03 AM, Peter Geoghegan wrote:
> On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> I don't think rewriting it in Perl is necessary or even desirable.  I
>> don't see anything particularly unportable in that script as it is.
> 
> I was under the impression that the final patch ought to work on
> Windows too. However, I suppose that since the number of people that
> use windows as an everyday development machine is probably zero, we
> could reasonably forgo doing anything on that platform.

I'd say that the number who use Windows *and the unix-like build chain*
day to day is going to be zero.

If you're using the Visual Studio based toolchain with vcbuild.pl,
vcregress.pl, etc you're not going to notice or care about this change.

-- Craig Ringer                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services

Re: Should we automatically run duplicate_oids?

From

Atri Sharma

Date:

09 July 2013, 04:40:51

On Tue, Jul 9, 2013 at 6:55 AM, Peter Geoghegan <pg@heroku.com> wrote:
> When rebasing a patch that I'm working on, I occasionally forget to
> update the oid of any pg_proc.h entries I may have created. Of course
> this isn't a real problem; when I go to initdb, I immediately
> recognize what has happened. All the same, it seems like there is a
> case to be made for having this run automatically at build time, and
> having the build fail on the basis of there being a duplicate - this
> is something that fails reliably, but only when someone has added
> another pg_proc.h entry, and only when that other person happened to
> choose an oid in a range of free-in-git-tip oids that I myself
> fancied.
>
> Sure, I ought to remember to check this anyway, but it seems
> preferable to make this process more mechanical. I can point to commit
> 55c1687a as a kind of precedent, where the process of running
> check_keywords.pl was made to run automatically any time gram.c is
> rebuilt. Granted, that's a more subtle problem than the one I'm
> proposing to solve, but I still see this as a modest improvement.

I completely agree with the idea. Doing these checks early in the
build chain would be much more helpful than seeing the logs when
initdb fails.

Regards,

Atri

--
Regards,

Atri
l'apprenant

Re: Should we automatically run duplicate_oids?

From

Andrew Dunstan

Date:

09 July 2013, 14:40:47

On 07/08/2013 11:03 PM, Peter Geoghegan wrote:
> On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> I don't think rewriting it in Perl is necessary or even desirable.  I
>> don't see anything particularly unportable in that script as it is.
> I was under the impression that the final patch ought to work on
> Windows too. However, I suppose that since the number of people that
> use windows as an everyday development machine is probably zero, we
> could reasonably forgo doing anything on that platform.
>
>


Why the heck should we? To my certain knowledge there are people using 
Windows as a development platform for PostgreSQL code, albeit not core 
code. If we ever want to get them involved in writing core code we need 
to treat them as first class citizens.

This is actually a pretty trivial task. Here is a simple perl version:


   use strict;
   my @files = (qw( toasting.h indexing.h), glob("pg_*.h"));
   my $handle;   my @lines;   foreach my $file (@files)   {        my $handle;        open($handle,$file) || die "$!";
     my @flines = <$handle>;        close($handle);        chomp @flines;        push(@lines, @flines);   }
 
   my %oidcounts;
   foreach (@lines)   {        next if /^CATALOG\(.*BKI_BOOTSTRAP/;        next unless          /^DATA\(insert *OID *=
*([0-9][0-9]*).*$/||          /^CATALOG\([^,]*,   *([0-9][0-9]*).*BKI_ROWTYPE_OID\(([0-9][0-9]*)\).*$/ ||
/^CATALOG\([^,]*,*([0-9][0-9]*).*$/ ||          /^DECLARE_INDEX\([^,]*, *([0-9][0-9]*).*$/ ||
/^DECLARE_UNIQUE_INDEX\([^,]*,*([0-9][0-9]*).*$/ ||          /^DECLARE_TOAST\([^,]*, *([0-9][0-9]*),
*([0-9][0-9]*).*$/;       $oidcounts{$1}++;        $oidcounts{$2}++ if $2;   }
 
   my $found = 0;
   foreach my $oid (sort {$a <=> $b} keys %oidcounts)   {        next unless $oidcounts{$oid} > 1;        $found = 1;
    print "$oid\n";   }
 
   exit $found;


cheers

andrew

Re: Should we automatically run duplicate_oids?

From

Andrew Dunstan

Date:

09 July 2013, 16:16:36

On 07/09/2013 10:40 AM, Andrew Dunstan wrote:
>
> On 07/08/2013 11:03 PM, Peter Geoghegan wrote:
>> On Mon, Jul 8, 2013 at 7:59 PM, Peter Eisentraut <peter_e@gmx.net> 
>> wrote:
>>> I don't think rewriting it in Perl is necessary or even desirable.  I
>>> don't see anything particularly unportable in that script as it is.
>> I was under the impression that the final patch ought to work on
>> Windows too. However, I suppose that since the number of people that
>> use windows as an everyday development machine is probably zero, we
>> could reasonably forgo doing anything on that platform.
>>
>>
>
>
> Why the heck should we? To my certain knowledge there are people using 
> Windows as a development platform for PostgreSQL code, albeit not core 
> code. If we ever want to get them involved in writing core code we 
> need to treat them as first class citizens.
>
> This is actually a pretty trivial task. Here is a simple perl version:


Slightly cleaner (and shorter) version:
   use strict;
   BEGIN   {        my @files = (qw( toasting.h indexing.h), glob("pg_*.h"));
        @ARGV = @files;   }
   my %oidcounts;
   while(<>)   {        next if /^CATALOG\(.*BKI_BOOTSTRAP/;        next unless          /^DATA\(insert *OID *= *(\d+)/
||         /^CATALOG\([^,]*, *(\d+).*BKI_ROWTYPE_OID\((\d+)\)/ ||          /^CATALOG\([^,]*, *(\d+)/ ||
/^DECLARE_INDEX\([^,]*,*(\d+)/ ||          /^DECLARE_UNIQUE_INDEX\([^,]*, *(\d+)/ ||          /^DECLARE_TOAST\([^,]*,
*(\d+),*(\d+)/;        $oidcounts{$1}++;        $oidcounts{$2}++ if $2;   }
 
   my $found = 0;
   foreach my $oid (sort {$a <=> $b} keys %oidcounts)   {        next unless $oidcounts{$oid} > 1;        $found = 1;
    print "$oid\n";   }
 
   exit $found;

cheers

andrew

Re: Should we automatically run duplicate_oids?

From

Peter Eisentraut

Date:

09 July 2013, 17:55:49

On 7/9/13 10:40 AM, Andrew Dunstan wrote:
> This is actually a pretty trivial task. Here is a simple perl version:

But that would make Perl a hard build requirement.

Re: Should we automatically run duplicate_oids?

From

Andrew Dunstan

Date:

09 July 2013, 18:13:36

On 07/09/2013 01:55 PM, Peter Eisentraut wrote:
> On 7/9/13 10:40 AM, Andrew Dunstan wrote:
>> This is actually a pretty trivial task. Here is a simple perl version:
> But that would make Perl a hard build requirement.
>

Well, it already is unless you're not building from git AND you're not 
building with MSVC AND you're not running a buildfarm animal (and maybe 
a few other conditions too).

In any case, we could make it conditional on $(PERL) not being empty, 
couldn't we?

cheers

andrew

Re: Should we automatically run duplicate_oids?

From

Tom Lane

Date:

09 July 2013, 18:22:11

Peter Eisentraut <peter_e@gmx.net> writes:
> On 7/9/13 10:40 AM, Andrew Dunstan wrote:
>> This is actually a pretty trivial task. Here is a simple perl version:

> But that would make Perl a hard build requirement.

My opinion is that this script should only run if you've changed some
catalog .h files.  Requiring Perl in those cases is no worse than what
we already require it for (cf the keyword crosscheck script mentioned
upthread).
        regards, tom lane

Re: Should we automatically run duplicate_oids?

From

Bruce Momjian

Date:

02 August 2013, 05:00:55

On Mon, Jul  8, 2013 at 06:25:44PM -0700, Peter Geoghegan wrote:
> When rebasing a patch that I'm working on, I occasionally forget to
> update the oid of any pg_proc.h entries I may have created. Of course
> this isn't a real problem; when I go to initdb, I immediately
> recognize what has happened. All the same, it seems like there is a
> case to be made for having this run automatically at build time, and
> having the build fail on the basis of there being a duplicate - this
> is something that fails reliably, but only when someone has added
> another pg_proc.h entry, and only when that other person happened to
> choose an oid in a range of free-in-git-tip oids that I myself
> fancied.
>
> Sure, I ought to remember to check this anyway, but it seems
> preferable to make this process more mechanical. I can point to commit
> 55c1687a as a kind of precedent, where the process of running
> check_keywords.pl was made to run automatically any time gram.c is
> rebuilt. Granted, that's a more subtle problem than the one I'm
> proposing to solve, but I still see this as a modest improvement.

FYI, attached is the pgtest script I always run before I do a commit;
it also calls src/tools/pgtest.  It has saved me from erroneous commits
many times.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

Attachment

pgtest

Re: Should we automatically run duplicate_oids?

From

Atri Sharma

Date:

02 August 2013, 14:16:12


Sent from my iPad

On 02-Aug-2013, at 10:30, Bruce Momjian <bruce@momjian.us> wrote:

> On Mon, Jul  8, 2013 at 06:25:44PM -0700, Peter Geoghegan wrote:
>> When rebasing a patch that I'm working on, I occasionally forget to
>> update the oid of any pg_proc.h entries I may have created. Of course
>> this isn't a real problem; when I go to initdb, I immediately
>> recognize what has happened. All the same, it seems like there is a
>> case to be made for having this run automatically at build time, and
>> having the build fail on the basis of there being a duplicate - this
>> is something that fails reliably, but only when someone has added
>> another pg_proc.h entry, and only when that other person happened to
>> choose an oid in a range of free-in-git-tip oids that I myself
>> fancied.
>>
>> Sure, I ought to remember to check this anyway, but it seems
>> preferable to make this process more mechanical. I can point to commit
>> 55c1687a as a kind of precedent, where the process of running
>> check_keywords.pl was made to run automatically any time gram.c is
>> rebuilt. Granted, that's a more subtle problem than the one I'm
>> proposing to solve, but I still see this as a modest improvement.
>
> FYI, attached is the pgtest script I always run before I do a commit;
> it also calls src/tools/pgtest.  It has saved me from erroneous commits
> many times.
>
+1,a much needed thing.Duplicate oids is a pain enough to deserve its own solution.

Regards,

Atri