Thread: Removing derived files from CVS
I have been looking into what it would take to remove derived files
from the CVS repository, and it doesn't look bad at all. I propose
we do so before 6.5 beta.

In case anyone's forgotten: the issue is derived files, such as gram.c,
which we currently keep in the CVS repository even though they are not
master source files. Doing so causes a number of headaches, including
wasted time to check in and check out updates to both master and derived
files, unreasonable bulk of the CVS files for these derived files,
errors due to timestamp skew (after checking out, it can look like you
have an up-to-date derived file when you do not), etc etc.

The only reason for keeping these files in CVS is so that users who
obtain the source distribution don't have to have tools that can rebuild
these files. But there's a better way to handle that: generate the
derived files while preparing tarballs. That way we can remove the
derived files from CVS. We'll also eliminate the other time skew
problem that's been seen in more than one past release tarball: the
derived files will be certain to have newer timestamps than their
masters in the tarballs.

The most reliable way to do this is just to have a script that does
    configure
    "make" all the derived files
    make distclean
and invoke this script as part of the tarball generation procedure.
Configuring in order to find out which yacc and lex to use may seem
a tad expensive ;-) but this way will work, whereas taking shortcuts
would have a tendency to break. Doing the make distclean also ensures
that the tarball will not contain any extraneous files, which seems
like a good idea. I have just tested this procedure and determined
that it takes less than 2 minutes on hub.org, which seems well within
the realm of acceptability for a nightly batch job.

So, a few questions for the list:

1. Does anyone object to removing these files from the CVS repository
   and handling them as above:
    src/backend/parser/gram.c
    src/backend/parser/parse.h
    src/backend/parser/scan.c
    src/interfaces/ecpg/preproc/preproc.c
    src/interfaces/ecpg/preproc/preproc.h
    src/interfaces/ecpg/preproc/pgc.c

2. Should we also handle src/configure this way? That would mean that
   people who obtain the code straight from CVS would have to have
   autoconf installed. It's probably a good idea but I'm not certain.

3. src/pl/plpgsql/src/ also contains yacc and lex output files that are
   checked into CVS. We definitely should remove them from CVS, but
   should we leave them to be generated by recipients of the
   distribution, or should we handle them like the big grammar files?
   I don't think they are big enough to break anyone's yacc, but...

4. Currently, a recipient must have at least minimally working yacc/lex
   capability anyway, because the bootstrap files in
   src/backend/bootstrap/ are not pre-built in the distribution.
   If we used the same procedure for the bootstrap and plpgsql files as
   for the bigger parsers, then it would be possible to build Postgres
   without a local yacc or lex. Is this worth doing, or would it just
   bloat the distribution to no purpose? As far as I know we have not
   gotten complaints about the need for yacc/lex for these files; it's
   only that the parser and ecpg grammars are too big for some vendor
   versions...

            regards, tom lane
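As a rough illustration of the three steps listed in the message above (configure, make the derived files, make distclean), a release-prep script might look something like the sketch below. This is not the script from the thread; the function names, the `DRY_RUN` switch, the use of GNU make's `-C` option, and the make target names are all assumptions made for the example. Only the directories come from the file list in question 1.

```shell
#!/bin/sh
# Hypothetical sketch of a release-prep step. Target names are assumed,
# and "make distclean" is assumed to leave the derived files in place.

# run CMD...: execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# release_prep_sketch SRCDIR: SRCDIR is the "src" directory of a checkout.
release_prep_sketch() {
    srcdir=$1
    # Configure first, so that make knows which yacc and lex to use.
    (cd "$srcdir" && run ./configure) || return 1
    # Build only the derived files, not the whole tree.
    run make -C "$srcdir/backend/parser" gram.c parse.h scan.c || return 1
    run make -C "$srcdir/interfaces/ecpg/preproc" preproc.c preproc.h pgc.c || return 1
    # Remove everything else that configure and make left behind.
    run make -C "$srcdir" distclean || return 1
}
```

Calling `DRY_RUN=1 release_prep_sketch src` from the top of a checkout would only print the four commands, which is a cheap way to review the sequence before letting a nightly job run it for real.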
> I have been looking into what it would take to remove derived files
> from the CVS repository, and it doesn't look bad at all. I propose
> we do so before 6.5 beta.
>
> In case anyone's forgotten: the issue is derived files, such as gram.c,
> which we currently keep in the CVS repository even though they are not
> master source files. Doing so causes a number of headaches, including
> wasted time to check in and check out updates to both master and derived
> files, unreasonable bulk of the CVS files for these derived files,
> errors due to timestamp skew (after checking out, it can look like you
> have an up-to-date derived file when you do not), etc etc.

We have not been able to reliably make releases with the proper
timestamps on gram.c, which is critical for end-users, so any change
that will make this gram.c more automatic is welcomed by me.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Tom Lane wrote:
>
> I have been looking into what it would take to remove derived files
> from the CVS repository, and it doesn't look bad at all. I propose
> we do so before 6.5 beta.

Sure, as long as it is clear what additional tools are needed, what are
their versions, and where do I get them for common platforms.

:) Clark
On Thu, 18 Mar 1999, Bruce Momjian wrote:

> > I have been looking into what it would take to remove derived files
> > from the CVS repository, and it doesn't look bad at all. I propose
> > we do so before 6.5 beta.
> >
> > In case anyone's forgotten: the issue is derived files, such as gram.c,
> > which we currently keep in the CVS repository even though they are not
> > master source files. Doing so causes a number of headaches, including
> > wasted time to check in and check out updates to both master and derived
> > files, unreasonable bulk of the CVS files for these derived files,
> > errors due to timestamp skew (after checking out, it can look like you
> > have an up-to-date derived file when you do not), etc etc.
>
> We have not been able to reliably make releases with the proper
> timestamps on gram.c, which is critical for end-users, so any change
> that will make this gram.c more automatic is welcomed by me.

Agreed here too...someone at one point mentioned that there might be a
way, inside of CVS, to have it auto-generate these files as it's being
checked out (i.e. if file is configure.in, run autoconf)...

I just scanned through the cvs info file, and couldn't find
anything...anyone know about something like this?

Marc G. Fournier                ICQ#7615664        IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org    secondary: scrappy@{freebsd|postgresql}.org
Then <scrappy@hub.org> spoke up and said:
> Agreed here too...someone at one point mentioned that there might be a
> way, inside of CVS, to have it auto-generate these files as it's being
> checked out (i.e. if file is configure.in, run autoconf)...

From the info file:

Module options
--------------

   Either regular modules or ampersand modules can contain options,
which supply additional information concerning the module.
[snip]
`-i PROG'
     Specify a program PROG to run whenever files in a module are
     committed. PROG runs with a single argument, the full pathname of
     the affected directory in a source repository. The `commitinfo',
     `loginfo', and `verifymsg' files provide other ways to call a
     program on commit.

`-o PROG'
     Specify a program PROG to run whenever files in a module are
     checked out. PROG runs with a single argument, the module name.

From my reading, it looks like the easiest thing to do is set up
commit rules such that committing gram.y automatically generates
gram.c. It looks like it might be difficult to have gram.c generated
completely "on the fly" and then passed to the CVS client.

--
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================
Clark Evans <clark.evans@manhattanproject.com> writes:
> Tom Lane wrote:
>> I have been looking into what it would take to remove derived files

> Sure, as long as it is clear what additional tools are needed, what are
> their versions, and where do I get them for common platforms.

You already need yacc (or bison) and lex (or flex). The only new thing
would be autoconf, and that only if we choose to remove src/configure
from the CVS fileset.

You get autoconf from any GNU archive site. 2.13 is the current
release, I believe. IIRC autoconf depends on GNU m4, so that's actually
two tools not one, but the installation is straightforward. If you're
on Linux you probably have GNU m4 anyway.

            regards, tom lane
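For anyone who wants to confirm that the tools named above are present before building from CVS, a small check along these lines may help. It is only a sketch: the helper name `first_available` is invented for this example, and it only tests whether a command is on `PATH`, not which version it is.

```shell
#!/bin/sh
# first_available NAME...: print the first NAME that is found on PATH.
# Returns non-zero if none of the candidates is installed.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    return 1
}

# Example use (the candidate lists mirror the alternatives in the message):
#   first_available bison yacc || echo "no yacc or bison found" >&2
#   first_available flex lex   || echo "no lex or flex found" >&2
#   first_available autoconf   || echo "autoconf missing" >&2
#   first_available m4         || echo "m4 missing" >&2
```

Listing the preferred tool first means the same helper covers both the "either one will do" case (yacc or bison) and the single-tool case (autoconf).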
geek+@cmu.edu writes:
> Then <scrappy@hub.org> spoke up and said:
>> Agreed here too...someone at one point mentioned that there might be a
>> way, inside of CVS, to have it auto-generate these files as it's being
>> checked out (i.e. if file is configure.in, run autoconf)...

> From my reading, it looks like the easiest thing to do is set up
> commit rules such that committing gram.y automatically generates
> gram.c.

I thought about that, but it only solves *one* of the problems we've
run into: developers forgetting to commit a derived file when they
commit the master. We'd still have these problems:

* excessive CVS traffic for the derived files (check the
  version-to-version diffs for gram.c or configure to see what I'm
  talking about: a small change to the master often generates huge
  diffs on the derived). That costs everyone who downloads from CVS.
  It's probably faster to generate gram.c or configure locally than
  to pull these diffs from CVS.

* unreliable timestamps after a "cvs update": the derived may or may
  not look newer than the master, depending on what order cvs updates
  them in. So you may end up rebuilding locally anyway.

* unreliable timestamps in tarball drops: same as above.

If we could run a program during check *out* not check in then we might
have something, but I see no facility for that in cvs. There'd be
severe portability problems anyway (how do you know what incantation to
mutter to run yacc/bison, when you haven't done configure yet?).

So I think removing the deriveds from CVS altogether is a much better
answer.

            regards, tom lane
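The timestamp problem described above can be checked by hand. The sketch below (the function name `is_stale` is made up for this example) applies the same rule make uses: a derived file is rebuilt when it is missing or when its master has a newer modification time. It relies only on the standard `-newer` test of `find`.

```shell
#!/bin/sh
# is_stale MASTER DERIVED
# Succeeds (exit status 0) when make would rebuild DERIVED, that is when
# DERIVED does not exist or MASTER was modified more recently.
is_stale() {
    master=$1
    derived=$2
    [ -e "$derived" ] || return 0
    # find prints MASTER only if it is newer than DERIVED.
    [ -n "$(find "$master" -newer "$derived" -print)" ]
}

# Example use after a "cvs update":
#   if is_stale gram.y gram.c; then echo "gram.c will be rebuilt"; fi
```

Because the order in which cvs writes the two files decides which one ends up newer, the result of this check can differ from one update to the next even when nothing in gram.y has changed.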
On 19 Mar 1999 geek+@cmu.edu wrote:

> Then <scrappy@hub.org> spoke up and said:
> > Agreed here too...someone at one point mentioned that there might be a
> > way, inside of CVS, to have it auto-generate these files as it's being
> > checked out (i.e. if file is configure.in, run autoconf)...
>
> From the info file:
> Module options
> --------------
>
>    Either regular modules or ampersand modules can contain options,
> which supply additional information concerning the module.
> [snip]
> `-i PROG'
>      Specify a program PROG to run whenever files in a module are
>      committed. PROG runs with a single argument, the full pathname of
>      the affected directory in a source repository. The `commitinfo',
>      `loginfo', and `verifymsg' files provide other ways to call a
>      program on commit.
>
> `-o PROG'
>      Specify a program PROG to run whenever files in a module are
>      checked out. PROG runs with a single argument, the module name.
>
> From my reading, it looks like the easiest thing to do is set up
> commit rules such that committing gram.y automatically generates
> gram.c. It looks like it might be difficult to have gram.c generated
> completely "on the fly" and then passed to the CVS client.

Can you provide an example of using/doing this? It sounds like the
best solution of them all, if it can be done this way..

Marc G. Fournier                ICQ#7615664        IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org    secondary: scrappy@{freebsd|postgresql}.org
I have installed a script (src/tools/release_prep) that generates the
parser and ecpg/preproc derived files on-the-fly, and removed said files
from CVS.

(I didn't do anything about src/configure --- how do people feel about
that? I'd want to see hub's autoconf updated to 2.13 anyway, if it is
going to start generating configure locally.)

In order to generate snapshot tarballs that contain these derived files,
you need to replace ~pgsql/bin/mk-snapshot at hub.org with the attached
script. (You can find a copy in ~tgl/bin/mk-snapshot at hub, if you'd
rather copy that file than cut-n-paste.) It doesn't look like I have
write permission on that file, so it's up to you.

You'll need to make a comparable mod in whatever script you use for
preparing releases, too, but I didn't find that one in looking around.

BTW: in testing this script, I produced a tarball of 5894631 bytes,
whereas last night's snapshot is 5974070 bytes. It would appear that
there's 80k (compressed) worth of cruft in the ~pgsql/pgsql tree that
CVSup is not cleaning out. Indeed the *,v files in that toplevel
directory are not there in a fresh checkout. I'd suggest rm -rf'ing
the whole tree and making CVSup do a fresh checkout.

            regards, tom lane


#!/bin/sh
PATH=/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
cd /home/projects/pgsql
# check out tree
/usr/local/bin/cvsup -L 1 -g -Z README.cvsup
# perform prerelease cleanup
cd pgsql
src/tools/release_prep
cd ..
# make the snapshot tarfile
tar czpf tmp/postgresql.snapshot.tar.gz pgsql
rm -f ftp/pub/postgresql.snapshot.tar.gz
mv -f tmp/postgresql.snapshot.tar.gz ftp/pub/postgresql.snapshot.tar.gz
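The size difference quoted above works out to 5974070 - 5894631 = 79439 bytes, which matches the "80k" estimate. To see which stray RCS files account for it before deleting the whole tree, something like the following sketch could be run against the checkout directory. The function name `list_rcs_cruft` is invented for this example, and it assumes, as the message suggests, that no `*,v` file belongs in a checked-out tree.

```shell
#!/bin/sh
# list_rcs_cruft DIR: print every regular file under DIR whose name ends
# in ",v". Such files are RCS/CVS repository files and are assumed not to
# belong in a checked-out working tree.
list_rcs_cruft() {
    find "$1" -type f -name '*,v' -print
}

# Example use, from the directory that holds the checkout:
#   list_rcs_cruft pgsql
```

An empty result after a fresh checkout would confirm that the leftover files came from the old tree and not from the repository itself.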
Script looks good...regenerating a new snapshot right now ...

On Sat, 20 Mar 1999, Tom Lane wrote:

> I have installed a script (src/tools/release_prep) that generates the
> parser and ecpg/preproc derived files on-the-fly, and removed said files
> from CVS.
>
> (I didn't do anything about src/configure --- how do people feel about
> that? I'd want to see hub's autoconf updated to 2.13 anyway, if it is
> going to start generating configure locally.)
>
> In order to generate snapshot tarballs that contain these derived files,
> you need to replace ~pgsql/bin/mk-snapshot at hub.org with the attached
> script. (You can find a copy in ~tgl/bin/mk-snapshot at hub, if you'd
> rather copy that file than cut-n-paste.) It doesn't look like I have
> write permission on that file, so it's up to you.
>
> You'll need to make a comparable mod in whatever script you use for
> preparing releases, too, but I didn't find that one in looking around.
>
> BTW: in testing this script, I produced a tarball of 5894631 bytes,
> whereas last night's snapshot is 5974070 bytes. It would appear that
> there's 80k (compressed) worth of cruft in the ~pgsql/pgsql tree that
> CVSup is not cleaning out. Indeed the *,v files in that toplevel
> directory are not there in a fresh checkout. I'd suggest rm -rf'ing
> the whole tree and making CVSup do a fresh checkout.
>
>             regards, tom lane
>
>
> #!/bin/sh
> PATH=/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
> cd /home/projects/pgsql
> # check out tree
> /usr/local/bin/cvsup -L 1 -g -Z README.cvsup
> # perform prerelease cleanup
> cd pgsql
> src/tools/release_prep
> cd ..
> # make the snapshot tarfile
> tar czpf tmp/postgresql.snapshot.tar.gz pgsql
> rm -f ftp/pub/postgresql.snapshot.tar.gz
> mv -f tmp/postgresql.snapshot.tar.gz ftp/pub/postgresql.snapshot.tar.gz
>

Marc G. Fournier                ICQ#7615664        IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org    secondary: scrappy@{freebsd|postgresql}.org
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

    Tom> IIRC autoconf depends on GNU m4, so that's actually two tools
    Tom> not one, but the installation is straightforward. If you're
    Tom> on Linux you probably have GNU m4 anyway.

I actually thought parts of autoconf use Perl, too.... Or maybe that
was automake?

roland
--
                       PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                  Custom Software Solutions
roberts@panix.com                       76-15 113th Street, Apt 3B
rbroberts@acm.org                       Forest Hills, NY 11375
Roland Roberts <roberts@panix.com> writes:
>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>     Tom> IIRC autoconf depends on GNU m4,

> I actually thought parts of autoconf use Perl, too.... Or maybe that
> was automake?

Nope, no Perl in autoconf. I'm less sure about automake.

            regards, tom lane
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

    Tom> Roland Roberts <roberts@panix.com> writes:
    >>>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
    Tom> IIRC autoconf depends on GNU m4,
    >> I actually thought parts of autoconf use Perl, too.... Or
    >> maybe that was automake?

    Tom> Nope, no Perl in autoconf. I'm less sure about automake.

I found it; it does use Perl in the optional `autoscan' script. But
that's not really relevant for Postgres....

roland
--
                       PGP Key ID: 66 BC 3B CD
Roland B. Roberts, PhD                  Custom Software Solutions
roberts@panix.com                       76-15 113th Street, Apt 3B
rbroberts@acm.org                       Forest Hills, NY 11375