Thread: A more useful way to split the distribution

A more useful way to split the distribution

From
Peter Eisentraut
Date:
Since people suddenly seem to be suffering from bandwidth concerns I have
devised a new distribution split to address this issue.  I propose the
following four sub-tarballs:

* postgresql-XXX.base.tar.gz    3.3 MB

Everything not in one of the ones below.

* postgresql-XXX.opt.tar.gz    1.7 MB

Everything not needed unless you use one of the following configure
options:  --with-CXX --with-tcl --with-perl --with-python --with-java
--enable-multibyte --enable-odbc, plus some other not-really-needed
things.

The exact directory list is
src/bin/: pgaccess pgtclsh pg_encoding
src/interfaces: odbc libpq++ libpgtcl perl5 python jdbc
src/pl/: plperl tcl
src/backend/utils/mb contrib/retep src/tools build.xml

* postgresql-XXX.docs.tar.gz    1.9 MB

doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps

(Note man pages are in .base.)

* postgresql-XXX.test.tar.gz    1.0 MB

src/test

All this is proportionally about the same as right now, except that each
tarball except base would now be truly optional.  So someone that only
wants to use, say, PHP and psql only needs to download the base package.

Patch below.  Yes/no/maybe?

--- GNUmakefile.in      Sun Apr  8 01:14:23 2001
+++ GNUmakefile2        Sun Apr  8 01:19:55 2001
@@ -60,7 +60,7 @@

 dist: $(distdir).tar.gz
 ifeq ($(split-dist), yes)
-dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).support.tar.gz $(distdir).test.tar.gz
+dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).opt.tar.gz $(distdir).test.tar.gz
 endif
 dist:
        -rm -rf $(distdir)
@@ -68,15 +68,22 @@
 $(distdir).tar: distdir
        $(TAR) chf $@ $(distdir)

+opt_files := $(addprefix src/bin/, pgaccess pgtclsh pg_encoding) \
+       $(addprefix src/interfaces/, odbc libpq++ libpgtcl perl5 python jdbc) \
+       $(addprefix src/pl/, plperl tcl) \
+       src/backend/utils/mb contrib/retep src/tools build.xml
+
+docs_files := doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps
+
 $(distdir).base.tar: distdir
-       $(TAR) -c $(addprefix --exclude $(distdir)/, doc src/test src/interfaces src/bin) \
+       $(TAR) -c $(addprefix --exclude $(distdir)/, $(docs_files) $(opt_files) src/test) \
          -f $@ $(distdir)

 $(distdir).docs.tar: distdir
-       $(TAR) cf $@ $(distdir)/doc
+       $(TAR) cf $@ $(addprefix $(distdir)/, $(docs_files))

-$(distdir).support.tar: distdir
-       $(TAR) cf $@ $(distdir)/src/interfaces $(distdir)/src/bin
+$(distdir).opt.tar: distdir
+       $(TAR) cf $@ $(addprefix $(distdir)/, $(opt_files))

 $(distdir).test.tar: distdir
        $(TAR) cf $@ $(distdir)/src/test
===snip
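
To make the effect of the patch easier to follow, here is a rough Python sketch of the split it describes. It is not part of the patch. The directory lists are copied from the proposal; the names `tarball_for` and `exclude_args` and the sample paths are hypothetical and exist only for illustration. It also assumes plain path-prefix matching, whereas tar's `--exclude` is pattern-based and may behave differently in edge cases.

```python
from typing import List

# Directory lists copied from the proposal; everything else is illustrative.
OPT_FILES: List[str] = (
    [f"src/bin/{d}" for d in ("pgaccess", "pgtclsh", "pg_encoding")]
    + [f"src/interfaces/{d}"
       for d in ("odbc", "libpq++", "libpgtcl", "perl5", "python", "jdbc")]
    + [f"src/pl/{d}" for d in ("plperl", "tcl")]
    + ["src/backend/utils/mb", "contrib/retep", "src/tools", "build.xml"]
)
DOCS_FILES: List[str] = [
    "doc/postgres.tar.gz", "doc/src", "doc/TODO.detail", "doc/internals.ps",
]
TEST_FILES: List[str] = ["src/test"]


def _under(path: str, roots: List[str]) -> bool:
    """True if path is one of the roots or lies inside one of them."""
    return any(path == r or path.startswith(r + "/") for r in roots)


def tarball_for(path: str) -> str:
    """Name the sub-tarball a source-tree path would land in (hypothetical helper)."""
    if _under(path, OPT_FILES):
        return "opt"
    if _under(path, DOCS_FILES):
        return "docs"
    if _under(path, TEST_FILES):
        return "test"
    return "base"  # "Everything not in one of the ones below."


def exclude_args(distdir: str) -> List[str]:
    """Mimic $(addprefix --exclude $(distdir)/, ...) from the base.tar rule."""
    args: List[str] = []
    for path in DOCS_FILES + OPT_FILES + TEST_FILES:
        args += ["--exclude", f"{distdir}/{path}"]
    return args


if __name__ == "__main__":
    # Sample paths are assumptions chosen for illustration.
    for sample in ("src/interfaces/libpq/fe-exec.c",
                   "src/interfaces/libpq++/pgconnection.cc",
                   "doc/src/sgml/installation.sgml",
                   "src/test/regress/GNUmakefile"):
        print(sample, "->", tarball_for(sample))
```

Because each make prefix ends in a slash, `src/interfaces/libpq` stays in base while `src/interfaces/libpq++` moves to opt. The sketch matches on whole path components for that reason.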

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: A more useful way to split the distribution

From
The Hermit Hacker
Date:
Oh, I definitely like this ... and get rid of the *large* file, which will
save all the mirrors a good deal of space over time ...

On Sun, 8 Apr 2001, Peter Eisentraut wrote:

> Since people suddenly seem to be suffering from bandwidth concerns I have
> devised a new distribution split to address this issue.  I propose the
> following four sub-tarballs:
>
> * postgresql-XXX.base.tar.gz    3.3 MB
>
> Everything not in one of the ones below.
>
> * postgresql-XXX.opt.tar.gz    1.7 MB
>
> Everything not needed unless you use one of the following configure
> options:  --with-CXX --with-tcl --with-perl --with-python --with-java
> --enable-multibyte --enable-odbc, plus some other not-really-needed
> things.
>
> The exact directory list is
> src/bin/: pgaccess pgtclsh pg_encoding
> src/interfaces: odbc libpq++ libpgtcl perl5 python jdbc
> src/pl/: plperl tcl
> src/backend/utils/mb contrib/retep src/tools build.xml
>
> * postgresql-XXX.docs.tar.gz    1.9 MB
>
> doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps
>
> (Note man pages are in .base.)
>
> * postgresql-XXX.test.tar.gz    1.0 MB
>
> src/test
>
> All this is proportionally about the same as right now, except that each
> tarball except base would now be truly optional.  So someone that only
> wants to use, say, PHP and psql only needs to download the base package.
>
> Patch below.  Yes/no/maybe?
>
> --- GNUmakefile.in      Sun Apr  8 01:14:23 2001
> +++ GNUmakefile2        Sun Apr  8 01:19:55 2001
> @@ -60,7 +60,7 @@
>
>  dist: $(distdir).tar.gz
>  ifeq ($(split-dist), yes)
> -dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).support.tar.gz $(distdir).test.tar.gz
> +dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).opt.tar.gz $(distdir).test.tar.gz
>  endif
>  dist:
>         -rm -rf $(distdir)
> @@ -68,15 +68,22 @@
>  $(distdir).tar: distdir
>         $(TAR) chf $@ $(distdir)
>
> +opt_files := $(addprefix src/bin/, pgaccess pgtclsh pg_encoding) \
> +       $(addprefix src/interfaces/, odbc libpq++ libpgtcl perl5 python jdbc) \
> +       $(addprefix src/pl/, plperl tcl) \
> +       src/backend/utils/mb contrib/retep src/tools build.xml
> +
> +docs_files := doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps
> +
>  $(distdir).base.tar: distdir
> -       $(TAR) -c $(addprefix --exclude $(distdir)/, doc src/test src/interfaces src/bin) \
> +       $(TAR) -c $(addprefix --exclude $(distdir)/, $(docs_files) $(opt_files) src/test) \
>           -f $@ $(distdir)
>
>  $(distdir).docs.tar: distdir
> -       $(TAR) cf $@ $(distdir)/doc
> +       $(TAR) cf $@ $(addprefix $(distdir)/, $(docs_files))
>
> -$(distdir).support.tar: distdir
> -       $(TAR) cf $@ $(distdir)/src/interfaces $(distdir)/src/bin
> +$(distdir).opt.tar: distdir
> +       $(TAR) cf $@ $(addprefix $(distdir)/, $(opt_files))
>
>  $(distdir).test.tar: distdir
>         $(TAR) cf $@ $(distdir)/src/test
> ===snip
>
> --
> Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://www.postgresql.org/search.mpl
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: A more useful way to split the distribution

From
Lamar Owen
Date:
The Hermit Hacker wrote:
> Oh, I definitely like this ... and get rid of the *large* file, which will
> save all the mirrors a good deal of space over time ...

You gonna make a set of RC3 or 4 tarballs along these lines to test? I
want to try a build with this split before doing too much else -- well,
actually, I just want to make sure I get it right before release, as I'd
like to not have but a couple of days before an RPM release after the
announcement.

Sounds like a plan.

I'm going to upload a set of RC3 RPM's tonight -- there are changes that
I need people to comment upon.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11


Re: A more useful way to split the distribution

From
The Hermit Hacker
Date:
as soon as Peter commits the changes, I'll do up an RC4 with the new
format so that everyone can test it ...

On Sat, 7 Apr 2001, Lamar Owen wrote:

> The Hermit Hacker wrote:
> > Oh, I definitely like this ... and get rid of the *large* file, which will
> > save all the mirrors a good deal of space over time ...
>
> You gonna make a set of RC3 or 4 tarballs along these lines to test? I
> want to try a build with this split before doing too much else -- well,
> actually, I just want to make sure I get it right before release, as I'd
> like to not have but a couple of days before an RPM release after the
> announcement.
>
> Sounds like a plan.
>
> I'm going to upload a set of RC3 RPM's tonight -- there are changes that
> I need people to comment upon.
> --
> Lamar Owen
> WGCR Internet Radio
> 1 Peter 4:11
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: A more useful way to split the distribution

From
Vince Vielhaber
Date:
On Sat, 7 Apr 2001, The Hermit Hacker wrote:

>
> Oh, I definitely like this ... and get rid of the *large* file, which will
> save all the mirrors a good deal of space over time ...
>
> On Sun, 8 Apr 2001, Peter Eisentraut wrote:
>
> > Since people suddenly seem to be suffering from bandwidth concerns I have
> > devised a new distribution split to address this issue.  I propose the
> > following four sub-tarballs:
> >
> > * postgresql-XXX.base.tar.gz    3.3 MB
> >
> > Everything not in one of the ones below.
> >
> > * postgresql-XXX.opt.tar.gz    1.7 MB
> >
> > Everything not needed unless you use one of the following configure
> > options:  --with-CXX --with-tcl --with-perl --with-python --with-java
> > --enable-multibyte --enable-odbc, plus some other not-really-needed
> > things.

As long as there's still the FULL tarball with everything in it available.

Vince.
-- 
==========================================================================
Vince Vielhaber -- KA8CSH    email: vev@michvhf.com    http://www.pop4.net
         56K Nationwide Dialup from $16.00/mo at Pop4 Networking
        Online Campground Directory    http://www.camping-usa.com
       Online Giftshop Superstore    http://www.cloudninegifts.com
==========================================================================





Re: A more useful way to split the distribution

From
Christopher Sawtell
Date:
On Sun, 08 Apr 2001 11:24, Peter Eisentraut wrote:
> Since people suddenly seem to be suffering from bandwidth concerns I
> have devised a new distribution split to address this issue. 

[  ...  snipping the many tarballs argument  ...  ]

For me and I expect many other folk on the edge of civilization it is a 
total PITA to have to fiddle around and download many separate tarball 
files. What I want is to be able to start a d/l going and then come back 
when it's finished and know that I have _everything_ I actually need to 
have a working and documented product in one shot. 

For developers, contributors and testers, I would like to suggest that
an exact snapshot of the complete CVS source archive is appropriate.  We
can then track the changes every day using cvs or cvsup (a wonderful
tool, btw).

What is really _really_ needed is a text README which explains exactly
what each file contains.

Personally I have found the limitations of the packaging systems to
be such a nuisance that I always compile everything from source.

-- 
Sincerely etc.,
NAME       Christopher Sawtell
CELL PHONE 021 257 4451
ICQ UIN    45863470
EMAIL      csawtell @ xtra . co . nz
CNOTES     ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz
-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--



Re: A more useful way to split the distribution

From
Peter Eisentraut
Date:
The Hermit Hacker writes:

> ... and get rid of the *large* file, which will
> save all the mirrors a good deal of space over time ...

You will certainly get a furious crowd at your door within hours if you do
that, as the follow-ups show.  Saving download bandwidth is a valid issue,
but saving disk space on the order of perhaps 50 MB for sites that act as
download archives is not worth the drawbacks.

Btw., do we have any download statistics, especially as to how many people
elected to download the "chunks"?

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: A more useful way to split the distribution

From
Peter Eisentraut
Date:
Christopher Sawtell writes:

> For me and I expect many other folk on the edge of civilization it is a
> total PITA to have to fiddle around and download many separate tarball
> files. What I want is to be able to start a d/l going and then come back
> when it's finished and know that I have _everything_ I actually need to
> have a working and documented product in one shot.

Right.  The only reason for splitting the distribution is to cater to the
fictitious crowd with "bandwidth problems" or those that explicitly know
that they don't need the rest.  There will still be a canonical full
tarball with everything, or at least I will not put my name to something
that abolishes it.  In fact, I didn't like the idea of the split tarballs
in the first place, I'm merely changing the split to something more
useful.

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: A more useful way to split the distribution

From
Peter Eisentraut
Date:
I wrote:

> Since people suddenly seem to be suffering from bandwidth concerns I have
> devised a new distribution split to address this issue.  I propose the
> following four sub-tarballs:

> * postgresql-XXX.base.tar.gz    3.3 MB
> * postgresql-XXX.opt.tar.gz    1.7 MB
> * postgresql-XXX.docs.tar.gz    1.9 MB
> * postgresql-XXX.test.tar.gz    1.0 MB

Since we're going to make a change, I'd like to change the names to

postgresql-base-XXX.tar.gz

etc. to align them with existing practice (cf. RPMs, GCC download).  Dots
should be used for format-identifying extensions.
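
As a small illustration of the renaming, the following Python sketch maps the dotted names onto the proposed hyphenated form. Only `postgresql-base-XXX.tar.gz` is spelled out in the message; that the opt, docs and test tarballs follow the same pattern is an assumption read into the "etc.", and the helper name `new_style_name` is hypothetical.

```python
import re

# Old scheme: postgresql-<version>.<part>.tar.gz
_OLD_NAME = re.compile(
    r"^postgresql-(?P<version>.+)\.(?P<part>base|opt|docs|test)\.tar\.gz$"
)


def new_style_name(old_name: str) -> str:
    """Move the part name in front of the version (hypothetical helper).

    Names that do not match the old split scheme, such as the full
    tarball, are returned unchanged.
    """
    match = _OLD_NAME.match(old_name)
    if match is None:
        return old_name
    return f"postgresql-{match['part']}-{match['version']}.tar.gz"


if __name__ == "__main__":
    print(new_style_name("postgresql-XXX.base.tar.gz"))
```

With this layout the dots that remain belong only to the version string and to the `.tar.gz` format extension.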

-- 
Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/



Re: Re: A more useful way to split the distribution

From
The Hermit Hacker
Date:
On Sun, 8 Apr 2001, Peter Eisentraut wrote:

> I wrote:
>
> > Since people suddenly seem to be suffering from bandwidth concerns I have
> > devised a new distribution split to address this issue.  I propose the
> > following four sub-tarballs:
>
> > * postgresql-XXX.base.tar.gz    3.3 MB
> > * postgresql-XXX.opt.tar.gz    1.7 MB
> > * postgresql-XXX.docs.tar.gz    1.9 MB
> > * postgresql-XXX.test.tar.gz    1.0 MB
>
> Since we're going to make a change, I'd like to change the names to
>
> postgresql-base-XXX.tar.gz
>
> etc. to align them with existing practice (cf. RPMs, GCC download).  Dots
> should be used for format-identifying extensions.

Go for it ... more a visual change than anything ...




Re: A more useful way to split the distribution

From
The Hermit Hacker
Date:
this only represents since 8:30am this morning ...

/source/v7.0.3/postgresql-7.0.3.support.tar.gz => 9
/source/v7.0.3/postgresql-7.0.3.test.tar.gz => 3
/source/v7.0.3/postgresql-7.0.3.docs.tar.gz => 10
/source/v7.0.3/postgresql-7.0.3.tar.gz => 22
/source/v7.0.3/postgresql-7.0.3.base.tar.gz => 9

on a side note, we almost have as many downloads of psqlodbc in that time
period:

/odbc/psqlodbc_home.html => 15
/odbc/versions/psqlodbc-07_01_0002.zip => 4
/odbc/versions/psqlodbc-07_01_0003.zip => 4
/odbc/versions/psqlodbc-07_01_0004.zip => 18

so it isn't a "fictitious crowd" that is going with the smaller chunks ...
it's about 30% on a very small sample ...
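
One way to arrive at a figure near 30% from the counts above is sketched below in Python. The message does not say how the number was derived. This sketch assumes that every user of the split distribution fetches the base tarball exactly once, so that the base count stands in for the number of "chunk" users, and compares it with the full-tarball count.

```python
# Download counts as quoted in the message (since 8:30am that morning).
downloads = {
    "postgresql-7.0.3.support.tar.gz": 9,
    "postgresql-7.0.3.test.tar.gz": 3,
    "postgresql-7.0.3.docs.tar.gz": 10,
    "postgresql-7.0.3.tar.gz": 22,
    "postgresql-7.0.3.base.tar.gz": 9,
}

full = downloads["postgresql-7.0.3.tar.gz"]
# Assumption: one base download per user of the split tarballs.
base = downloads["postgresql-7.0.3.base.tar.gz"]

split_share = base / (base + full)
print(f"split users: {base} of {base + full} = {split_share:.0%}")
# prints "split users: 9 of 31 = 29%"
```

With a sample this small, a handful of extra downloads either way would move the share by several points.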

On Sun, 8 Apr 2001, Peter Eisentraut wrote:

> Christopher Sawtell writes:
>
> > For me and I expect many other folk on the edge of civilization it is a
> > total PITA to have to fiddle around and download many separate tarball
> > files. What I want is to be able to start a d/l going and then come back
> > when it's finished and know that I have _everything_ I actually need to
> > have a working and documented product in one shot.
>
> Right.  The only reason for splitting the distribution is to cater to the
> fictitious crowd with "bandwidth problems" or those that explicitly know
> that they don't need the rest.  There will still be a canonical full
> tarball with everything, or at least I will not put my name to something
> that abolishes it.  In fact, I didn't like the idea of the split tarballs
> in the first place, I'm merely changing the split to something more
> useful.
>
> --
> Peter Eisentraut      peter_e@gmx.net       http://yi.org/peter-e/
>

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org



Re: A more useful way to split the distribution

From
Thomas Lockhart
Date:
> so it isn't a "fictitious crowd" that is going with the smaller chunks ...
> it's about 30% on a very small sample ...

(back in town from the weekend, to see the PostgreSQL tarball ripped to
shreds ;)

Peter, I'm with you on this. If folks want to help support PostgreSQL by
providing subset-tarballs, then great. But many of us have contributed
to the monolithic tarball, and will continue to do so. So let's make sure
that we have *at least* the big tarball available, and if someone wants
to subset it then I'm sure that would be very useful for a large number
of users, even if percentage-wise they are not the majority.

No point in polarizing it or forcing a choice: certainly the form we
have used for the last 6 years (and for the 6 years before that too,
probably) is a legitimate and useful form, and we can experiment with
subsets as much as anyone cares to.

With the big tarball, Lamar and others (such as Oliver and myself) can
continue their packaging work for 7.1 without having to cope with last
minute subset issues.
                     - Thomas