A more useful way to split the distribution

Started by Peter Eisentraut, about 25 years ago, 12 messages, hackers
#1 Peter Eisentraut
peter_e@gmx.net

Since people suddenly seem to be suffering from bandwidth concerns I have
devised a new distribution split to address this issue. I propose the
following four sub-tarballs:

* postgresql-XXX.base.tar.gz 3.3 MB

Everything not in one of the ones below.

* postgresql-XXX.opt.tar.gz 1.7 MB

Everything not needed unless you use one of the following configure
options: --with-CXX --with-tcl --with-perl --with-python --with-java
--enable-multibyte --enable-odbc, plus some other not-really-needed
things.

The exact directory list is
src/bin/: pgaccess pgtclsh pg_encoding
src/interfaces/: odbc libpq++ libpgtcl perl5 python jdbc
src/pl/: plperl tcl
src/backend/utils/mb contrib/retep src/tools build.xml

* postgresql-XXX.docs.tar.gz 1.9 MB

doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps

(Note man pages are in .base.)

* postgresql-XXX.test.tar.gz 1.0 MB

src/test

All this is proportionally about the same as right now, except that each
tarball except base would now be truly optional. So someone that only
wants to use, say, PHP and psql only needs to download the base package.
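To put numbers on the proposed split, here is a small Python sketch that uses only the four approximate sizes quoted above. The dictionary and variable names are illustrative and not part of the build system.

```python
# Approximate sub-tarball sizes in MB, as quoted in the proposal above.
sizes_mb = {"base": 3.3, "opt": 1.7, "docs": 1.9, "test": 1.0}

total_mb = sum(sizes_mb.values())
base_share = sizes_mb["base"] / total_mb
saved_mb = total_mb - sizes_mb["base"]

print(f"all four pieces: {total_mb:.1f} MB")
print(f"base only:       {sizes_mb['base']:.1f} MB ({base_share:.0%} of the total)")
print(f"skipped by a base-only download: {saved_mb:.1f} MB")
```

On these figures a base-only download is roughly 42% of the four pieces combined, which is the saving the split is meant to offer.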

Patch below. Yes/no/maybe?

--- GNUmakefile.in      Sun Apr  8 01:14:23 2001
+++ GNUmakefile2        Sun Apr  8 01:19:55 2001
@@ -60,7 +60,7 @@
 dist: $(distdir).tar.gz
 ifeq ($(split-dist), yes)
-dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).support.tar.gz $(distdir).test.tar.gz
+dist: $(distdir).base.tar.gz $(distdir).docs.tar.gz $(distdir).opt.tar.gz $(distdir).test.tar.gz
 endif
 dist:
        -rm -rf $(distdir)
@@ -68,15 +68,22 @@
 $(distdir).tar: distdir
        $(TAR) chf $@ $(distdir)
+opt_files := $(addprefix src/bin/, pgaccess pgtclsh pg_encoding) \
+       $(addprefix src/interfaces/, odbc libpq++ libpgtcl perl5 python jdbc) \
+       $(addprefix src/pl/, plperl tcl) \
+       src/backend/utils/mb contrib/retep src/tools build.xml
+
+docs_files := doc/postgres.tar.gz doc/src doc/TODO.detail doc/internals.ps
+
 $(distdir).base.tar: distdir
-       $(TAR) -c $(addprefix --exclude $(distdir)/, doc src/test src/interfaces src/bin) \
+       $(TAR) -c $(addprefix --exclude $(distdir)/, $(docs_files) $(opt_files) src/test) \
          -f $@ $(distdir)
 $(distdir).docs.tar: distdir
-       $(TAR) cf $@ $(distdir)/doc
+       $(TAR) cf $@ $(addprefix $(distdir)/, $(docs_files))
-$(distdir).support.tar: distdir
-       $(TAR) cf $@ $(distdir)/src/interfaces $(distdir)/src/bin
+$(distdir).opt.tar: distdir
+       $(TAR) cf $@ $(addprefix $(distdir)/, $(opt_files))

 $(distdir).test.tar: distdir
        $(TAR) cf $@ $(distdir)/src/test
===snip
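As a rough model of what the patch does, the following Python sketch assigns a source path to one of the four sub-tarballs. The path lists are copied from opt_files and docs_files in the patch. The helper names (`_under`, `tarball_for`) and the sample paths are hypothetical, and the plain prefix match is a simplification: the real base tarball relies on tar's --exclude pattern matching, which may behave differently in corner cases.

```python
# Path lists copied from the patch above (opt_files and docs_files).
OPT_FILES = (
    [f"src/bin/{d}" for d in ("pgaccess", "pgtclsh", "pg_encoding")]
    + [f"src/interfaces/{d}"
       for d in ("odbc", "libpq++", "libpgtcl", "perl5", "python", "jdbc")]
    + [f"src/pl/{d}" for d in ("plperl", "tcl")]
    + ["src/backend/utils/mb", "contrib/retep", "src/tools", "build.xml"]
)
DOCS_FILES = ["doc/postgres.tar.gz", "doc/src", "doc/TODO.detail", "doc/internals.ps"]
TEST_FILES = ["src/test"]


def _under(path, roots):
    """Return True if path equals one of roots or lies beneath it."""
    return any(path == r or path.startswith(r + "/") for r in roots)


def tarball_for(path):
    """Hypothetical helper: name the sub-tarball a source path would land in."""
    if _under(path, OPT_FILES):
        return "opt"
    if _under(path, DOCS_FILES):
        return "docs"
    if _under(path, TEST_FILES):
        return "test"
    return "base"  # "Everything not in one of the ones below."


# Sample paths made up for illustration only.
for p in ("src/bin/psql/example.c",
          "src/interfaces/libpq/example.c",
          "src/interfaces/jdbc/example.java",
          "doc/src/example.sgml",
          "src/test/example.sql"):
    print(f"{p:36s} -> {tarball_for(p)}")
```

Note that src/interfaces/libpq stays in base while src/interfaces/libpq++ moves to opt, which is why the match is done on whole path components rather than on raw string prefixes.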

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#2 The Hermit Hacker
scrappy@hub.org
In reply to: Peter Eisentraut (#1)
Re: A more useful way to split the distribution

Oh, I definitely like this ... and get rid of the *large* file, which will
save all the mirrors a good deal of space over time ...


Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#3 Lamar Owen
lamar.owen@wgcr.org
In reply to: The Hermit Hacker (#2)
Re: A more useful way to split the distribution

The Hermit Hacker wrote:

Oh, I definitely like this ... and get rid of the *large* file, which will
save all the mirrors a good deal of space over time ...

You gonna make a set of RC3 or 4 tarballs along these lines to test? I
want to try a build with this split before doing too much else -- well,
actually, I just want to make sure I get it right before release, as I'd
like to not have but a couple of days before an RPM release after the
announcement.

Sounds like a plan.

I'm going to upload a set of RC3 RPM's tonight -- there are changes that
I need people to comment upon.
--
Lamar Owen
WGCR Internet Radio
1 Peter 4:11

#4 The Hermit Hacker
scrappy@hub.org
In reply to: Lamar Owen (#3)
Re: A more useful way to split the distribution

as soon as Peter commits the changes, I'll do up an RC4 with the new
format so that everyone can test it ...


Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#5 Vince Vielhaber
vev@michvhf.com
In reply to: The Hermit Hacker (#2)
Re: A more useful way to split the distribution

On Sat, 7 Apr 2001, The Hermit Hacker wrote:

Oh, I definitely like this ... and get rid of the *large* file, which will
save all the mirrors a good deal of space over time ...

On Sun, 8 Apr 2001, Peter Eisentraut wrote:

Since people suddenly seem to be suffering from bandwidth concerns I have
devised a new distribution split to address this issue. I propose the
following four sub-tarballs:

* postgresql-XXX.base.tar.gz 3.3 MB

Everything not in one of the ones below.

* postgresql-XXX.opt.tar.gz 1.7 MB

Everything not needed unless you use one of the following configure
options: --with-CXX --with-tcl --with-perl --with-python --with-java
--enable-multibyte --enable-odbc, plus some other not-really-needed
things.

As long as there's still the FULL tarball with everything in it available.

Vince.
--
==========================================================================
Vince Vielhaber -- KA8CSH email: vev@michvhf.com http://www.pop4.net
56K Nationwide Dialup from $16.00/mo at Pop4 Networking
Online Campground Directory http://www.camping-usa.com
Online Giftshop Superstore http://www.cloudninegifts.com
==========================================================================

#6 Christopher Sawtell
csawtell@xtra.co.nz
In reply to: Peter Eisentraut (#1)
Re: A more useful way to split the distribution

On Sun, 08 Apr 2001 11:24, Peter Eisentraut wrote:

Since people suddenly seem to be suffering from bandwidth concerns I
have devised a new distribution split to address this issue.

[ ... snipping the many tarballs argument ... ]

For me and I expect many other folk on the edge of civilization it is a
total PITA to have to fiddle around and download many separate tarball
files. What I want is to be able to start a d/l going and then come back
when it's finished and know that I have _everything_ I actually need to
have a working and documented product in one shot.

For developers, contributors and testers I would like to suggest that
an exact snapshot of the complete CVS source archive is appropriate. We
can then track the changes every day using cvs or cvsup (a wonderful
tool, btw).

What is really _really_ needed is a text README which explains exactly
what each file contains.

Personally I have found the limitations of the packaging systems to
be such a nuisance that I always compile everything from source.

--
Sincerely etc.,

NAME Christopher Sawtell
CELL PHONE 021 257 4451
ICQ UIN 45863470
EMAIL csawtell @ xtra . co . nz
CNOTES ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz

-->> Please refrain from using HTML or WORD attachments in e-mails to me <<--

#7 Peter Eisentraut
peter_e@gmx.net
In reply to: The Hermit Hacker (#2)
Re: A more useful way to split the distribution

The Hermit Hacker writes:

... and get rid of the *large* file, which will
save all the mirrors a good deal of space over time ...

You will certainly get a furious crowd at your door within hours if you do
that, as the follow-ups show. Saving download bandwidth is a valid issue,
but saving disk space on the order of perhaps 50 MB for sites that act as
download archives is not worth the drawbacks.

Btw., do we have any download statistics, especially as to how many people
elected to download the "chunks"?

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#8 Peter Eisentraut
peter_e@gmx.net
In reply to: Christopher Sawtell (#6)
Re: A more useful way to split the distribution

Christopher Sawtell writes:

For me and I expect many other folk on the edge of civilization it is a
total PITA to have to fiddle around and download many separate tarball
files. What I want is to be able to start a d/l going and then come back
when it's finished and know that I have _everything_ I actually need to
have a working and documented product in one shot.

Right. The only reason for splitting the distribution is to cater to the
fictitious crowd with "bandwidth problems" or those that explicitly know
that they don't need the rest. There will still be a canonical full
tarball with everything, or at least I will not put my name to something
that abolishes it. In fact, I didn't like the idea of the split tarballs
in the first place, I'm merely changing the split to something more
useful.

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#9 Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#1)
Re: A more useful way to split the distribution

I wrote:

Since people suddenly seem to be suffering from bandwidth concerns I have
devised a new distribution split to address this issue. I propose the
following four sub-tarballs:

* postgresql-XXX.base.tar.gz 3.3 MB
* postgresql-XXX.opt.tar.gz 1.7 MB
* postgresql-XXX.docs.tar.gz 1.9 MB
* postgresql-XXX.test.tar.gz 1.0 MB

Since we're going to make a change, I'd like to change the names to

postgresql-base-XXX.tar.gz

etc. to align them with existing practice (cf. RPMs, GCC download). Dots
should be used for format-identifying extensions.
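The renaming can be sketched as a small Python function. The helper name `new_style_name` and the regular expression are hypothetical. The new names for the opt, docs and test pieces are inferred from the "etc." above by analogy with the base example.

```python
import re

# Matches the dotted names used so far, e.g. "postgresql-XXX.base.tar.gz".
OLD_NAME = re.compile(
    r"^postgresql-(?P<version>.+)\.(?P<part>base|opt|docs|test)\.tar\.gz$"
)


def new_style_name(old_name):
    """Hypothetical helper: move the part name in front of the version."""
    m = OLD_NAME.match(old_name)
    if m is None:
        # Not a split tarball (e.g. the full "postgresql-XXX.tar.gz"): unchanged.
        return old_name
    return f"postgresql-{m['part']}-{m['version']}.tar.gz"


for name in ("postgresql-XXX.base.tar.gz",
             "postgresql-XXX.opt.tar.gz",
             "postgresql-XXX.tar.gz"):
    print(f"{name} -> {new_style_name(name)}")
```

With this scheme the only dots left after the version are the ones in the format-identifying .tar.gz extension.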

--
Peter Eisentraut peter_e@gmx.net http://yi.org/peter-e/

#10 The Hermit Hacker
scrappy@hub.org
In reply to: Peter Eisentraut (#9)
Re: Re: A more useful way to split the distribution

On Sun, 8 Apr 2001, Peter Eisentraut wrote:

I wrote:

Since people suddenly seem to be suffering from bandwidth concerns I have
devised a new distribution split to address this issue. I propose the
following four sub-tarballs:

* postgresql-XXX.base.tar.gz 3.3 MB
* postgresql-XXX.opt.tar.gz 1.7 MB
* postgresql-XXX.docs.tar.gz 1.9 MB
* postgresql-XXX.test.tar.gz 1.0 MB

Since we're going to make a change, I'd like to change the names to

postgresql-base-XXX.tar.gz

etc. to align them with existing practice (cf. RPMs, GCC download). Dots
should be used for format-identifying extensions.

Go for it ... more a visual change than anything ...

#11 The Hermit Hacker
scrappy@hub.org
In reply to: Peter Eisentraut (#8)
Re: A more useful way to split the distribution

this only represents since 8:30am this morning ...

/source/v7.0.3/postgresql-7.0.3.support.tar.gz => 9
/source/v7.0.3/postgresql-7.0.3.test.tar.gz => 3
/source/v7.0.3/postgresql-7.0.3.docs.tar.gz => 10
/source/v7.0.3/postgresql-7.0.3.tar.gz => 22
/source/v7.0.3/postgresql-7.0.3.base.tar.gz => 9

on a side note, we almost have as many downloads of psqlodbc in that time
period:

/odbc/psqlodbc_home.html => 15
/odbc/versions/psqlodbc-07_01_0002.zip => 4
/odbc/versions/psqlodbc-07_01_0003.zip => 4
/odbc/versions/psqlodbc-07_01_0004.zip => 18

so it isn't a "fictitious crowd" that is going with the smaller chunks ...
it's about 30% on a very small sample ...
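The message does not say how the 30% figure was derived. One reading that lands close to it is sketched below in Python: count each base download as one person taking the split route and each full-tarball download as one person taking the single file. That interpretation, and the variable names, are assumptions. The counts are copied from the log excerpt above.

```python
# Download counts copied from the log excerpt above (v7.0.3, partial day).
downloads = {
    "support": 9,
    "test": 3,
    "docs": 10,
    "full": 22,  # postgresql-7.0.3.tar.gz
    "base": 9,
}

# Assumption: one base download = one person choosing the split route,
# one full download = one person choosing the single tarball.
split_users = downloads["base"]
full_users = downloads["full"]
share = split_users / (split_users + full_users)

print(f"split route: {split_users} of {split_users + full_users} ({share:.0%})")
```

Under that assumption the split route accounts for 9 of 31 downloaders, a little under 30%. Counting every chunk download separately would give a different, larger figure.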


Marc G. Fournier ICQ#7615664 IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org secondary: scrappy@{freebsd|postgresql}.org

#12 Thomas Lockhart
lockhart@alumni.caltech.edu
In reply to: The Hermit Hacker (#11)
Re: A more useful way to split the distribution

so it isn't a "fictitious crowd" that is going with the smaller chunks ...
it's about 30% on a very small sample ...

(back in town from the weekend, to see the PostgreSQL tarball ripped to
shreds ;)

Peter, I'm with you on this. If folks want to help support PostgreSQL by
providing subset-tarballs, then great. But many of us have contributed
to the monolithic tarball, and will continue to do so. So let's make sure
that we have *at least* the big tarball available, and if someone wants
to subset it then I'm sure that would be very useful for a large number
of users, even if percentage-wise they are not the majority.

No point in polarizing it or forcing a choice: certainly the form we
have used for the last 6 years (and for the 6 years before that too,
probably) is a legitimate and useful form, and we can experiment with
subsets as much as anyone cares to.

With the big tarball, Lamar and others (such as Oliver and myself) can
continue their packaging work for 7.1 without having to cope with last
minute subset issues.

- Thomas