Why so many out-of-disk-space failures on buildfarm machines?

Started by Tom Lane over 18 years ago, 12 messages
#1Tom Lane
tgl@sss.pgh.pa.us

It seems like we see a remarkable number of occurrences of $subject.
For instance, right now we have these members failing on various
branches:

echidna     No space left on device
asp         No space left on device
herring     No space left on device (icc seems particularly unable
            to cope with this, or at least I suspect that's the
            reason for some builds failing with that bizarre message)
kite        gcc quoth "Internal compiler error: Segmentation fault"
wildebeest  long-standing configuration error (no Tk installed)
wombat      long-standing configuration error (no Tk installed)

I realize that a lot of these members are running on old underpowered
machines with not so much disk, but is it possible that the buildfarm
itself is leaking disk space? Not cleaning up log files for instance?

regards, tom lane

#2Kris Jurka
books@ejurka.com
In reply to: Tom Lane (#1)
Re: Why so many out-of-disk-space failures on buildfarm machines?

On Tue, 3 Jul 2007, Tom Lane wrote:

I realize that a lot of these members are running on old underpowered
machines with not so much disk, but is it possible that the buildfarm
itself is leaking disk space? Not cleaning up log files for instance?

No, the buildfarm does not leak disk space. It is possible that members
are configured with --keepall, which keeps the entire directory tree if a
failure occurs. That can fill up a lot of space quickly when you get a
failure.

Kris Jurka
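
To illustrate the point about kept build trees filling a disk, here is a minimal Python sketch that reports free space and per-directory usage under a build root. It is not part of the buildfarm client; the names `tree_size`, `report`, and `build_root` are hypothetical, and the directory layout is an assumption.

```python
import os
import shutil


def tree_size(path):
    """Return the total size in bytes of regular files under path."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks so that linked files are not counted.
            if not os.path.islink(full):
                try:
                    total += os.path.getsize(full)
                except OSError:
                    pass  # file vanished while scanning
    return total


def report(build_root):
    """Return (free bytes on the filesystem, {subdirectory: bytes used}).

    build_root is a hypothetical directory holding one subdirectory
    per branch or kept build tree.
    """
    usage = shutil.disk_usage(build_root)
    sizes = {}
    for entry in os.scandir(build_root):
        if entry.is_dir(follow_symlinks=False):
            sizes[entry.name] = tree_size(entry.path)
    return usage.free, sizes
```

Running `report()` on the build root after a failed run would show whether a kept tree is what is consuming the space. Note that it only looks at the filesystem holding `build_root`, so a full /var on another partition would need a separate check.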

#3Darcy Buskermolen
darcy@dbitech.ca
In reply to: Tom Lane (#1)
Re: Why so many out-of-disk-space failures on buildfarm machines?

On Tuesday 03 July 2007 19:35, Tom Lane wrote:

It seems like we see a remarkable number of occurrences of $subject.
For instance, right now we have these members failing on various
branches:

echidna     No space left on device
asp         No space left on device
herring     No space left on device (icc seems particularly unable
            to cope with this, or at least I suspect that's the
            reason for some builds failing with that bizarre message)
kite        gcc quoth "Internal compiler error: Segmentation fault"
wildebeest  long-standing configuration error (no Tk installed)
wombat      long-standing configuration error (no Tk installed)

I realize that a lot of these members are running on old underpowered
machines with not so much disk, but is it possible that the buildfarm
itself is leaking disk space? Not cleaning up log files for instance?

In the case of echidna and herring, the "out of space" is actually on a
different partition: it occurs on /var (which I admit is undersized since
losing a disk) from time to time. Some rather major storm activity on
Friday has taken this system offline for a while; I hope to have it back in
the loop by the end of the week.

#4Mark Wong
markwkm@gmail.com
In reply to: Tom Lane (#1)
Re: Why so many out-of-disk-space failures on buildfarm machines?

On 7/3/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

wombat long-standing configuration error (no Tk installed)

My apologies for not responding earlier. I see 7.3 contrib problems
for wombat but I don't see a config error for Tk with HEAD or any of
the other 8.x releases. I have the --without-tk flag for the 7.4
release.

Regards,
Mark

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Wong (#4)
Re: Why so many out-of-disk-space failures on buildfarm machines?

"Mark Wong" <markwkm@gmail.com> writes:

On 7/3/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

wombat long-standing configuration error (no Tk installed)

My apologies for not responding earlier. I see 7.3 contrib problems
for wombat but I don't see a config error for Tk with HEAD or any of
the other 8.x releases. I have the --without-tk flag for the 7.4
release.

Sorry about that, I must've confused wombat with some other buildfarm
critter while making that list.

On inspection it looks like a flex compatibility issue. What version of
flex is on that machine?

regards, tom lane

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Mark Wong (#4)
Re: Why so many out-of-disk-space failures on buildfarm machines?

Mark Wong wrote:

On 7/3/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

wombat long-standing configuration error (no Tk installed)

My apologies for not responding earlier. I see 7.3 contrib problems
for wombat but I don't see a config error for Tk with HEAD or any of
the other 8.x releases. I have the --without-tk flag for the 7.4
release.

Mark,

I don't think we're ever going to fix things for the 7.3 error you're
getting - please take it out of your rotation. 7.3 isn't quite as dead
as Joshua suggested earlier, but it's certainly on life support.

cheers

andrew

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: Why so many out-of-disk-space failures on buildfarm machines?

Andrew Dunstan <andrew@dunslane.net> writes:

I don't think we're ever going to fix things for the 7.3 error you're
getting - please take it out of your rotation. 7.3 isn't quite as dead
as Joshua suggested earlier, but it's certainly on life support.

I checked the CVS logs and it appears that we fixed several contrib
modules, not only cube, to work with flex 2.5.31 during the 7.4 devel
cycle. I don't think anyone cares to back-port that much work. Our
position should be "if you want to build 7.3 you need flex 2.5.4 to do
it".

If Mark still wants to test 7.3, he could install flex 2.5.4 someplace
and make sure that's first in the PATH while building 7.3.

regards, tom lane
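
The "first in the PATH" suggestion above can be sketched in Python as follows. This is only an illustration: the helper names `env_with_tool_first` and `resolve` are hypothetical, and so is any install directory passed to them (for example a private prefix such as /opt/flex-2.5.4/bin).

```python
import os
import shutil


def env_with_tool_first(tool_dir, base_env=None):
    """Return a copy of the environment with tool_dir first in PATH."""
    env = dict(os.environ if base_env is None else base_env)
    old_path = env.get("PATH", "")
    # Prepend so that tool_dir wins over any system-wide copy.
    env["PATH"] = tool_dir + os.pathsep + old_path if old_path else tool_dir
    return env


def resolve(tool, env):
    """Return the executable named tool that env's PATH would select."""
    return shutil.which(tool, path=env["PATH"])
```

The returned mapping can be handed to `subprocess.run(..., env=env)` when launching the 7.3 build, so that only that build sees the older flex while other branches keep using the system copy. Calling `resolve("flex", env)` beforehand is a cheap way to confirm which binary will actually be picked up.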

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#6)
Re: Why so many out-of-disk-space failures on buildfarm machines?

BTW, while I'm thinking of it --- it'd be real nice if the buildfarm
"configuration" printout included the flex and bison version numbers.
Maybe gcc too (I know not every buildfarm member is compiling with gcc,
but it comes in enough different versions that this is likely to be
useful info).

regards, tom lane

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#8)
Re: Why so many out-of-disk-space failures on buildfarm machines?

Tom Lane wrote:

BTW, while I'm thinking of it --- it'd be real nice if the buildfarm
"configuration" printout included the flex and bison version numbers.
Maybe gcc too (I know not every buildfarm member is compiling with gcc,
but it comes in enough different versions that this is likely to be
useful info).

Interestingly, none of our tools actually outputs the bison/flex
versions - perhaps configure should be doing that. We do log the gcc
version (look in the config.log if you want to make sure).

cheers

andrew

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#9)
Re: Why so many out-of-disk-space failures on buildfarm machines?

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

BTW, while I'm thinking of it --- it'd be real nice if the buildfarm
"configuration" printout included the flex and bison version numbers.

Interestingly, none of our tools actually outputs the bison/flex
versions - perhaps configure should be doing that.

Hmm, that's a good solution, especially since it'll start working right
away instead of waiting for buildfarm owners to update their scripts ;-)

I think I can get it to do
    checking for flex... /usr/local/bin/flex 2.5.4
... does that seem reasonable?

We do log the gcc
version (look in the config.log if you want to make sure).

Oh, that seems to be something in the base autoconf macros rather than
anything we put in.  I don't especially like hiding it in config.log
--- who reads that?

regards, tom lane
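
For readers who want to see what such a line involves, here is a rough Python sketch that builds a configure-style "checking for ..." line with the tool's path and version. It is not the autoconf change discussed in this message; `parse_version` and `describe_tool` are hypothetical helpers, and the sketch assumes each tool prints a dotted version number when run with `--version`.

```python
import re
import shutil
import subprocess

# First run of digits separated by dots, e.g. "2.5.4" or "2.3".
_VERSION_RE = re.compile(r"\d+(?:\.\d+)+")


def parse_version(text):
    """Return the first dotted version number found in text, or None."""
    match = _VERSION_RE.search(text)
    return match.group(0) if match else None


def describe_tool(tool):
    """Build a configure-style 'checking for ...' line for tool."""
    path = shutil.which(tool)
    if path is None:
        return f"checking for {tool}... no"
    try:
        # Assumption: the tool accepts --version and prints to stdout.
        out = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=10
        ).stdout
    except (OSError, subprocess.SubprocessError):
        out = ""
    version = parse_version(out) or "unknown version"
    return f"checking for {tool}... {path} {version}"
```

Calling `describe_tool("flex")` and `describe_tool("bison")` would give one line per tool in roughly the shape proposed above. Taking the first dotted number is a deliberately loose rule, so a banner that mentions some other number before the version would be misread.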

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#10)
Re: Why so many out-of-disk-space failures on buildfarm machines?

Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Tom Lane wrote:

BTW, while I'm thinking of it --- it'd be real nice if the buildfarm
"configuration" printout included the flex and bison version numbers.

Interestingly, none of our tools actually outputs the bison/flex
versions - perhaps configure should be doing that.

Hmm, that's a good solution, especially since it'll start working right
away instead of waiting for buildfarm owners to update their scripts ;-)

I think I can get it to do
    checking for flex... /usr/local/bin/flex 2.5.4
... does that seem reasonable?

looks good to me.

We do log the gcc
version (look in the config.log if you want to make sure).

Oh, that seems to be something in the base autoconf macros rather than
anything we put in.  I don't especially like hiding it in config.log
--- who reads that?

Well, we should be able to add something along the lines of what you did
above, though, can't we? We already have a section in configure.in
specifically for gcc (see around line 272).

cheers

andrew

#12Mark Wong
markwkm@gmail.com
In reply to: Tom Lane (#7)
Re: Why so many out-of-disk-space failures on buildfarm machines?

On 7/18/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I don't think we're ever going to fix things for the 7.3 error you're
getting - please take it out of your rotation. 7.3 isn't quite as dead
as Joshua suggested earlier, but it's certainly on life support.

I checked the CVS logs and it appears that we fixed several contrib
modules, not only cube, to work with flex 2.5.31 during the 7.4 devel
cycle. I don't think anyone cares to back-port that much work. Our
position should be "if you want to build 7.3 you need flex 2.5.4 to do
it".

If Mark still wants to test 7.3, he could install flex 2.5.4 someplace
and make sure that's first in the PATH while building 7.3.

I have flex 2.5.33 on the system, but I have decided to take the easy
way out and have removed 7.3 from my rotation.

Regards,
Mark