gcc5: initdb produces gigabytes of _fsm files

Started by Christoph Berg, almost 11 years ago, 8 messages
#1 Christoph Berg
cb@df7cb.de

Hi,

gcc5 is lurking in Debian experimental, and it's breaking initdb.
There are bug reports for 9.1 and 9.4:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778070 (9.1)
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778071 (9.4)
but I could reproduce it with 9.5devel (from last week) as well:

gcc (Debian 5-20150205-1) 5.0.0 20150205 (experimental) [trunk revision 220455]

make[2]: Leaving directory '/srv/projects/postgresql/pg/master/src/common'
../../../src/test/regress/pg_regress --inputdir=. --temp-install=./tmp_check --top-builddir=../../.. --dlpath=. --schedule=./parallel_schedule
============== removing existing temp installation ==============
============== creating temporary installation ==============
============== initializing database system ==============

pg_regress: initdb failed
Examine /srv/projects/postgresql/pg/master/src/test/regress/log/initdb.log for the reason.
Command was: "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb" -D "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data" -L "/srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/install//usr/local/pgsql/share" --noclean --nosync > "/srv/projects/postgresql/pg/master/src/test/regress/log/initdb.log" 2>&1
GNUmakefile:138: recipe for target 'check' failed

src/test/regress $ cat log/initdb.log
Running in noclean mode. Mistakes will not be cleaned up.
The files belonging to this database system will be owned by user "cbe".
This user must also own the server process.

The database cluster will be initialized with locales
COLLATE: de_DE.utf8
CTYPE: de_DE.utf8
MESSAGES: C
MONETARY: de_DE.utf8
NUMERIC: de_DE.utf8
TIME: de_DE.utf8
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "german".

Data page checksums are disabled.

creating directory /srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... sysv
creating configuration files ... ok
creating template1 database in /srv/projects/postgresql/pg/master/src/test/regress/./tmp_check/data/base/1 ... FATAL: could not extend file "base/1/2617_fsm": wrote only 4096 of 8192 bytes at block 46197
HINT: Check free disk space.
PANIC: cannot abort transaction 1, it was already committed
Aborted (core dumped)

src/test/regress $ ls -al tmp_check/data/base/1/
total 34156376
drwx------ 2 cbe cbe 4096 Feb 12 20:04 ./
drwx------ 3 cbe cbe 4096 Feb 12 19:55 ../
-rw------- 1 cbe cbe 40960 Feb 12 20:04 1247
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.1
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.2
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.3
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.4
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.5
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1247_fsm.6
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1247_fsm.7
-rw------- 1 cbe cbe 59138048 Feb 12 20:04 1247_fsm.8
-rw------- 1 cbe cbe 49152 Feb 12 20:04 1249
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.1
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.2
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.3
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.4
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.5
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.6
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1249_fsm.7
-rw------- 1 cbe cbe 59138048 Feb 12 20:04 1249_fsm.8
-rw------- 1 cbe cbe 540672 Feb 12 20:03 1255
-rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm
-rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.1
-rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.2
-rw------- 1 cbe cbe 1073741824 Feb 12 19:55 1255_fsm.3
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.4
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.5
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.6
-rw------- 1 cbe cbe 1073741824 Feb 12 20:03 1255_fsm.7
-rw------- 1 cbe cbe 59138048 Feb 12 20:03 1255_fsm.8
-rw------- 1 cbe cbe 16384 Feb 12 20:04 1259
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.1
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.2
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.3
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.4
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.5
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.6
-rw------- 1 cbe cbe 1073741824 Feb 12 20:04 1259_fsm.7
-rw------- 1 cbe cbe 59138048 Feb 12 20:04 1259_fsm.8
-rw------- 1 cbe cbe 0 Feb 12 20:04 2604
-rw------- 1 cbe cbe 0 Feb 12 20:04 2606
-rw------- 1 cbe cbe 0 Feb 12 20:04 2610
-rw------- 1 cbe cbe 0 Feb 12 20:04 2611
-rw------- 1 cbe cbe 8192 Feb 12 20:04 2617
-rw------- 1 cbe cbe 378449920 Feb 12 20:04 2617_fsm
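
As a sanity check on the listing above (a sketch added for illustration, not part of the original report): assuming the 8192-byte block size implied by the error message, and reading each 1073741824-byte file as one full segment, the size of 2617_fsm corresponds exactly to a half-written block 46197, and each of the other four _fsm forks has grown to the same number of blocks. The function and constant names below are made up for this example.

```python
# Assumed from the log: "wrote only 4096 of 8192 bytes" implies 8192-byte blocks.
BLOCK_BYTES = 8192
# Assumed from the listing: every full _fsm segment is 1073741824 bytes (1 GiB).
SEGMENT_BYTES = 1073741824


def size_after_partial_extend(block_number, bytes_written):
    """File size when the extend at block_number wrote only bytes_written bytes."""
    return block_number * BLOCK_BYTES + bytes_written


def fork_total_bytes(full_segments, last_segment_bytes):
    """Total size of a fork split into full segments plus one partial segment."""
    return full_segments * SEGMENT_BYTES + last_segment_bytes


# 2617_fsm: the failed write at block 46197 got 4096 bytes onto disk.
size_2617_fsm = size_after_partial_extend(46197, 4096)

# 1247_fsm (1249, 1255 and 1259 look the same): 8 full segments plus 59138048 bytes.
total_1247_fsm = fork_total_bytes(8, 59138048)
blocks_1247_fsm = total_1247_fsm // BLOCK_BYTES

print(size_2617_fsm)    # compare with the 2617_fsm size in the listing
print(total_1247_fsm)   # bytes used by one complete _fsm fork
print(blocks_1247_fsm)  # the same fork measured in 8192-byte blocks
```

The first value equals the 378449920 bytes shown for 2617_fsm, so the error message and the listing agree: the disk filled up in the middle of extending the fifth relation's FSM, after four earlier forks had each reached 1055795 blocks (about 8.6 GB apiece).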

Program terminated with signal SIGABRT, Aborted.
#0 0x00007f818524c107 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f818524d4e8 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x000000000074cf1a in errfinish ()
#3 0x000000000074e9aa in elog_finish ()
#4 0x00000000004bcebd in RecordTransactionAbort ()
#5 0x00000000004bcf79 in AbortTransaction ()
#6 0x00000000004c0175 in AbortOutOfAnyTransaction ()
#7 0x00000000007579d9 in ShutdownPostgres ()
#8 0x000000000065628d in shmem_exit ()
#9 0x000000000065636e in proc_exit_prepare ()
#10 0x00000000006563e8 in proc_exit ()
#11 0x000000000074cf46 in errfinish ()
#12 0x000000000066df18 in mdextend ()
#13 0x0000000000653ad4 in fsm_readbuf ()
#14 0x0000000000653bfe in fsm_set_and_search ()
#15 0x000000000065409d in RecordAndGetPageWithFreeSpace ()
#16 0x00000000004949a2 in RelationGetBufferForTuple ()
#17 0x000000000048eed7 in heap_insert ()
#18 0x00000000004d5981 in InsertOneTuple ()
#19 0x00000000004d4589 in boot_yyparse ()
#20 0x00000000004d5007 in AuxiliaryProcessMain ()
#21 0x00000000004634cb in main ()

Christoph
--
cb@df7cb.de | http://www.df7cb.de/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#1)
Re: gcc5: initdb produces gigabytes of _fsm files

Christoph Berg <cb@df7cb.de> writes:
> gcc5 is lurking in Debian experimental, and it's breaking initdb.

Yeah, I just heard the same about Red Hat as well:

https://bugzilla.redhat.com/show_bug.cgi?id=1190978

Not clear if it's an outright compiler bug or they've just found some
creative new way to make an optimization assumption we're violating.
Either way it seems clear that the find-a-page-with-free-space code is
getting into an infinite loop whereby it keeps extending the FSM till
it runs out of disk space.

There's a more detailed stack trace in the Red Hat report.

regards, tom lane

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: gcc5: initdb produces gigabytes of _fsm files

I wrote:
> Christoph Berg <cb@df7cb.de> writes:
>> gcc5 is lurking in Debian experimental, and it's breaking initdb.
>
> Yeah, I just heard the same about Red Hat as well:
> https://bugzilla.redhat.com/show_bug.cgi?id=1190978
> Not clear if it's an outright compiler bug or they've just found some
> creative new way to make an optimization assumption we're violating.

Apparently, it's the former. See

https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3

I will be unamused if the gcc boys try to make an argument that they
did some valid optimization here.

regards, tom lane

#4 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: gcc5: initdb produces gigabytes of _fsm files

On Thu, Feb 12, 2015 at 6:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Christoph Berg <cb@df7cb.de> writes:
>>> gcc5 is lurking in Debian experimental, and it's breaking initdb.
>>
>> Yeah, I just heard the same about Red Hat as well:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1190978
>> Not clear if it's an outright compiler bug or they've just found some
>> creative new way to make an optimization assumption we're violating.
>
> Apparently, it's the former. See
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3
>
> I will be unamused if the gcc boys try to make an argument that they
> did some valid optimization here.

You're new here, aren't you?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#5 Geoff Winkless
pgsqladmin@geoff.dj
In reply to: Tom Lane (#3)
Re: gcc5: initdb produces gigabytes of _fsm files

What does the ASM look like? It's a fairly quick way to tell whether the
failure comes from optimization or from memory corruption.

Apologies if I'm explaining how to extract albumen to your elderly
relative...

On 12 February 2015 at 23:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Christoph Berg <cb@df7cb.de> writes:
>>> gcc5 is lurking in Debian experimental, and it's breaking initdb.
>>
>> Yeah, I just heard the same about Red Hat as well:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1190978
>> Not clear if it's an outright compiler bug or they've just found some
>> creative new way to make an optimization assumption we're violating.
>
> Apparently, it's the former. See
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3
>
> I will be unamused if the gcc boys try to make an argument that they
> did some valid optimization here.
>
> regards, tom lane

#6 Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#3)
Re: gcc5: initdb produces gigabytes of _fsm files

On 2015-02-12 18:16:37 -0500, Tom Lane wrote:
> I wrote:
>> Christoph Berg <cb@df7cb.de> writes:
>>> gcc5 is lurking in Debian experimental, and it's breaking initdb.
>>
>> Yeah, I just heard the same about Red Hat as well:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1190978
>> Not clear if it's an outright compiler bug or they've just found some
>> creative new way to make an optimization assumption we're violating.
>
> Apparently, it's the former. See
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1190978#c3
>
> I will be unamused if the gcc boys try to make an argument that they
> did some valid optimization here.

Fixed in gcc upstream: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65053

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Christoph Berg (#1)
Re: gcc5: initdb produces gigabytes of _fsm files

Christoph Berg <cb@df7cb.de> writes:
> gcc5 is lurking in Debian experimental, and it's breaking initdb.

FYI, this is now fixed in Red Hat's rawhide version:
https://bugzilla.redhat.com/show_bug.cgi?id=1190978

Don't know what the update process is like for Debian's copy, but
maybe you could pester the appropriate people to absorb the referenced
upstream fix quickly.

regards, tom lane

#8 Christoph Berg
cb@df7cb.de
In reply to: Tom Lane (#7)
Re: gcc5: initdb produces gigabytes of _fsm files

Re: Tom Lane 2015-02-15 <21030.1424022091@sss.pgh.pa.us>
> Christoph Berg <cb@df7cb.de> writes:
>> gcc5 is lurking in Debian experimental, and it's breaking initdb.
>
> FYI, this is now fixed in Red Hat's rawhide version:
> https://bugzilla.redhat.com/show_bug.cgi?id=1190978
>
> Don't know what the update process is like for Debian's copy, but
> maybe you could pester the appropriate people to absorb the referenced
> upstream fix quickly.

Thanks for pushing this towards the gcc people. I've updated the
Debian bugs so our gcc maintainers can upload a new version as well.

Christoph
--
cb@df7cb.de | http://www.df7cb.de/
