Debug strategy for musl Postgres?
I built Postgres 9.3.4 from source on top of the musl C library,
http://www.musl-libc.org/
I also built zlib, bzip2, ncurses, openssl, readline and Python using musl
as a foundation for Postgres.
I'm using musl to increase the portability of the Postgres binary. I build
on Ubuntu 13.10 but it will run on older Linux boxes.
So far I get better results with the musl Postgres built on modern Ubuntu
and running on an old kernel than building Postgres directly on the old
Linux using the standard C library. But the musl Postgres is still not working
fully. I'm not getting responses from the server.
Here's the tail end of the "strace pg_isready" output for musl Postgres built and
running on Ubuntu 13.10:
clock_gettime(CLOCK_REALTIME, {1397359337, 426941692}) = 0
poll([{fd=4, events=POLLOUT|POLLERR}], 1, 3000) = 1 ([{fd=4, revents=POLLOUT}])
sendto(4, "\0\0\0=\0\3\0\0user\0mudd\0database\0mudd\0"..., 61, MSG_NOSIGNAL, NULL, 0) = 61
clock_gettime(CLOCK_REALTIME, {1397359337, 427070343}) = 0
poll([{fd=4, events=POLLIN|POLLERR}], 1, 3000) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "R\0\0\0\10\0\0\0\0E\0\0\0RSFATAL\0C3D000\0Mdat"..., 16384, 0, NULL, NULL) = 92
close(4) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
writev(1, [{"/tmp:5432 - accepting connection"..., 33}, {"\n", 1}], 2) = 34
exit_group(0) = ?
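The 92 bytes returned by recvfrom() in this trace can be read by hand: an 'R' message of length 8 carrying code 0 (AuthenticationOk), followed by an 'E' (ErrorResponse) whose visible fields are severity FATAL and SQLSTATE 3D000 (invalid_catalog_name). So the server did answer, which is all pg_isready needs in order to report that connections are accepted. The following is a minimal sketch of that decoding in Python. The sample bytes are illustrative only: they reuse the leading fields visible in the trace, but the message text is a placeholder because the trace truncates it, and the helper names are made up for this example.

```python
import struct


def split_messages(buf):
    """Split raw backend bytes into (type, payload) tuples."""
    messages = []
    pos = 0
    while pos + 5 <= len(buf):
        msg_type = buf[pos:pos + 1].decode("ascii")
        # The 4-byte length counts itself but not the type byte.
        (length,) = struct.unpack("!I", buf[pos + 1:pos + 5])
        if length < 4:
            break  # malformed input, stop rather than loop forever
        messages.append((msg_type, buf[pos + 5:pos + 1 + length]))
        pos += 1 + length
    return messages


def error_fields(payload):
    """Decode an ErrorResponse payload into {field_code: text}."""
    fields = {}
    for part in payload.split(b"\0"):
        if part:
            fields[chr(part[0])] = part[1:].decode("utf-8", "replace")
    return fields


def build(msg_type, payload):
    """Helper for the illustrative sample below (hypothetical, not from the thread)."""
    return msg_type + struct.pack("!I", len(payload) + 4) + payload


# Illustrative sample: same leading fields as the trace, placeholder message text.
sample = (build(b"R", struct.pack("!I", 0))
          + build(b"E", b"SFATAL\0C3D000\0Mexample message\0\0"))

for msg_type, payload in split_messages(sample):
    if msg_type == "R":
        print("authentication message, code", struct.unpack("!I", payload[:4])[0])
    elif msg_type == "E":
        print("error response", error_fields(payload))
```

On the old kernel the client never gets this far, because poll() times out before any byte arrives.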
Here's the tail end of the "strace pg_isready" output for musl Postgres built on
Ubuntu 13.10 but running on old Linux:
clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, 3000) = 1
sendto(3, "\0\0\0?\0\3\0\0user\0jmudd\0database\0jmud"..., 63, 0x4000, NULL, 0) = 63
clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0
close(3) = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
writev(1, [{"/tmp:5432 - no response", 23}, {"\n", 1}], 2) = 24
exit_group(2) = ?
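With a longer capture, one quick way to spot calls the old kernel does not implement is to tally every syscall that returned -1 together with its errno name. A rough sketch in Python, assuming the strace output has been saved to text (for instance with strace's -o option); the function name is made up for this example and the sample lines are copied from the trace above.

```python
import re
from collections import Counter

# Matches the start of an strace line up to "= -1 ERRNAME".
FAILED = re.compile(r"^(\w+)\(.*=\s*-1\s+(E[A-Z0-9]+)")


def failed_syscalls(strace_text):
    """Count (syscall, errno_name) pairs for calls that returned -1."""
    counts = Counter()
    for line in strace_text.splitlines():
        match = FAILED.match(line.strip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


sample = """\
clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0
clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
"""

for (name, err), count in failed_syscalls(sample).items():
    print(name, err, count)
```

A call that keeps failing with ENOSYS points at the fallback path the C library takes next, which is where to look first.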
For my next step I'll try building musl Postgres with the --enable-cassert
option. What else can I do to debug this?
John
On 13-04-2014 00:40, John Mudd wrote:
> I built Postgres 9.3.4 from source on top of the musl C library,
> http://www.musl-libc.org/
> I also built zlib, bzip2, ncurses, openssl, readline and Python using musl
> as a foundation for Postgres.
This is not a bug. This kind of discussion belongs on -hackers.
While reading this email, I gave musl a try. I'm using Debian jessie,
which contains musl 1.0.0. I compiled the source (git master) using
CC="musl-gcc" and disabled zlib and readline. It passed all regression
tests. I also tried pgbench, which ran like a charm. (After installing
the binaries I had to set the library path for musl in
/etc/ld-musl-x86_64.d.)
> I'm using musl to increase the portability of the Postgres binary. I build
> on Ubuntu 13.10 but it will run on older Linux boxes.
Could you give details about your architecture?
> For my next step I'll try building musl Postgres with the --enable-cassert
> option. What else can I do to debug this?
Is postgres running and listening on 5432? Did you try other binaries
(e.g. psql) or even postgres in single-user mode?
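Since pg_isready reports "/tmp:5432", the server is being reached through a Unix-domain socket, which Postgres names .s.PGSQL.<port> inside the socket directory. A minimal sketch of a listening check in Python follows; the function name and return strings are made up for this example. Note that a successful connect only shows that some process holds the socket open for listening. The kernel can complete the connection before the server calls accept(), so this does not prove the server is processing requests.

```python
import os
import socket


def probe_unix_socket(directory="/tmp", port=5432, timeout=3.0):
    """Report whether a Postgres-style Unix socket exists and accepts a connect."""
    path = os.path.join(directory, ".s.PGSQL.%d" % port)
    if not os.path.exists(path):
        return "no socket file at " + path
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(path)
    except OSError as exc:
        return "connect failed: %s" % exc
    finally:
        sock.close()
    return "connected to " + path


print(probe_unix_socket())
```

If the connect succeeds but no reply ever arrives, the next place to look is the server process itself, for example by attaching strace to it.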
--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
I agree, not a bug. I was just following the instructions to post as bug
first and then move to hackers if directed. I'll repost on hackers and give
the rest of my reply there.
Reposting from pgsql-bugs since this is not a bug.
On Sun, Apr 13, 2014 at 12:04 PM, Euler Taveira <euler@timbira.com.br> wrote:
> On 13-04-2014 00:40, John Mudd wrote:
>> I built Postgres 9.3.4 from source on top of the musl C library,
>> http://www.musl-libc.org/
>> I also built zlib, bzip2, ncurses, openssl, readline and Python using musl
>> as a foundation for Postgres.
> This is not a bug. This kind of discussion belongs on -hackers.
> While reading this email, I gave musl a try. I'm using Debian jessie,
> which contains musl 1.0.0. I compiled the source (git master) using
> CC="musl-gcc" and disabled zlib and readline. It passed all regression
> tests. I also tried pgbench, which ran like a charm. (After installing
> the binaries I had to set the library path for musl in
> /etc/ld-musl-x86_64.d.)
>> I'm using musl to increase the portability of the Postgres binary. I build
>> on Ubuntu 13.10 but it will run on older Linux boxes.
> Could you give details about your architecture?
Built on 3.8.0-35-generic #50-Ubuntu SMP Tue Dec 3 01:25:33 UTC 2013 i686 i686 i686 GNU/Linux
Runs fine there.
Moved the postgres install directory to 2.4.21-4.EL #1 Fri Oct 3 18:13:58 EDT 2003 i686 i686 i386 GNU/Linux
Not working fully there.
Note: It says 2.4 kernel but I've been told that's misleading. The kernel
has upgrades that make it effectively 2.6.
>> For my next step I'll try building musl Postgres with the --enable-cassert
>> option. What else can I do to debug this?
> Is postgres running and listening on 5432? Did you try other binaries
> (e.g. psql) or even postgres in single-user mode?
I rebuilt with --enable-cassert and reran; no difference on the 2.4 machine.
It's listening even on the 2.4 machine. I ran strace on the main postgres process
and got the following while running pg_isready.
Process 23811 attached - interrupt to quit
Process 23811 detached
But pg_isready just reports "/tmp:5432 - no response" after a few seconds.
On Sun, Apr 13, 2014 at 4:19 PM, John Mudd <johnbmudd@gmail.com> wrote:
> It's listening even on 2.4 machine. I ran strace on main postgres process
> and got the following while running pg_isready.
> Process 23811 attached - interrupt to quit
> Process 23811 detached
Correction, the main postgres process does not indicate any awareness that
pg_isready is trying to connect. The msgs I listed above are just from
strace attaching.
The same happens if I try psql. Psql just waits indefinitely.
Hi,
On 2014-04-13 16:08:00 -0400, John Mudd wrote:
> I built Postgres 9.3.4 from source on top of the musl C library,
> http://www.musl-libc.org/
> I also built zlib, bzip2, ncurses, openssl, readline and Python using musl
> as a foundation for Postgres.
> I'm using musl to increase the portability of the Postgres binary. I build
> on Ubuntu 13.10 but it will run on older Linux boxes.
> So far I get better results with the musl Postgres built on modern Ubuntu
> and running on an old kernel than building Postgres directly on the old
> Linux using the standard C library. But the musl Postgres is still not working
> fully. I'm not getting responses from the server.
I tend to think that this is more a matter for the musl devs than
postgres. Postgres works on a fair number of libcs and musl is pretty
new and rough around the edges.
> clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
This looks suspicious.
> gettimeofday(NULL, {300, 0}) = 0
> poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, 3000) = 1
> sendto(3, "\0\0\0?\0\3\0\0user\0jmudd\0database\0jmud"..., 63, 0x4000, NULL, 0) = 63
> clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
> gettimeofday(NULL, {300, 0}) = 0
> poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0
Here a poll didn't return anything. You'll likely have to look at
the server side.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sun, Apr 13, 2014 at 4:28 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
> On 2014-04-13 16:08:00 -0400, John Mudd wrote:
>> I built Postgres 9.3.4 from source on top of the musl C library,
>> http://www.musl-libc.org/
>> I also built zlib, bzip2, ncurses, openssl, readline and Python using musl
>> as a foundation for Postgres.
>> I'm using musl to increase the portability of the Postgres binary. I build
>> on Ubuntu 13.10 but it will run on older Linux boxes.
>> So far I get better results with the musl Postgres built on modern Ubuntu
>> and running on an old kernel than building Postgres directly on the old
>> Linux using the standard C library. But the musl Postgres is still not working
>> fully. I'm not getting responses from the server.
> I tend to think that this is more a matter for the musl devs than
> postgres. Postgres works on a fair number of libcs and musl is pretty
> new and rough around the edges.
Okay. I just wanted to check here too.
>> clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
> This looks suspicious.
>> gettimeofday(NULL, {300, 0}) = 0
>> poll([{fd=3, events=POLLOUT|POLLERR, revents=POLLOUT}], 1, 3000) = 1
>> sendto(3, "\0\0\0?\0\3\0\0user\0jmudd\0database\0jmud"..., 63, 0x4000, NULL, 0) = 63
>> clock_gettime(0, 0xbfffa5a8) = -1 ENOSYS (Function not implemented)
>> gettimeofday(NULL, {300, 0}) = 0
>> poll([{fd=3, events=POLLIN|POLLERR}], 1, 3000) = 0
> Here a poll didn't return anything. You'll likely have to look at
> the server side.
Yes, the server. It's in a tight loop. This is all it's doing. Thanks, I'll
look into this.
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
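To confirm from a longer capture that a process is cycling through the same calls, rather than eyeballing it, one can look for the shortest repeating unit in the sequence of syscall names. A small sketch in Python, with made-up function names; it only compares names, not arguments, so it is a rough indicator rather than proof of a spin.

```python
import re


def syscall_names(strace_text):
    """Extract the syscall name from each strace line that starts with one."""
    names = []
    for line in strace_text.splitlines():
        match = re.match(r"(\w+)\(", line.strip())
        if match:
            names.append(match.group(1))
    return names


def repeating_cycle(names):
    """Return the shortest unit that repeats at least twice to form the list, else None."""
    total = len(names)
    for period in range(1, total // 2 + 1):
        if all(names[i] == names[i % period] for i in range(total)):
            return names[:period]
    return None


sample = """\
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
clock_gettime(0, 0xbfffded8) = -1 ENOSYS (Function not implemented)
gettimeofday(NULL, {300, 0}) = 0
"""

print(repeating_cycle(syscall_names(sample)))
```

A cycle made up only of time queries, with no poll(), select() or accept() in between, fits the picture of a server that never gets around to the incoming connection.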
On 04/13/2014 10:19 PM, John Mudd wrote:
> Built on 3.8.0-35-generic #50-Ubuntu SMP Tue Dec 3 01:25:33 UTC 2013
> i686 i686 i686 GNU/Linux
> Runs fine there.
> Moved postgres install directory to 2.4.21-4.EL #1 Fri Oct 3 18:13:58
> EDT 2003 i686 i686 i386 GNU/Linux
> Not working fully there.
> Note: It says 2.4 kernel but I've been told that's misleading. The
> kernel has upgrades that make it effectively 2.6.
This looks like a RHEL3 version number, and while that kernel was kind
of a creepy thing with a lot of backported patches (also from the 2.6 era),
it is definitely not a 2.6 kernel (also note that 2.6.0 was released in
December of 2003 while RHEL 3 was released in October of that year). Judging
from the version number, this also seems to be based on the very first
RHEL3 kernel, missing all the follow-up bugfixes made during the RHEL3 lifetime.
So I would not be at all surprised if a modern and young C library
running on a >10 year old kernel that never looked like the upstream
kernel misbehaved with a complex userspace app like postgresql.
Stefan
On Mon, Apr 14, 2014 at 2:06 PM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> On 04/13/2014 10:19 PM, John Mudd wrote:
>> Note: It says 2.4 kernel but I've been told that's misleading. The
>> kernel has upgrades that make it effectively 2.6.
> This looks like a RHEL3 version number, and while that kernel was kind
> of a creepy thing with a lot of backported patches (also from the 2.6 era),
> it is definitely not a 2.6 kernel (also note that 2.6.0 was released in
> December of 2003 while RHEL 3 was released in October of that year). Judging
> from the version number, this also seems to be based on the very first
> RHEL3 kernel, missing all the follow-up bugfixes made during the RHEL3 lifetime.
> So I would not be at all surprised if a modern and young C library
> running on a >10 year old kernel that never looked like the upstream
> kernel misbehaved with a complex userspace app like postgresql.
Update:
I contacted the musl developers, received a one-line patch to the
gettimeofday() fallback code, rebuilt the musl libc, copied the lib to my
old Linux box, and Postgres is running well now.
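The patch itself is not shown in this thread. What the traces do show is clock_gettime() failing with ENOSYS and the fallback appearing as gettimeofday(NULL, {300, 0}), where strace prints the first argument (the timeval buffer) as NULL and the second (the timezone) as the filled-in structure. The general pattern a libc needs here is "try the newer call, and on ENOSYS fall back to an older one that actually returns the time". The sketch below illustrates only that pattern in Python. It is not the musl code, the function names are made up, and `time.time()` stands in for the older call purely for illustration.

```python
import errno
import time


def realtime_seconds(clock_gettime=time.clock_gettime):
    """Return wall-clock seconds, falling back when clock_gettime is unavailable."""
    try:
        return clock_gettime(time.CLOCK_REALTIME)
    except OSError as exc:
        if exc.errno != errno.ENOSYS:
            raise  # only "not implemented" triggers the fallback
        # Stand-in for the older call; the key point is that it must return a time.
        return time.time()


def missing_clock_gettime(clock_id):
    """Simulates an old kernel where the syscall does not exist."""
    raise OSError(errno.ENOSYS, "Function not implemented")


print(realtime_seconds(missing_clock_gettime))
```

If the fallback silently returns no time at all, callers that wait for the clock to advance can spin forever, which would be consistent with the tight loop seen on the server.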
=======================
All 136 tests passed.
=======================
It's interesting that when I build Postgres on this same old Linux, it
fails to run.
============== removing existing temp installation ==============
============== creating temporary installation ==============
============== initializing database system ==============
============== starting postmaster ==============
pg_regress: postmaster did not respond within 60 seconds
Building with musl on a modern Linux works on an old Linux. But building
Postgres on the old Linux with the native libc gives me a broken Postgres.
That's why I'm interested in musl libc.