BUG #15964: vacuumdb.c:187:10: error: use of undeclared identifier 'FD_SETSIZE'
The following bug has been logged on the website:
Bug reference: 15964
Logged by: sean
Email address: jungleboogie0@gmail.com
PostgreSQL version: 12beta3
Operating system: kern.version=OpenBSD 6.6-beta (GENERIC.MP) #221: F
Description:
Hi All,
I'm trying to compile Postgresql from master on my OpenBSD machine, but I'm
getting a build failure here:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
if (concurrentCons > FD_SETSIZE - 1)
^
vacuumdb.c:187:10: error: use of undeclared identifier 'FD_SETSIZE'
FD_SETSIZE
- 1);
^
2 errors generated.
gmake[3]: *** [<builtin>: vacuumdb.o] Error 1
gmake[3]: Leaving directory '/home/sean/bin/postgresql/src/bin/scripts'
gmake[2]: *** [Makefile:41: all-scripts-recurse] Error 2
gmake[2]: Leaving directory '/home/sean/bin/postgresql/src/bin'
gmake[1]: *** [Makefile:42: all-bin-recurse] Error 2
gmake[1]: Leaving directory '/home/sean/bin/postgresql/src'
gmake: *** [GNUmakefile:11: all-src-recurse] Error 2
sean@beginning:postgresql$ sysctk kern.version
OS information:
$ sysctl kern.version
kern.version=OpenBSD 6.6-beta (GENERIC.MP) #221: Fri Aug 16 16:00:01 MDT
2019
deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
gmake information:
$ gmake -v
GNU Make 4.2.1
Built for x86_64-unknown-openbsd6.6
Copyright (C) 1988-2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
clang information:
$ cc -v
OpenBSD clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
Target: amd64-unknown-openbsd6.6
Thread model: posix
InstalledDir: /usr/bin
Posgresql commit: d78d452bc5ac46a970e4fca2b31f3d533815c39a
config options:
./configure --enable-cassert --enable-debug --with-perl \
--with-python --with-tcl --with-tclconfig=/usr/local/lib/tcl/tcl8.6/
\
--with-includes=/usr/local/include
What am I doing wrong? What more information would you like?
https://www.postgresql.org/docs/current/bug-reporting.html
5.1 states:
PostgreSQL fails to compile, build, or install according to the instructions
on supported platforms.
Thanks!
P.S. The submit bug page does not list master/current/trunk as a version,
and I needed to make a selection, which doesn't match the commit I
referenced above. Fear not, I am using the git repo here:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=summary and cloned it
locally.
PG Bug reporting form <noreply@postgresql.org> writes:
I'm trying to compile Postgresql from master on my OpenBSD machine, but I'm
getting a build failure here:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
if (concurrentCons > FD_SETSIZE - 1)
^
Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea. But I wonder why
the OpenBSD machines in the buildfarm aren't complaining.
regards, tom lane
Hi,
On 2019-08-17 19:06:33 +0000, PG Bug reporting form wrote:
I'm trying to compile Postgresql from master on my OpenBSD machine, but I'm
getting a build failure here:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
if (concurrentCons > FD_SETSIZE - 1)
^
vacuumdb.c:187:10: error: use of undeclared identifier 'FD_SETSIZE'
FD_SETSIZE
- 1);
^
2 errors generated.
Yea, that file is clearly missing an include for #include
<sys/select.h>. I don't immediately see how that file is included on
other platforms, but it's obviously not enough for your version of
openbsd.
I assume adding
#ifdef HAVE_SYS_SELECT_H
#include <sys/select.h>
#endif
after
#include "postgres_fe.h"
in vacuumdb.c fixes the problem?
Michael, it looks like this is an oversight in
commit 5f3840370b63fdf17f704a285623ccc233fa8d4f
Author: Michael Paquier <michael@paquier.xyz>
Date: 2019-07-19 09:31:58 +0900
Refactor parallelization processing code in src/bin/scripts/
Greetings,
Andres Freund
Hi,
On 2019-08-17 16:31:01 -0400, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
I'm trying to compile Postgresql from master on my OpenBSD machine, but I'm
getting a build failure here:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
if (concurrentCons > FD_SETSIZE - 1)
^Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea.
Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?
But I wonder why the OpenBSD machines in the buildfarm aren't complaining.
Or even why it works on other platforms. On linux/glibc it looks like
sys/select.h is included by sys/types.h under certain conditions:
#ifdef __USE_MISC
/* In BSD <sys/types.h> is expected to define BYTE_ORDER. */
# include <endian.h>
/* It also defines `fd_set' and the FD_* macros for `select'. */
# include <sys/select.h>
#endif /* Use misc. */
which in turn is included by stddef.h under certain conditions:
#if defined __USE_MISC || defined __USE_XOPEN_EXTENDED
# include <sys/types.h> /* we need int32_t... */
stddef.h is included by c.h, so will obviously be included in any of our
.c files.
_USE_MISC and _XOPEN_SOURCE_EXTENDED are defined by default, unless
they're explicitly specified (which we don't).
I assume there's some compiler specific going on. Our animals use an old
gcc version, whereas Sean's uses a modern clang. Not hard to imagine
that the compiler specific bits look different enough to cause such a discrepancy.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2019-08-17 16:31:01 -0400, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea.
Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?
Yeah, that would likely be cleaner than just responding to this directly.
But I wonder why the OpenBSD machines in the buildfarm aren't complaining.
Or even why it works on other platforms.
Indeed. I've confirmed the bug report on a local OpenBSD 6.4 build
(clang 6.0.0), and with "make -k" I can see that reindexdb.c fails
likewise. But this is unsurprising given that POSIX says that
FD_SETSIZE is declared by sys/select.h. And I'm not that astonished
by it not failing on Linux, either; the glibc headers are well known
for #including much more than POSIX says they must. But it's
surprising and worrisome that none of our other buildfarm platforms
complained. Seems like somebody should start running an animal with
a more modern OpenBSD, at least.
regards, tom lane
Hi,
Heh, just discovered
/messages/by-id/20160921171819.1357.29774@wrigleys.postgresql.org
from the same reporter, where we went through this before :/
On 2019-08-17 17:59:05 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2019-08-17 16:31:01 -0400, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea.Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?Yeah, that would likely be cleaner than just responding to this directly.
I'll go and do that.
But I wonder why the OpenBSD machines in the buildfarm aren't complaining.
Or even why it works on other platforms.
Indeed. I've confirmed the bug report on a local OpenBSD 6.4 build
(clang 6.0.0), and with "make -k" I can see that reindexdb.c fails
likewise. But this is unsurprising given that POSIX says that
FD_SETSIZE is declared by sys/select.h.
Right.
And I'm not that astonished by it not failing on Linux, either; the
glibc headers are well known for #including much more than POSIX says
they must. But it's surprising and worrisome that none of our other
buildfarm platforms complained. Seems like somebody should start
running an animal with a more modern OpenBSD, at least.
I don't see an easy option for making glibc less aggressive on that
front, unfortunately. And I don't want to start running a vm with
openbsd or such.
I wonder if it'd be worth setting up a buildfarm animal on linux using
musl as the libc, based on a quick look it includes less. Doesn't appear
to find this issue however [1]The relevant commit's explanation isn't very helpful: commit 2555fe1b6da21119f87d407ef3838648d5fd601d Author: Rich Felker <dalias@aerifal.cx> Date: 2011-04-10 22:47:43 -0400, so it's perhaps not worth it. It fails
with src/bin/pg_upgrade/file.c including linux/fs.h without a proper
configure check:
#ifdef __linux__
#include <sys/ioctl.h>
#include <linux/fs.h>
#endif
Probably worth fixing, even if it can also fixed by just symlinking
/usr/include/linux into /usr/include/x86_64-linux-musl/ (or whatever is
appropriate for the current platform).
[1]: The relevant commit's explanation isn't very helpful: commit 2555fe1b6da21119f87d407ef3838648d5fd601d Author: Rich Felker <dalias@aerifal.cx> Date: 2011-04-10 22:47:43 -0400
commit 2555fe1b6da21119f87d407ef3838648d5fd601d
Author: Rich Felker <dalias@aerifal.cx>
Date: 2011-04-10 22:47:43 -0400
add some ugly legacy type names in sys/types.h (u_char etc.)
diff --git a/include/sys/types.h b/include/sys/types.h
index 216574ad..5c6b2090 100644
--- a/include/sys/types.h
+++ b/include/sys/types.h
@@ -59,6 +59,14 @@ extern "C" {
#ifdef _GNU_SOURCE
typedef unsigned long caddr_t;
+typedef unsigned char u_char;
+typedef unsigned short u_short, ushort;
+typedef unsigned u_int, uint;
+typedef unsigned long u_long, ulong;
+typedef long long quad_t;
+typedef unsigned long long u_quad_t;
+#include <endian.h>
+#include <sys/select.h>
#include <sys/sysmacros.h>
#endif
Greetings,
Andres Freund
On Sat Aug 17, 2019 at 3:41 PM Andres Freund wrote:
Hi,
Heh, just discovered
/messages/by-id/20160921171819.1357.29774@wrigleys.postgresql.org
from the same reporter, where we went through this before :/
Oh, wow! Sorry I didn't remember that. Guess I didn't do a good enough job
searching through the archives.
On 2019-08-17 17:59:05 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2019-08-17 16:31:01 -0400, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea.Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?Yeah, that would likely be cleaner than just responding to this directly.
I'll go and do that.
But I wonder why the OpenBSD machines in the buildfarm aren't complaining.
Or even why it works on other platforms.
Indeed. I've confirmed the bug report on a local OpenBSD 6.4 build
(clang 6.0.0), and with "make -k" I can see that reindexdb.c fails
likewise. But this is unsurprising given that POSIX says that
FD_SETSIZE is declared by sys/select.h.Right.
I noticed all the machines in your buildfarm are running OpenBSD 5.9 from March
2016 and I believe before clang was the default compiler. I'll see what I can
find on local craigslist for inexpensive amd64 machines and then have it build
Postgres.
Thanks for the efforts you two have put into tracking this down.
Hi,
On 2019-08-17 15:41:42 -0700, Andres Freund wrote:
On 2019-08-17 17:59:05 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2019-08-17 16:31:01 -0400, Tom Lane wrote:
PG Bug reporting form <noreply@postgresql.org> writes:
vacuumdb.c:184:26: error: use of undeclared identifier 'FD_SETSIZE'
Hmm, it seems somebody removed the "#include <sys/select.h>" from
that file, which was a pretty not-bright idea.Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?Yeah, that would likely be cleaner than just responding to this directly.
I'll go and do that.
Hm. This made me think: Why is
if (concurrentCons > FD_SETSIZE - 1)
{
pg_log_error("too many parallel jobs requested (maximum: %d)",
FD_SETSIZE - 1);
a useful test / error message? FD_SETSIZE is about the numerical value
of fds. There will usually be at least three fds open, starting at 0 -
but there easily can be more, depending on what the reindexdb/vacuumdb
caller is doing.
I'm prone to off-by-one errors, but I think this will over-estimate the
number of allowed connections by 1 even if there's just
stdin/stdout/stderr open.
Looks like this has been copied forward from
commit a17923204736d8842eade3517d6a8ee81290fca4
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date: 2015-01-23 15:02:45 -0300
vacuumdb: enable parallel mode
This mode allows vacuumdb to open several server connections to vacuum
or analyze several tables simultaneously.
Author: Dilip Kumar. Some reworking by �lvaro Herrera
Reviewed by: Jeff Janes, Amit Kapila, Magnus Hagander, Andres Freund
Alvaro, Dilip?
The pre 12 pgbench at least just subtracted 10 from FD_SETSIZE, to make
room for some pre-existing fds.
What is the reason that this doesn't use poll() in the first place? It
surely can't be - as it is the case for pgbench - that the minimum time
resolution is 1ms?
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
Hm. This made me think: Why is
if (concurrentCons > FD_SETSIZE - 1)
a useful test / error message?
Good point, it's not. Subtracting off 10 or so might be reasonable.
What is the reason that this doesn't use poll() in the first place?
We still support platforms without that, no? Windows, for one.
regards, tom lane
Hi,
On 2019-08-17 20:59:56 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
Hm. This made me think: Why is
if (concurrentCons > FD_SETSIZE - 1)
a useful test / error message?Good point, it's not. Subtracting off 10 or so might be reasonable.
I wonder if we shouldn't just do the same as pgbench now does, and just
only error when adding a too large fd. That does reduce the number of
detected cases, true, but it also adds robustness, because larger fds
are properly handled.
What is the reason that this doesn't use poll() in the first place?
We still support platforms without that, no? Windows, for one.
Ah, right. I forgot that because we do rely on poll() in latch.c - but
we do have an alternative windows implementation there...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2019-08-17 20:59:56 -0400, Tom Lane wrote:
Good point, it's not. Subtracting off 10 or so might be reasonable.
I wonder if we shouldn't just do the same as pgbench now does, and just
only error when adding a too large fd.
Works for me.
regards, tom lane
On 2019-Aug-17, Andres Freund wrote:
Hm. This made me think: Why is
if (concurrentCons > FD_SETSIZE - 1)
{
pg_log_error("too many parallel jobs requested (maximum: %d)",
FD_SETSIZE - 1);a useful test / error message? FD_SETSIZE is about the numerical value
of fds. There will usually be at least three fds open, starting at 0 -
but there easily can be more, depending on what the reindexdb/vacuumdb
caller is doing.
Hmm ... yeah, this is clearly not perfect. In my laptop, vacuumdb -j 1021
works; 1022 and 1023 fail like this after opening a number of conns:
vacuumdb: vacuuming database "alvherre"
vacuumdb: error: could not connect to database alvherre: could not look up local user ID 1000: Too many open files
and 1024 fails like this immediately on start:
vacuumdb: error: too many parallel jobs requested (maximum: 1023)
After 'ulimit -n 1200', vacuumdb -j1023 fails like this:
vacuumdb: vacuuming database "alvherre"
*** buffer overflow detected ***: vacuumdb terminated
Aborted
So I agree that we need a fix.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Sun, Aug 18, 2019 at 10:59:54PM -0400, Alvaro Herrera wrote:
Hmm ... yeah, this is clearly not perfect. In my laptop, vacuumdb -j 1021
works; 1022 and 1023 fail like this after opening a number of conns:So I agree that we need a fix.
How would you detect how many fds can be opened by a user in this
case?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Sun, Aug 18, 2019 at 10:59:54PM -0400, Alvaro Herrera wrote:
So I agree that we need a fix.
How would you detect how many fds can be opened by a user in this
case?
I think Andres' suggestion is probably fine: don't try to detect
it in advance. Just open the files, and error out if we need to
put an fd index >= FD_SETSIZE into an fd_set. It'll be a shade
less user-friendly, in that the program might run for a bit before
failing; but I doubt that such cases arise often enough to be worth
working harder.
regards, tom lane
On Sat, Aug 17, 2019 at 03:41:42PM -0700, Andres Freund wrote:
Heh, just discovered
/messages/by-id/20160921171819.1357.29774@wrigleys.postgresql.org
from the same reporter, where we went through this before :/
Ugh.
On 2019-08-17 17:59:05 -0400, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
Most of the parallel code was move into bin/scripts/scripts_parallel.c -
but there's still the above error check. Seems like we ought to add a
ParallelSlotsMax() or such, and use that in the error check, rather than
check FD_SETSIZE directly?Yeah, that would likely be cleaner than just responding to this directly.
I'll go and do that.
Hm. I'd like to keep the dependency to select.h directly in
scripts_parallel.c, so the ParallelSlotsMax sounds like a good thing
to me so as FD_SETSIZE remains localized. That would give the
attached which does not take care of pgbench, and there is an extra
proposal in another part of this thread. Just looking at it now..
Indeed. I've confirmed the bug report on a local OpenBSD 6.4 build
(clang 6.0.0), and with "make -k" I can see that reindexdb.c fails
likewise. But this is unsurprising given that POSIX says that
FD_SETSIZE is declared by sys/select.h.Right.
Okay, then the current code is broken in this sense. It was
surprising to not see the buildfarm complain about that though.
--
Michael
Attachments:
parallel-slot-fdlimit.patchtext/x-diff; charset=us-asciiDownload+20-4
On Mon, Aug 19, 2019 at 12:32:51AM -0400, Tom Lane wrote:
I think Andres' suggestion is probably fine: don't try to detect
it in advance. Just open the files, and error out if we need to
put an fd index >= FD_SETSIZE into an fd_set. It'll be a shade
less user-friendly, in that the program might run for a bit before
failing; but I doubt that such cases arise often enough to be worth
working harder.
Thanks. I have somewhat not catched what Andres was suggesting here.
So attached are two patches:
- 0001 should take care of the compilation failure, by moving
FD_SETSIZE into scripts_parallel.c.
- 0002 makes vacuumdb and reindexdb fail when trying to assign a
socket with an unsupported range. Should this bit be backpatched? We
are doing that for vacuumdb for some time now, and the error is
confusing so I would prefer fixing it on older branches as well.
Thoughts?
--
Michael
Michael Paquier <michael@paquier.xyz> writes:
On Sat, Aug 17, 2019 at 03:41:42PM -0700, Andres Freund wrote:
Indeed. I've confirmed the bug report on a local OpenBSD 6.4 build
(clang 6.0.0), and with "make -k" I can see that reindexdb.c fails
likewise. But this is unsurprising given that POSIX says that
FD_SETSIZE is declared by sys/select.h.
Right.
Okay, then the current code is broken in this sense. It was
surprising to not see the buildfarm complain about that though.
Yeah --- see 51c3e9fad. We should have set up a modern-OpenBSD
buildfarm member back then, but failed to. Let's do that this
time.
regards, tom lane
On 2019-08-19 07:14, Tom Lane wrote:
Okay, then the current code is broken in this sense. It was
surprising to not see the buildfarm complain about that though.Yeah --- see 51c3e9fad. We should have set up a modern-OpenBSD
buildfarm member back then, but failed to. Let's do that this
time.
I could set up a OpenBSD 6.5 animal. I have resources available.
Will do it the next coming days.
/Mikael
On Mon, Aug 19, 2019 at 07:32:21AM +0200, Mikael Kjellström wrote:
I could set up a OpenBSD 6.5 animal. I have resources available.
Will do it the next coming days.
Thanks! That's great.
--
Michael
On Mon, Aug 19, 2019 at 02:12:51PM +0900, Michael Paquier wrote:
Thanks. I have somewhat not catched what Andres was suggesting here.
So attached are two patches:
- 0001 should take care of the compilation failure, by moving
FD_SETSIZE into scripts_parallel.c.
I have been able to work on this one more, and applied a fix that
should address the compilation failure. While on it, I have reduced
the maximum number of parallel slots allowed to give some room for
pre-existing fds as pgbench did before moving to its poll()
implementation.
- 0002 makes vacuumdb and reindexdb fail when trying to assign a
socket with an unsupported range. Should this bit be backpatched? We
are doing that for vacuumdb for some time now, and the error is
confusing so I would prefer fixing it on older branches as well.
For this one, I am still not completely sure which way we would want
to go. The issue exists down to 9.6 for vacuumdb, so could a fix like
the one I previously proposed be acceptable for a back-patch as this
is not worth the complexity? Should we try to move to a poll()-based
implementation like pgbench on HEAD, which most likely would result in
having pgbench also use parallel slots with a bit of refactoring, no?
--
Michael