Solaris source code

Started by Bruce Momjian, almost 25 years ago. 41 messages. (pgsql-hackers)
#1Bruce Momjian
bruce@momjian.us

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11-page contract, but I
decided I wanted the CDs.) See the Slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The kernel code is similar to the *BSD kernels; it is basically
Unix SVR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2Naomi Walker
nwalker@eldocomp.com
In reply to: Bruce Momjian (#1)
Re: Solaris source code

At 04:30 PM 7/5/01 -0400, Bruce Momjian wrote:

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11 page contract, but I
decided I wanted the CD's.) See the slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The code is similar to *BSD kernels. It is basically Unix
SvR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

Bruce,

We are about to roll out PostgreSQL on Solaris, and I am interested in any
Solaris-specific gotchas. Do you have some specifics in mind, or was this
just a general preventive-maintenance step?
--
Naomi Walker
Chief Information Officer
Eldorado Computing, Inc.
602-604-3100 ext 242

#3Bruce Momjian
bruce@momjian.us
In reply to: Naomi Walker (#2)
Re: Solaris source code

At 04:30 PM 7/5/01 -0400, Bruce Momjian wrote:

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11 page contract, but I
decided I wanted the CD's.) See the slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The code is similar to *BSD kernels. It is basically Unix
SvR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

Bruce,

We are about to roll out PostgreSQL on Solaris, and I am interested in any
Solaris specific gotcha's. Do you have some specifics in mind, or was this
just general preventive maintenance type steps?

Preventive. I have heard Solaris has higher context-switching overhead,
which may affect us because we use processes instead of threads.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#4Nathan Myers
ncm@zembu.com
In reply to: Naomi Walker (#2)
Re: Solaris source code

On Thu, Jul 05, 2001 at 02:03:31PM -0700, Naomi Walker wrote:

We are about to roll out PostgreSQL on Solaris, and I am interested
in any Solaris specific gotcha's. Do you have some specifics in mind,
or was this just general preventive maintenance type steps?

There have been reports of trouble with Unix sockets on Solaris.
You can use TCP sockets, which might be slower; or change, in
src/backend/libpq/pqcomm.c, the line

listen(fd, SOMAXCONN);

to

listen(fd, 1024);

(Cf. Stevens, "Unix Network Programming, Volume 1", pp. 96 and 918.)

I don't know of (and Stevens doesn't hint at) any reason not to fold
this change into the mainline sources. However, we haven't heard
from the people who had trouble with Unix sockets whether this
change actually fixes their problems.

The effect of the change is to make it much less likely for a
connection request to be rejected when connections are being opened
very frequently.

Nathan Myers
ncm@zembu.com

#5Mathijs Brands
mathijs@ilse.nl
In reply to: Bruce Momjian (#1)
Re: Solaris source code

On Thu, Jul 05, 2001 at 04:30:40PM -0400, Bruce Momjian allegedly wrote:

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11 page contract, but I
decided I wanted the CD's.) See the slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The code is similar to *BSD kernels. It is basically Unix
SvR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

Cool. It would be nice to know why the regression tests fail on Solaris when
using a UNIX socket.

Cheers,

Mathijs

#6Mathijs Brands
mathijs@ilse.nl
In reply to: Naomi Walker (#2)
Re: Solaris source code

On Thu, Jul 05, 2001 at 02:03:31PM -0700, Naomi Walker allegedly wrote:

At 04:30 PM 7/5/01 -0400, Bruce Momjian wrote:

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11 page contract, but I
decided I wanted the CD's.) See the slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The code is similar to *BSD kernels. It is basically Unix
SvR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

Bruce,

We are about to roll out PostgreSQL on Solaris, and I am interested in any
Solaris specific gotcha's. Do you have some specifics in mind, or was this
just general preventive maintenance type steps?

PostgreSQL 7.1 fails the regression tests when using a UNIX socket,
which is faster than a TCP/IP socket (when both the client and the
server are running on the same machine). We're running a few small
PostgreSQL databases on Solaris and we're going to implement a bigger
one in the near future. If you connect via TCP/IP sockets, you should be
safe. We're using JDBC to connect to the database and JDBC always uses
a TCP/IP socket. So far we haven't run into any real problems, although
PostgreSQL did crash once, for unknown reasons (probably because someone
was messing with it).

Not really helpful, I guess. Doing some testing of your own is highly
recommended ;)

Cheers,

Mathijs

#7Mathijs Brands
mathijs@ilse.nl
In reply to: Bruce Momjian (#1)
Re: Solaris source code

On Mon, Jul 09, 2001 at 02:03:16PM -0700, Nathan Myers allegedly wrote:

On Mon, Jul 09, 2001 at 02:24:17PM +0200, Mathijs Brands wrote:

On Thu, Jul 05, 2001 at 02:03:31PM -0700, Naomi Walker allegedly wrote:

At 04:30 PM 7/5/01 -0400, Bruce Momjian wrote:

I have purchased the Solaris source code from Sun for $80. (I could
have downloaded it for free after faxing them an 11 page contract, but I
decided I wanted the CD's.) See the slashdot story at:

http://slashdot.org/article.pl?sid=01/06/30/1224257&mode=thread

My hope is that I can use the source code to help debug Solaris
PostgreSQL problems. It includes source for the kernel and all user
programs. The code is similar to *BSD kernels. It is basically Unix
SvR4 with Sun's enhancements. It has both AT&T and Sun copyrights on
the files.

Bruce,

We are about to roll out PostgreSQL on Solaris, and I am interested in any
Solaris specific gotcha's. Do you have some specifics in mind, or was this
just general preventive maintenance type steps?

PostgreSQL 7.1 fails the regression tests when using a UNIX socket,
which is faster than a TCP/IP socket (when both the client and the
server are running on the same machine).

Have you tried increasing the argument to listen in src/backend/libpq/pqcomm.c
from SOMAXCONN to 1024? I think many people would be very interested
in your results.

OK, I tried using 1024 (and later 128) instead of SOMAXCONN (defined to
be 5 on Solaris) in src/backend/libpq/pqcomm.c and ran a few regression
tests on two different Sparc boxes (Solaris 7 and 8). The regression
test still fails, but for a different reason. The abstime test fails;
not only on Solaris but also on FreeBSD (4.3-RELEASE).

*** ./expected/abstime.out  Thu May  3 21:00:37 2001
--- ./results/abstime.out Tue Jul 10 10:34:18 2001
***************
*** 47,56 ****
       | Sun Jan 14 03:14:21 1973 PST
       | Mon May 01 00:30:30 1995 PDT
       | epoch
-      | current
       | -infinity
       | Sat May 10 23:59:12 1947 PST
! (6 rows)
  SELECT '' AS six, ABSTIME_TBL.*
     WHERE ABSTIME_TBL.f1 > abstime '-infinity';
--- 47,55 ----
       | Sun Jan 14 03:14:21 1973 PST
       | Mon May 01 00:30:30 1995 PDT
       | epoch
       | -infinity
       | Sat May 10 23:59:12 1947 PST
! (5 rows)

SELECT '' AS six, ABSTIME_TBL.*
WHERE ABSTIME_TBL.f1 > abstime '-infinity';

======================================================================

I've checked the FreeBSD and Linux headers and they've got SOMAXCONN set
to 128.

Here's a snippet from the linux listen(2) manpage:

BUGS
If the socket is of type AF_INET, and the backlog argument
is greater than the constant SOMAXCONN (128 in Linux 2.0 &
2.2), it is silently truncated to SOMAXCONN. Don't rely
on this value in portable applications since BSD (and some
BSD-derived systems) limit the backlog to 5.

I've checked Solaris 2.6, 7 and 8 and the kernels have a default value
of 128 for the number of backlog connections. This number can be
increased to 1000 (maybe even larger). On Solaris 2.4 and 2.5 it is
apparently set to 32. Judging from Adrian Cockcroft's Solaris tuning
guide, Sun has been using a default value of 128 from Solaris 2.5.1
on. You do need some patches for 2.5.1: patches 103582 & 103630 (SPARC)
or patches 103581 & 10361 (X86). Later versions of Solaris don't need
any patches. You can check (and set) the number of backlog connections
by using the following command:

Solaris 2.3, 2.4, 2.5 and unpatched 2.5.1:
/usr/sbin/ndd /dev/tcp tcp_conn_req_max (untested)

Solaris 2.5.1 (patched), 2.6, 7 and 8:
/usr/sbin/ndd /dev/tcp tcp_conn_req_max_q

It'd probably be a good idea to use a value of 128 for the number of
backlog connections and not SOMAXCONN. If the requested number of
backlog connections is bigger than the number the kernel allows, it
should be truncated. Of course, there's no guarantee that this won't
cause problems on arcane platforms such as Ultrix (if it is still
supported).

The Apache survival guide has more info on TCP/IP tuning for several
platforms and includes information on the listen backlog.

Cheers,

Mathijs

Ps. Just checking IRIX 6.5 - it's got the backlog set to 1000
connections.
--
And the beast shall be made legion. Its numbers shall be increased a
thousand thousand fold. The din of a million keyboards like unto a great
storm shall cover the earth, and the followers of Mammon shall tremble.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mathijs Brands (#7)
SOMAXCONN (was Re: Solaris source code)

Mathijs Brands <mathijs@ilse.nl> writes:

OK, I tried using 1024 (and later 128) instead of SOMAXCONN (defined to
be 5 on Solaris) in src/backend/libpq/pqcomm.c and ran a few regression
tests on two different Sparc boxes (Solaris 7 and 8). The regression
test still fails, but for a different reason. The abstime test fails;
not only on Solaris but also on FreeBSD (4.3-RELEASE).

The abstime diff is to be expected (if you look closely, the test is
comparing 'current' to 'June 30, 2001'. Ooops). If that's the only
diff then you are in good shape.

Based on this and previous discussions, I am strongly tempted to remove
the use of SOMAXCONN and instead use, say,

#define PG_SOMAXCONN 1000

defined in config.h.in. That would leave room for configure to twiddle
it, if that proves necessary. Does anyone know of a platform where this
would cause problems? AFAICT, all versions of listen(2) are claimed to
be willing to reduce the passed parameter to whatever they can handle.

regards, tom lane

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: SOMAXCONN (was Re: Solaris source code)

Mathijs Brands <mathijs@ilse.nl> writes:

OK, I tried using 1024 (and later 128) instead of SOMAXCONN (defined to
be 5 on Solaris) in src/backend/libpq/pqcomm.c and ran a few regression
tests on two different Sparc boxes (Solaris 7 and 8). The regression
test still fails, but for a different reason. The abstime test fails;
not only on Solaris but also on FreeBSD (4.3-RELEASE).

The abstime diff is to be expected (if you look closely, the test is
comparing 'current' to 'June 30, 2001'. Ooops). If that's the only
diff then you are in good shape.

Based on this and previous discussions, I am strongly tempted to remove
the use of SOMAXCONN and instead use, say,

#define PG_SOMAXCONN 1000

defined in config.h.in. That would leave room for configure to twiddle
it, if that proves necessary. Does anyone know of a platform where this
would cause problems? AFAICT, all versions of listen(2) are claimed to
be willing to reduce the passed parameter to whatever they can handle.

Could we test SOMAXCONN and set PG_SOMAXCONN to 1000 only if SOMAXCONN
is less than 1000?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#10Nathan Myers
ncm@zembu.com
In reply to: Bruce Momjian (#9)
Re: SOMAXCONN (was Re: Solaris source code)

On Tue, Jul 10, 2001 at 05:06:28PM -0400, Bruce Momjian wrote:

Mathijs Brands <mathijs@ilse.nl> writes:

OK, I tried using 1024 (and later 128) instead of SOMAXCONN (defined to
be 5 on Solaris) in src/backend/libpq/pqcomm.c and ran a few regression
tests on two different Sparc boxes (Solaris 7 and 8). The regression
test still fails, but for a different reason. The abstime test fails;
not only on Solaris but also on FreeBSD (4.3-RELEASE).

The abstime diff is to be expected (if you look closely, the test is
comparing 'current' to 'June 30, 2001'. Ooops). If that's the only
diff then you are in good shape.

Based on this and previous discussions, I am strongly tempted to remove
the use of SOMAXCONN and instead use, say,

#define PG_SOMAXCONN 1000

defined in config.h.in. That would leave room for configure to twiddle
it, if that proves necessary. Does anyone know of a platform where this
would cause problems? AFAICT, all versions of listen(2) are claimed to
be willing to reduce the passed parameter to whatever they can handle.

Could we test SOMAXCONN and set PG_SOMAXCONN to 1000 only if SOMAXCONN
is less than 1000?

All the OSes we know of fold it to 128, currently. We can jump it
to 10240 now, or later when there are 20GHz CPUs.

If you want to make it more complicated, it would be more useful to
be able to set the value lower for runtime environments where PG is
competing for OS resources with another daemon that deserves higher
priority.

Nathan Myers
ncm@zembu.com

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#9)
Re: SOMAXCONN (was Re: Solaris source code)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we test SOMAXCONN and set PG_SOMAXCONN to 1000 only if SOMAXCONN
is less than 1000?

Why bother?

If you've got some plausible scenario where 1000 is too small, we could
just as easily make it 10000. I don't see the need for yet another
configure test for this.

regards, tom lane

#12Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#11)
Re: SOMAXCONN (was Re: Solaris source code)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Could we test SOMAXCONN and set PG_SOMAXCONN to 1000 only if SOMAXCONN
is less than 1000?

Why bother?

If you've got some plausible scenario where 1000 is too small, we could
just as easily make it 10000. I don't see the need for yet another
configure test for this.

I was thinking:

#if SOMAXCONN >= 1000
#define PG_SOMAXCONN SOMAXCONN
#else
#define PG_SOMAXCONN 1000
#endif

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#12)
Re: SOMAXCONN (was Re: Solaris source code)

Bruce Momjian <pgman@candle.pha.pa.us> writes:

I was thinking:

#if SOMAXCONN >= 1000
#define PG_SOMAXCONN SOMAXCONN
#else
#define PG_SOMAXCONN 1000
#endif

Not in config.h, you don't. Unless you want <sys/socket.h> (or
whichever header defines SOMAXCONN; how consistent is that across
platforms, anyway?) to be included by everything in the system ...

regards, tom lane

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Myers (#10)
Re: SOMAXCONN (was Re: Solaris source code)

ncm@zembu.com (Nathan Myers) writes:

All the OSes we know of fold it to 128, currently. We can jump it
to 10240 now, or later when there are 20GHz CPUs.

If you want to make it more complicated, it would be more useful to
be able to set the value lower for runtime environments where PG is
competing for OS resources with another daemon that deserves higher
priority.

Hmm, good point. Does anyone have a feeling for the amount of kernel
resources that are actually sucked up by an accept-queue entry? If 128
is the customary limit, is it actually worth worrying about whether
we are setting it to 128 vs. something smaller?

regards, tom lane

#15Nathan Myers
ncm@zembu.com
In reply to: Tom Lane (#14)
Re: SOMAXCONN (was Re: Solaris source code)

On Tue, Jul 10, 2001 at 06:36:21PM -0400, Tom Lane wrote:

ncm@zembu.com (Nathan Myers) writes:

All the OSes we know of fold it to 128, currently. We can jump it
to 10240 now, or later when there are 20GHz CPUs.

If you want to make it more complicated, it would be more useful to
be able to set the value lower for runtime environments where PG is
competing for OS resources with another daemon that deserves higher
priority.

Hmm, good point. Does anyone have a feeling for the amount of kernel
resources that are actually sucked up by an accept-queue entry? If 128
is the customary limit, is it actually worth worrying about whether
we are setting it to 128 vs. something smaller?

I don't think the issue is the resources that are consumed by the
accept-queue entry. Rather, it's a tuning knob to help shed load
at the entry point to the system, before significant resources have
been committed. An administrator would tune it according to actual
system and traffic characteristics.

It is easy enough for somebody to change, if they care, so it seems
to me we have already devoted more time to it than it deserves right now.

Nathan Myers
ncm@zembu.com

#16Ian Lance Taylor
ian@zembu.com
In reply to: Tom Lane (#14)
Re: SOMAXCONN (was Re: Solaris source code)

Tom Lane <tgl@sss.pgh.pa.us> writes:

ncm@zembu.com (Nathan Myers) writes:

If you want to make it more complicated, it would be more useful to
be able to set the value lower for runtime environments where PG is
competing for OS resources with another daemon that deserves higher
priority.

Hmm, good point. Does anyone have a feeling for the amount of kernel
resources that are actually sucked up by an accept-queue entry? If 128
is the customary limit, is it actually worth worrying about whether
we are setting it to 128 vs. something smaller?

Not much in the way of kernel resources is required by an entry on the
accept queue. Basically a socket structure and maybe a couple of
addresses, typically about 200 bytes or so.

But I wouldn't worry about it, and I wouldn't worry about Nathan's
suggestion for making the limit configurable, because Postgres
connections don't spend time on the queue. The postgres server will
be picking them off as fast as it can. If the server can't pick
processes off fast enough, then your system has other problems;
reducing the size of the queue won't help those problems. A large
queue will help when a large number of connections arrives
simultaneously--it will permit Postgres to deal with them appropriately,
rather than causing the system to discard them on its terms.

(Matters might be different if the Postgres server were written to not
call accept when it had the maximum number of connections active, and
to just leave connections on the queue in that case. But that's not
how it works today.)

Ian

---------------------------(end of broadcast)---------------------------
TIP 842: "When the only tool you have is a hammer, you tend to treat
everything as if it were a nail."
-- Abraham Maslow

#17Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#14)
Re: SOMAXCONN (was Re: Solaris source code)

ncm@zembu.com (Nathan Myers) writes:

All the OSes we know of fold it to 128, currently. We can jump it
to 10240 now, or later when there are 20GHz CPUs.

If you want to make it more complicated, it would be more useful to
be able to set the value lower for runtime environments where PG is
competing for OS resources with another daemon that deserves higher
priority.

Hmm, good point. Does anyone have a feeling for the amount of kernel
resources that are actually sucked up by an accept-queue entry? If 128
is the customary limit, is it actually worth worrying about whether
we are setting it to 128 vs. something smaller?

All I can say is keep in mind that Solaris uses SVr4 streams, which are
quite a bit heavier than the BSD-based sockets. I don't know any
numbers.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#18Mark
mark@ldssingles.com
In reply to: Nathan Myers (#10)
vacuum problems

Quick rundown of our configuration:
Red Hat 7.1 (no changes or extras added by us)
Postgresql 7.1.2 and CVS HEAD from 07/10/2001
3.8 gb database size

I included two pgsql versions because this happens on both.

Here's the problem we're having:

We run a vacuumdb from the server on the entire database. Some large tables
are vacuumed very quickly, but the vacuum process hangs or takes more than a
few hours on a specific table (we haven't let it finish before). The vacuum
process works quickly on a table (loginhistory) with 2.8 million records, but
is extremely slow on a table (inbox) with 1.1 million records (the table with
1.1 million records is actually larger in kb size than the other table).

We've tried to vacuum the inbox table separately ('vacuum inbox' within
psql), but this still takes hours (again we have never let it complete, we
need to use the database for development as well).

We noticed 2 things that are significant to this situation:
The server logs the following:

DEBUG: --Relation msginbox--
DEBUG: Pages 129921: Changed 26735, reaped 85786, Empty 0, New 0; Tup
1129861: Vac 560327, Keep/VTL 0/0, Crash 0, UnUsed 51549, MinLen 100,
MaxLen 2032; Re-using: Free/Avail. Space 359061488/359059332;
EndEmpty/Avail. Pages 0/85785. CPU 11.18s/5.32u sec.
DEBUG: Index msginbox_pkey: Pages 4749; Tuples 1129861: Deleted 76360.
CPU 0.47s/6.70u sec.
DEBUG: Index msginbox_fromto: Pages 5978; Tuples 1129861: Deleted 0.
CPU 0.37s/6.15u sec.
DEBUG: Index msginbox_search: Pages 4536; Tuples 1129861: Deleted 0.
CPU 0.32s/6.30u sec.
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES
DEBUG: XLogWrite: new log file created - consider increasing WAL_FILES

the last few lines (XLogWrite ...) repeat for ever and ever. With
7.1.2 this never stops unless we run out of disk space or cancel the query.
With CVS HEAD it still continues; the log files don't consume all disk
space, but we still have to cancel it or it might run forever.

Perhaps we need to let it run until it completes, but we thought that we
might be doing something wrong or have some data (we're converting data from
MS SQL Server) that isn't friendly.

The major issue we're facing with this is that any read or write access to
the table being vacuumed times out (obviously because the table is still
locked). We plan to use PostgreSQL in our production service, but we can't
until we get this resolved.

We're at a loss, not being familiar enough with PostgreSQL and its source
code. Can anyone please offer some advice or suggestions?

Thanks,

Mark

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ian Lance Taylor (#16)
Re: Re: SOMAXCONN (was Re: Solaris source code)

Ian Lance Taylor <ian@zembu.com> writes:

But I wouldn't worry about it, and I wouldn't worry about Nathan's
suggestion for making the limit configurable, because Postgres
connections don't spend time on the queue. The postgres server will
be picking them off as fast as it can. If the server can't pick
processes off fast enough, then your system has other problems;

Right. Okay, it seems like just making it a hand-configurable entry
in config.h.in is good enough for now. When and if we find that
that's inadequate in a real-world situation, we can improve on it...

regards, tom lane

#20Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#19)
Re: Re: SOMAXCONN (was Re: Solaris source code)

Tom Lane writes:

Right. Okay, it seems like just making it a hand-configurable entry
in config.h.in is good enough for now. When and if we find that
that's inadequate in a real-world situation, we can improve on it...

Would anything computed from the maximum number of allowed connections
make sense?

--
Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#23)
#25Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#23)
#26Mark
mark@ldssingles.com
In reply to: Mark (#18)
#27Nathan Myers
ncm@zembu.com
In reply to: Tom Lane (#21)
#28Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Nathan Myers (#27)
#29Nathan Myers
ncm@zembu.com
In reply to: Zeugswetter Andreas SB (#28)
#30Peter Eisentraut
peter_e@gmx.net
In reply to: Nathan Myers (#29)
#31Zeugswetter Andreas SB
ZeugswetterA@wien.spardat.at
In reply to: Peter Eisentraut (#30)
#32mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB (#31)
#33Nathan Myers
ncm@zembu.com
In reply to: Zeugswetter Andreas SB (#31)
#34Nathan Myers
ncm@zembu.com
In reply to: mlw (#32)
#35Nathan Myers
ncm@zembu.com
In reply to: Peter Eisentraut (#30)
#36mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB (#31)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#36)
#38mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB (#31)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#38)
#40Nathan Myers
ncm@zembu.com
In reply to: Tom Lane (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Myers (#40)