Decrease MAX_BACKENDS to 2^16

Started by Andres Freund over 11 years ago, 26 messages
#1Andres Freund
andres@2ndquadrant.com

Hi,

Currently the maximum for max_connections (+ bgworkers + autovacuum) is
defined by
#define MAX_BACKENDS 0x7fffff
which unfortunately means that some things like buffer reference counts
need a full integer to store references.

Since there's absolutely no sensible scenario for setting
max_connections that high, I'd like to change the limit to 2^16, so we
can use a uint16 in BufferDesc->refcount.
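
As a rough illustration of the arithmetic (not code from any patch): 0x7fffff needs 23 bits, so a counter that can reach it does not fit in 16 bits. The sketch assumes each backend contributes at most one shared pin per buffer, and it assumes the new limit would be 2^16 - 1 so that the highest possible count is still a valid uint16. `MAX_BACKENDS_PROPOSED`, `bits_needed` and `fits_in_uint16` are made-up names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BACKENDS          0x7fffff  /* current limit, from the message above */
#define MAX_BACKENDS_PROPOSED 0xffff    /* hypothetical: 2^16 - 1 */

/* Compile-time check that the proposed limit is representable in a uint16. */
static_assert(MAX_BACKENDS_PROPOSED <= UINT16_MAX,
              "proposed limit must fit in a uint16 refcount");

/* Number of bits needed to represent every value in 0..max. */
static int
bits_needed(uint32_t max)
{
    int bits = 0;

    while (max != 0)
    {
        bits++;
        max >>= 1;
    }
    return bits;
}

/* True if a counter that can reach 'max' fits in a uint16. */
static int
fits_in_uint16(uint32_t max)
{
    return max <= UINT16_MAX;
}
```

Note that a limit of exactly 2^16 (65536) would already be one too many for a uint16 counter under the one-pin-per-backend assumption.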

Does anyone disagree? This clearly is 9.5 material, but I wanted to
raise it early, since I plan to develop some stuff for 9.5 that'd depend
on lowering it.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Greg Stark
stark@mit.edu
In reply to: Andres Freund (#1)
Re: Decrease MAX_BACKENDS to 2^16

On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Since there's absolutely no sensible scenario for setting
max_connections that high, I'd like to change the limit to 2^16, so we
can use a uint16 in BufferDesc->refcount.

Clearly there's no sensible way to run 64k backends in the current
architecture. But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?

--
greg

#3David Fetter
david@fetter.org
In reply to: Andres Freund (#1)
Re: Decrease MAX_BACKENDS to 2^16

On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:

Hi,

Currently the maximum for max_connections (+ bgworkers + autovacuum) is
defined by
#define MAX_BACKENDS 0x7fffff
which unfortunately means that some things like buffer reference counts
need a full integer to store references.

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#4Andres Freund
andres@2ndquadrant.com
In reply to: Greg Stark (#2)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-26 11:52:44 +0100, Greg Stark wrote:

On Fri, Apr 25, 2014 at 11:15 PM, Andres Freund <andres@2ndquadrant.com> wrote:

Since there's absolutely no sensible scenario for setting
max_connections that high, I'd like to change the limit to 2^16, so we
can use a uint16 in BufferDesc->refcount.

Clearly there's no sensible way to run 64k backends in the current
architecture.

The current limit is 0x7fffff (just under 2^23), I am only proposing to lower it to 2^16.

But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?

I don't think it's realistic that we'll ever have more than 2^16 full
blown backends. We might (I hope!) get a builtin pooler, but pooler
connections won't be full backends.
So I really don't see any practical limitation with limiting the max
number of backends to 65k.

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

Imo those are significant scalability advantages...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5Andres Freund
andres@2ndquadrant.com
In reply to: David Fetter (#3)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-26 05:40:21 -0700, David Fetter wrote:

On Sat, Apr 26, 2014 at 12:15:40AM +0200, Andres Freund wrote:

Hi,

Currently the maximum for max_connections (+ bgworkers + autovacuum) is
defined by
#define MAX_BACKENDS 0x7fffff
which unfortunately means that some things like buffer reference counts
need a full integer to store references.

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Save space? For one, it allows shrinking some structs (into one
cacheline!). For another, it allows combining flags and refcount in
buffer descriptors into one variable, manipulated atomically.
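
A minimal sketch of what "flags and refcount in one variable" could look like, using C11 atomics. Everything here is hypothetical: the 16/16 bit split, the flag names and the `sketch_*` functions are illustrative assumptions, not the layout of any actual patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: low 16 bits = refcount, high 16 bits = flags. */
#define REFCOUNT_MASK 0x0000ffffu
#define FLAG_SHIFT    16
#define FLAG_VALID    (1u << (FLAG_SHIFT + 0))  /* illustrative flag */
#define FLAG_DIRTY    (1u << (FLAG_SHIFT + 1))  /* illustrative flag */

typedef struct SketchBufferDesc
{
    _Atomic uint32_t state;     /* flags and refcount packed together */
} SketchBufferDesc;

/*
 * Pin: bump the refcount with a compare-and-swap loop, no spinlock.
 * Returns the flag bits seen at the moment the pin succeeded.
 */
static uint32_t
sketch_pin(SketchBufferDesc *buf)
{
    uint32_t old = atomic_load(&buf->state);

    for (;;)
    {
        /* the refcount field must not overflow into the flag bits */
        assert((old & REFCOUNT_MASK) < REFCOUNT_MASK);

        if (atomic_compare_exchange_weak(&buf->state, &old, old + 1))
            return old & ~REFCOUNT_MASK;
        /* on failure 'old' now holds the current value; retry */
    }
}

/* Unpin: an atomic decrement; the flag bits are left untouched. */
static void
sketch_unpin(SketchBufferDesc *buf)
{
    uint32_t old = atomic_fetch_sub(&buf->state, 1);

    assert((old & REFCOUNT_MASK) > 0);
    (void) old;                 /* silence unused warning under NDEBUG */
}

/* Current refcount, read without taking any lock. */
static uint32_t
sketch_refcount(SketchBufferDesc *buf)
{
    return atomic_load(&buf->state) & REFCOUNT_MASK;
}
```

The point of the CAS loop is that the pinning backend sees the flags and bumps the count in one atomic step, which only works if the count is narrow enough to share a word with the flags.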

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: Decrease MAX_BACKENDS to 2^16

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-26 11:52:44 +0100, Greg Stark wrote:

But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips. And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

regards, tom lane

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#5)
Re: Decrease MAX_BACKENDS to 2^16

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-26 05:40:21 -0700, David Fetter wrote:

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Save space? For one it allows to shrink some structs (into one
cacheline!).

And next week when we need some other field in a buffer header,
what's going to happen? If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

regards, tom lane

#8David Fetter
david@fetter.org
In reply to: Tom Lane (#6)
Re: Decrease MAX_BACKENDS to 2^16

On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-26 11:52:44 +0100, Greg Stark wrote:

But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips. And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

Rather than hard-wiring one, could we do something clever with
bit-stuffing, or would that tank performance in some terrible ways?

I know we allow for gigantic numbers of backend connections, but I've
never found a win for >2x the number of cores in the box, which at
least in my experience so far tops out in the 8-bit (in extreme cases
unsigned 8-bit) range.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#9Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#6)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-26 11:20:56 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-26 11:52:44 +0100, Greg Stark wrote:

But I don't think it's beyond the realm of possibility
that we'll reduce the overhead in the future with an eye to being able
to do that. Is it that helpful that it's worth baking in more
dependencies on that limitation?

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips.

64 bytes has been the cacheline size for more than a decade and it's not
just x86. ARM has also moved to it, as well as other architectures. And
even if it's 32 or 128 bytes - fitting data structures to a power-of-2
multiple or fraction of the cacheline size is still beneficial.
I don't think many data structures in pg deserve attention to that, but
the buffer descriptors are one of the few. It's currently one of the top
3 sources of cpu cache issues in pg.
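
To make the cacheline argument concrete, here is a hedged sketch. The struct is a simplified stand-in, not the real BufferDesc, and the 64-byte line size is an assumption about the target CPU. It pads each descriptor to one line and counts how many lines an access touches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHELINE_SIZE 64    /* assumed; varies by CPU */

/* Simplified, hypothetical descriptor; field names only loosely follow pg. */
typedef struct SketchDesc
{
    uint8_t  tag[20];
    uint32_t state;
    int32_t  wait_backend_pid;
    int32_t  freeNext;
    void    *io_in_progress_lock;
    void    *content_lock;
} SketchDesc;

/* Pad every array element to exactly one cacheline. */
typedef union SketchDescPadded
{
    SketchDesc desc;
    char       pad[SKETCH_CACHELINE_SIZE];
} SketchDescPadded;

static_assert(sizeof(SketchDesc) <= SKETCH_CACHELINE_SIZE,
              "descriptor must fit into one cacheline");
static_assert(sizeof(SketchDescPadded) == SKETCH_CACHELINE_SIZE,
              "padded descriptor must be exactly one cacheline");

/*
 * Number of cachelines touched by an object of 'size' bytes starting at
 * byte 'offset' from a cacheline-aligned base address.
 */
static size_t
cachelines_spanned(size_t offset, size_t size)
{
    assert(size > 0);
    return (offset + size - 1) / SKETCH_CACHELINE_SIZE
        - offset / SKETCH_CACHELINE_SIZE + 1;
}
```

With a cacheline-aligned array of padded descriptors every element touches exactly one line; once the struct outgrows 64 bytes, or the array is misaligned, each access can pull in two.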

And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

I am pretty sure there are other ways, but since the actual cost of that
restriction imo is just about zero, it seems like a quite sensible
solution.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

Even if you include a lockless pin/unpin buffer? Besides the lwlock's
internal spinlock the buffer spinlocks are the hottest ones in PG by
far.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#10Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#7)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-26 11:22:39 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-26 05:40:21 -0700, David Fetter wrote:

Out of curiosity, where are you finding that a 32-bit integer is
causing problems that a 16-bit one would solve?

Save space? For one it allows to shrink some structs (into one
cacheline!).

And next week when we need some other field in a buffer header,
what's going to happen? If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

The problem isn't so much that we need the individual bits, but that we
need something that has an alignment of two, instead of 4.
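
A small, purely illustrative sketch of the alignment point: two hypothetical headers that differ only in the width of the refcount field. On common ABIs the 4-byte-aligned field drags padding in with it, while the 2-byte-aligned one packs directly after the flags. The struct names and layouts are assumptions for illustration, not the real buffer header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header with a full-width refcount. */
typedef struct WideHeader
{
    uint16_t flags;
    uint32_t refcount;  /* typically 4-byte aligned: padding precedes it */
} WideHeader;

/* Same fields, but the refcount only needs 2-byte alignment. */
typedef struct NarrowHeader
{
    uint16_t flags;
    uint16_t refcount;  /* packs directly after flags */
} NarrowHeader;

/* Bytes of padding the compiler inserted between flags and refcount. */
static size_t
wide_padding(void)
{
    return offsetof(WideHeader, refcount) - sizeof(uint16_t);
}

static size_t
narrow_padding(void)
{
    return offsetof(NarrowHeader, refcount) - sizeof(uint16_t);
}

/* Bytes saved per header by narrowing the refcount. */
static size_t
header_savings(void)
{
    return sizeof(WideHeader) - sizeof(NarrowHeader);
}
```

On a typical x86-64 or ARM ABI the wide variant is 8 bytes and the narrow one 4, but exact sizes depend on the compiler and platform.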

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#11Josh Berkus
josh@agliodbs.com
In reply to: Andres Freund (#1)
Re: Decrease MAX_BACKENDS to 2^16

On 04/26/2014 11:06 AM, David Fetter wrote:

I know we allow for gigantic numbers of backend connections, but I've
never found a win for >2x the number of cores in the box, which at
least in my experience so far tops out in the 8-bit (in extreme cases
unsigned 8-bit) range.

For my part, I've found that anything over a few hundred backends on a
commodity server leads to serious performance degradation. Even 2000 is
enough to make most servers fall over. And with proper connection
pooling, I can pump 30,000 queries per second through about 45
connections, so the clear path to supporting large numbers of
connections is some form of built-in pooling.

However, I agree with Tom that Andres should "show his hand" before we
decrease MAX_BACKENDS by 256X.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#12Andres Freund
andres@2ndquadrant.com
In reply to: Josh Berkus (#11)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-26 13:16:38 -0700, Josh Berkus wrote:

However, I agree with Tom that Andres should "show his hand" before we
decrease MAX_BACKENDS by 256X.

I just don't want to invest time in developing and benchmarking
something that's not going to be accepted anyway. Thus my question.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#13Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#6)
Re: Decrease MAX_BACKENDS to 2^16

On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

What I think it's necessary for is at least:

* Move the buffer content lock inline into the buffer descriptor,
while still fitting into one cacheline.
* lockless/atomic Pin/Unpin Buffer.

TBH, that argument seems darn weak, not to mention probably applicable
only to current-vintage Intel chips. And you have not proven that
narrowing the backend ID is necessary to either goal, even if we
accepted that these goals were that important.

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

I'm satisfied with the arguments Andres presented, which I presume were weak
only because he didn't expect a staunch defense of max_connections=70000 use.
The new restriction will still permit settings an order of magnitude larger
than current *worst* practice and 2-3 orders of magnitude larger than current
good practice. If the next decade sees database server core counts grow by
two orders of magnitude or sees typical cache architectures change enough to
make the compactness irrelevant, we'll have the usual opportunities to react.
Today, the harm from contention on buffer headers totally eclipses the benefit
of allowing max_connections=70000. There's no cause to predict a hardware
development radical enough to change that conclusion.

Sure, let's not actually commit a patch to impose this limit until the first
change benefiting from doing so is ready to go. There remains an opportunity
to evaluate whether that beneficiary change is better done a different way.
By having this thread to first settle that the new max_connections limit is
essentially okay, the eventual thread concerning lock-free pin manipulation
need not inflate from discussion of this side issue.

On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:

And next week when we need some other field in a buffer header,
what's going to happen? If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

The buffer header has seen one change in nine years. Making it an inviting
site for future patches is not important.

nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#13)
Re: Decrease MAX_BACKENDS to 2^16

Noah Misch <noah@leadboat.com> writes:

On Sat, Apr 26, 2014 at 11:20:56AM -0400, Tom Lane wrote:

While I agree with you that it seems somewhat unlikely we'd ever get
past 2^16 backends, these arguments are not nearly good enough to
justify a hard-wired limitation.

I'm satisfied with the arguments Andres presented, which I presume were weak
only because he didn't expect a staunch defense of max_connections=70000 use.
The new restriction will still permit settings an order of magnitude larger
than current *worst* practice and 2-3 orders of magnitude larger than current
good practice. If the next decade sees database server core counts grow by
two orders of magnitude or sees typical cache architectures change enough to
make the compactness irrelevant, we'll have the usual opportunities to react.
Today, the harm from contention on buffer headers totally eclipses the benefit
of allowing max_connections=70000. There's no cause to predict a hardware
development radical enough to change that conclusion.

Well, let me clarify my position: I'm not against reducing MAX_BACKENDS
if we get a significant improvement by doing so. But the case for that
has not been made.

And next week when we need some other field in a buffer header,
what's going to happen? If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

The buffer header has seen one change in nine years. Making it an inviting
site for future patches is not important.

We were just a few days ago discussing (again) making changes to the
buffer allocation algorithms. It hardly seems implausible that any
useful improvements there might need new or different fields in the
buffer headers.

regards, tom lane

#15Peter Geoghegan
pg@heroku.com
In reply to: Noah Misch (#13)
Re: Decrease MAX_BACKENDS to 2^16

On Sat, Apr 26, 2014 at 1:30 PM, Noah Misch <noah@leadboat.com> wrote:

Sure, let's not actually commit a patch to impose this limit until the first
change benefiting from doing so is ready to go. There remains an opportunity
to evaluate whether that beneficiary change is better done a different way.
By having this thread to first settle that the new max_connections limit is
essentially okay, the eventual thread concerning lock-free pin manipulation
need not inflate from discussion of this side issue.

I agree with your remarks here. This kind of thing is only going to
become more important.

On Sat, Apr 26, 2014 at 11:22:39AM -0400, Tom Lane wrote:

And next week when we need some other field in a buffer header,
what's going to happen? If things are so tight that we need to
shave a few bits off backend IDs, the whole thing is a house of
cards anyway.

The buffer header has seen one change in nine years. Making it an inviting
site for future patches is not important.

My prototype caching patch, which seems promising to me, adds an
instr_time to the BufferDesc struct. While that's obviously something
that isn't acceptable, and while I obviously could do better, it still
strikes me that that is the natural place to put such a piece of
state. That doesn't mean it's the best place, but it's still a point
worth noting in the context of this discussion.

As I mention on the thread concerning that work, the LRU-K paper
recommends a time-based delay throttling incrementation of usage_count
to address the problem of "correlated references" (5 seconds is
suggested there). At least one other major system implements a
configurable delay defaulting to 3 seconds. The 2Q paper also suggests
a correlated reference period.

--
Peter Geoghegan

#16Peter Geoghegan
pg@heroku.com
In reply to: Peter Geoghegan (#15)
Re: Decrease MAX_BACKENDS to 2^16

On Sat, Apr 26, 2014 at 1:58 PM, Peter Geoghegan <pg@heroku.com> wrote:

The 2Q paper also suggests a correlated reference period.

I withdraw this. 2Q in fact does not have such a parameter, while
LRU-K does. But the other major system I mentioned very explicitly has
a configurable delay that serves this exact purpose. This "prevents a
burst of pins on a buffer counting as many touches". The point is that
this approach is quite feasible, and may even be the best way of
addressing the general problem of correlated references.

--
Peter Geoghegan

#17Jim Nasby
jim@nasby.net
In reply to: Andres Freund (#10)
Re: Decrease MAX_BACKENDS to 2^16

On 4/26/14, 1:27 PM, Andres Freund wrote:

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

Stupid question... how many OSes would actually support 65k active processes, let alone 2^24?
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#18Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#10)
Re: Decrease MAX_BACKENDS to 2^16

On 04/26/2014 09:27 PM, Andres Freund wrote:

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

It doesn't seem unreasonable to have a database with tens of thousands
of connections. Sure, performance will suffer, but if the connections
sit idle most of the time so that the total load is low, who cares.
Sure, you could use a connection pooler, but it's even better if you
don't have to.

If there are big gains to be had from limiting the number of
connections, I'm not against it. For the purpose of shrinking BufferDesc
though, I have a feeling there might be other lower-hanging fruit in
there. For example, wait_backend_pid and freeNext are not used very
often, so they could be moved elsewhere, to a separate array. And buf_id
and the LWLock pointers could be calculated from the memory address of
the struct.
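
A rough sketch of both ideas, under the assumption that descriptors live in one contiguous array: the buffer id falls out of pointer arithmetic, and the rarely used fields sit in a parallel array indexed by that id. All names (`HotDesc`, `ColdDesc`, `sketch_*`) and the tiny pool size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBUFFERS 8       /* hypothetical pool size */

/* Hot part: touched on every pin/unpin. No buf_id field is stored. */
typedef struct HotDesc
{
    uint32_t state;
} HotDesc;

/* Cold part: rarely used fields, kept in a parallel array. */
typedef struct ColdDesc
{
    int wait_backend_pid;
    int freeNext;
} ColdDesc;

static HotDesc  hot_descs[SKETCH_NBUFFERS];
static ColdDesc cold_descs[SKETCH_NBUFFERS];

/* Recover the buffer id from the descriptor's address. */
static int
sketch_buf_id(const HotDesc *desc)
{
    ptrdiff_t id = desc - hot_descs;

    assert(id >= 0 && id < SKETCH_NBUFFERS);
    return (int) id;
}

/* Find the cold fields that belong to a hot descriptor. */
static ColdDesc *
sketch_cold(const HotDesc *desc)
{
    return &cold_descs[sketch_buf_id(desc)];
}
```

The trade-off is an extra subtraction and a second memory access whenever the cold fields are needed, in exchange for a smaller hot struct.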

- Heikki

#19Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#18)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:

On 04/26/2014 09:27 PM, Andres Freund wrote:

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

It doesn't seem unreasonable to have a database with tens of thousands of
connections. Sure, performance will suffer, but if the connections sit idle
most of the time so that the total load is low, who cares. Sure, you could
use a connection pooler, but it's even better if you don't have to.

65k connections will be absolutely *disastrous* for performance because
of the big PGPROC et al. I *do* think we have to make life easier for
users here by supplying builtin pooling at some point, but that's just a
separate feature.

If there are big gains to be had from limiting the number of connections,
I'm not against it. For the purpose of shrinking BufferDesc though, I have
a feeling there might be other lower-hanging fruit in there. For example,
wait_backend_pid and freeNext are not used very often, so they could be
moved elsewhere, to a separate array. And buf_id and the LWLock pointers
could be calculated from the memory address of the struct.

The main reason I want to shrink it is that I want to make pin/unpin
buffer lockless and all solutions I can come up with for that require
flags to be in the same uint32 as the refcount. For performance
it'd be beneficial if usagecount also fits in there.

I agree that we can move a good part of BufferDesc into a separately
indexed array. io_in_progress_lock, freeNext, wait_backend_pid are imo
good candidates.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#20Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Andres Freund (#19)
Re: Decrease MAX_BACKENDS to 2^16

On 04/28/2014 12:39 PM, Andres Freund wrote:

On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:

On 04/26/2014 09:27 PM, Andres Freund wrote:

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

It doesn't seem unreasonable to have a database with tens of thousands of
connections. Sure, performance will suffer, but if the connections sit idle
most of the time so that the total load is low, who cares. Sure, you could
use a connection pooler, but it's even better if you don't have to.

65k connections will be absolutely *disastrous* for performance because
of the big PGPROC et al.

Well, often that's still good enough.

The main reason I want to shrink it is that I want to make pin/unpin
buffer lockless and all solutions I can come up with for that require
flags to be in the same uint32 as the refcount. For performance
it'd be beneficial if usagecount also fits in there.

Would it be enough to put only some of the flags in the same uint32?

- Heikki

#21Andres Freund
andres@2ndquadrant.com
In reply to: Heikki Linnakangas (#20)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-28 13:32:45 +0300, Heikki Linnakangas wrote:

On 04/28/2014 12:39 PM, Andres Freund wrote:

On 2014-04-28 10:48:30 +0300, Heikki Linnakangas wrote:

On 04/26/2014 09:27 PM, Andres Freund wrote:

I don't think we need to decide this without benchmarks proving the
benefits. I basically want to know whether somebody has an actual
usecase - even if I really, really, can't think of one - of setting
max_connections even remotely that high. If there's something
fundamental out there that'd make changing the limit impossible, doing
benchmarks wouldn't be worthwhile.

It doesn't seem unreasonable to have a database with tens of thousands of
connections. Sure, performance will suffer, but if the connections sit idle
most of the time so that the total load is low, who cares. Sure, you could
use a connection pooler, but it's even better if you don't have to.

65k connections will be absolutely *disastrous* for performance because
of the big PGPROC et al.

Well, often that's still good enough.

That may be true for 2-4k max_connections, but >65k? That won't even
*run*, not to speak of doing something, in most environments because of
the number of processes required.

Even making only 20k connections will probably crash your computer.

The main reason I want to shrink it is that I want to make pin/unpin
buffer lockless and all solutions I can come up with for that require
flags to be in the same uint32 as the refcount. For performance
it'd be beneficial if usagecount also fits in there.

Would it be enough to put only some of the flags in the same uint32?

It's probably possible, but would make things more complicated. For a
"feature" nobody is ever going to use.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#22Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#21)
Re: Decrease MAX_BACKENDS to 2^16

On Mon, Apr 28, 2014 at 7:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Well, often that's still good enough.

That may be true for 2-4k max_connections, but >65k? That won't even
*run*, not to speak of doing something, in most environments because of
the number of processes required.

Even making only 20k connections will probably crash your computer.

I'm of two minds on this topic. On the one hand, "cat
/proc/sys/kernel/pid_max" on a Linux system I just tested (3.2.6)
returns 65536, so we'll run out of PID space before we run out of 64k
backends. On the other hand, that value can easily be increased to a
few million via, e.g., sysctl -w kernel.pid_max=4194303, and I imagine
that as machines continue to get bigger there will be more and more
people wanting to do things like that.

I think the fact that making 20k connections might crash your computer
is an artifact of other problems that we really ought to also fix
(like per-backend memory utilization, and lock contention on various
global data structures) rather than baking it into more places. In
PostgreSQL 25.3, perhaps we'll be able to run distributed PostgreSQL
clusters that can service a million simultaneous connections across
dozens of physical machines. Then again, there might not be much left
of our current buffer manager by that point, so maybe what we decide
right now isn't that relevant.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#22)
Re: Decrease MAX_BACKENDS to 2^16

Robert Haas <robertmhaas@gmail.com> writes:

I think the fact that making 20k connections might crash your computer
is an artifact of other problems that we really ought to also fix
(like per-backend memory utilization, and lock contention on various
global data structures) rather than baking it into more places. In
PostgreSQL 25.3, perhaps we'll be able to run distributed PostgreSQL
clusters that can service a million simultaneous connections across
dozens of physical machines. Then again, there might not be much left
of our current buffer manager by that point, so maybe what we decide
right now isn't that relevant.

Yeah. I think that useful use of 64K backends is far enough away that
it shouldn't be a showstopper argument, assuming that we get something
good in return for baking that into bufmgr. What I find much more
worrisome about Andres' proposals is that he seems to be thinking that
there are *no* other changes to the buffer headers on the horizon.
That's untenable. And I don't want to be told that we can't improve
the buffer management algorithms because adding another field would
make the headers not fit in a cacheline. (For the same reason, I'm
pretty unimpressed by the nearby suggestions that it'd be okay to put
very tight limits on the number of bits in the buffer header flags or
the usage count.)

regards, tom lane

#24Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#23)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-28 10:03:58 -0400, Tom Lane wrote:

What I find much more worrisome about Andres' proposals is that he
seems to be thinking that there are *no* other changes to the buffer
headers on the horizon.

Err. I am not thinking that at all. I am pretty sure I never made that
argument. The reason I want to limit the number of connections is it
allows *both*, shrinking the size of BufferDescs due to less alignment
padding *and* stuffing the refcount and flags into one integer.

That's untenable. And I don't want to be told that we can't improve
the buffer management algorithms because adding another field would
make the headers not fit in a cacheline.

I think we need to move some less frequently used fields to a separate array
to be future proof. Heikki suggested freeNext and wait_backend_pid; I added
io_in_progress_lock. We could theoretically replace buf_id by
calculating it based on the BufferDescriptors array, but that's probably
not a good idea.
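
To make the two ideas in the paragraph above concrete, here is a minimal sketch, not PostgreSQL's actual code. It assumes a hypothetical "hot" descriptor array and a parallel "cold" array holding the rarely touched fields; every name prefixed with Sketch/sketch_ is invented for illustration. The buffer id is derived from the descriptor's position in the array by pointer subtraction instead of being stored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hot part: fields touched on every pin/unpin. */
typedef struct SketchBufferDesc
{
    uint32_t    state;      /* stand-in for refcount/flags */
    uint32_t    tag_stub;   /* stand-in for the buffer tag */
} SketchBufferDesc;

/* Hypothetical cold part: the fields named in the mail above. */
typedef struct SketchBufferDescCold
{
    int         free_next;          /* freelist link */
    int         wait_backend_pid;   /* backend waiting for pin count 1 */
} SketchBufferDescCold;

#define SKETCH_NBUFFERS 8

static SketchBufferDesc     sketch_descriptors[SKETCH_NBUFFERS];
static SketchBufferDescCold sketch_cold[SKETCH_NBUFFERS];

/* Derive the buffer id from the descriptor's array position. */
static inline int
sketch_buf_id(const SketchBufferDesc *desc)
{
    return (int) (desc - sketch_descriptors);
}

/* Find the cold fields that belong to a hot descriptor. */
static inline SketchBufferDescCold *
sketch_cold_for(const SketchBufferDesc *desc)
{
    return &sketch_cold[sketch_buf_id(desc)];
}
```

The subtraction is cheap, but every caller then needs the base of the array at hand, which may be part of why dropping buf_id is described above as probably not a good idea.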

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#24)
Re: Decrease MAX_BACKENDS to 2^16

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-28 10:03:58 -0400, Tom Lane wrote:

What I find much more worrisome about Andres' proposals is that he
seems to be thinking that there are *no* other changes to the buffer
headers on the horizon.

Err. I am not thinking that at all. I am pretty sure I never made that
argument. The reason I want to limit the number of connections is it
allows *both*, shrinking the size of BufferDescs due to less alignment
padding *and* stuffing the refcount and flags into one integer.

Weren't you saying you also wanted to stuff the usage count into that same
integer? That's getting a little too tight for my taste, even if it would
fit today.

regards, tom lane

#26Andres Freund
andres@2ndquadrant.com
In reply to: Tom Lane (#25)
Re: Decrease MAX_BACKENDS to 2^16

On 2014-04-28 10:57:12 -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-04-28 10:03:58 -0400, Tom Lane wrote:

What I find much more worrisome about Andres' proposals is that he
seems to be thinking that there are *no* other changes to the buffer
headers on the horizon.

Err. I am not thinking that at all. I am pretty sure I never made that
argument. The reason I want to limit the number of connections is it
allows *both*, shrinking the size of BufferDescs due to less alignment
padding *and* stuffing the refcount and flags into one integer.

Weren't you saying you also wanted to stuff the usage count into that same
integer? That's getting a little too tight for my taste, even if it would
fit today.

That's a possible additional optimization that we could use. But it's
certainly not required. Would allow us to use fewer atomic operations...

Right now there'd be enough space for a more precise usagecount and more
flags. ATM there's 9 bits for flags and 3 bits of usagecount...
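
As a rough illustration of the arithmetic: a 16-bit refcount plus 3 bits of usagecount plus 9 bits of flags is 28 bits, which leaves 4 spare bits in a 32-bit integer. The sketch below assumes one possible field order; the macro and function names are hypothetical and this is not the layout PostgreSQL uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, taken from the numbers in the mail above. */
#define SKETCH_REFCOUNT_BITS    16
#define SKETCH_USAGECOUNT_BITS  3
#define SKETCH_FLAG_BITS        9
#define SKETCH_SPARE_BITS \
    (32 - SKETCH_REFCOUNT_BITS - SKETCH_USAGECOUNT_BITS - SKETCH_FLAG_BITS)

/* Assumed order: refcount in the low bits, then usagecount, then flags. */
#define SKETCH_REFCOUNT_MASK    ((UINT32_C(1) << SKETCH_REFCOUNT_BITS) - 1)
#define SKETCH_USAGECOUNT_MASK  ((UINT32_C(1) << SKETCH_USAGECOUNT_BITS) - 1)
#define SKETCH_FLAG_MASK        ((UINT32_C(1) << SKETCH_FLAG_BITS) - 1)
#define SKETCH_USAGECOUNT_SHIFT SKETCH_REFCOUNT_BITS
#define SKETCH_FLAG_SHIFT       (SKETCH_REFCOUNT_BITS + SKETCH_USAGECOUNT_BITS)

/* Pack the three fields into one 32-bit state word. */
static inline uint32_t
sketch_pack(uint32_t refcount, uint32_t usagecount, uint32_t flags)
{
    assert(refcount <= SKETCH_REFCOUNT_MASK);
    assert(usagecount <= SKETCH_USAGECOUNT_MASK);
    assert(flags <= SKETCH_FLAG_MASK);
    return refcount
        | (usagecount << SKETCH_USAGECOUNT_SHIFT)
        | (flags << SKETCH_FLAG_SHIFT);
}

static inline uint32_t
sketch_refcount(uint32_t state)
{
    return state & SKETCH_REFCOUNT_MASK;
}

static inline uint32_t
sketch_usagecount(uint32_t state)
{
    return (state >> SKETCH_USAGECOUNT_SHIFT) & SKETCH_USAGECOUNT_MASK;
}

static inline uint32_t
sketch_flags(uint32_t state)
{
    return (state >> SKETCH_FLAG_SHIFT) & SKETCH_FLAG_MASK;
}

/* Pinning is a plain +1 because the refcount sits in the low bits. */
static inline uint32_t
sketch_pin(uint32_t state)
{
    assert(sketch_refcount(state) < SKETCH_REFCOUNT_MASK);
    return state + 1;
}
```

With all three fields in one word, a single atomic add or compare-and-swap on that word could update refcount and usagecount together, which is presumably the "fewer atomic operations" point. The sketch uses plain integers only and does not show the atomics.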

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
