Misaligned BufferDescriptors causing major performance problems on AMD

Started by Andres Freundabout 12 years ago52 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

In the nearby thread at
http://archives.postgresql.org/message-id/20140202140014.GM5930%40awork2.anarazel.de
Peter and I discovered that there is a large performance difference
between different max_connections on a larger machine (4x Opteron 6272,
64 cores together) in a readonly pgbench tests...

Just as reference, we're talking about a performance degradation from
475963.613865 tps to 197744.913556 in a pgbench -S -cj64 just by setting
max_connections to 90, from 91...

On 2014-02-02 15:00:14 +0100, Andres Freund wrote:

On 2014-02-01 19:47:29 -0800, Peter Geoghegan wrote:

Here are the results of a benchmark on Nathan Boley's 64-core, 4
socket server: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/

That's interesting. The maximum number of what you see here (~293125)
is markedly lower than what I can get.

... poke around ...

Hm, that's partially because you're using pgbench without -M prepared if
I see that correctly. The bottleneck in that case is primarily memory
allocation. But even after that I am getting higher
numbers: ~342497.

Trying to nail down the differnce it oddly seems to be your
max_connections=80 vs my 100. The profile in both cases is markedly
different, way much more spinlock contention with 80. All in
Pin/UnpinBuffer().

I think =80 has to lead to some data being badly aligned. I can
reproduce that =91 has *much* better performance than =90. 170841.844938
vs 368490.268577 in a 10s test. Reproducable both with an without the test.
That's certainly worth some investigation.
This is *not* reproducable on the intel machine, so it might the
associativity of the L1/L2 cache on the AMD.

So, I looked into this, and I am fairly certain it's because of the
(mis-)alignment of the buffer descriptors. With certain max_connections
settings InitBufferPool() happens to get 64byte aligned addresses, with
others not. I checked the alignment with gdb to confirm that.

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

Now all this hinges on the fact that by a mere accident
BufferDescriptors are 64byte in size:
struct sbufdesc {
BufferTag tag; /* 0 20 */
BufFlags flags; /* 20 2 */
uint16 usage_count; /* 22 2 */
unsigned int refcount; /* 24 4 */
int wait_backend_pid; /* 28 4 */
slock_t buf_hdr_lock; /* 32 1 */

/* XXX 3 bytes hole, try to pack */

int buf_id; /* 36 4 */
int freeNext; /* 40 4 */

/* XXX 4 bytes hole, try to pack */

LWLock * io_in_progress_lock; /* 48 8 */
LWLock * content_lock; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */

/* size: 64, cachelines: 1, members: 10 */
/* sum members: 57, holes: 2, sum holes: 7 */
};

We could polish up the attached patch and apply it to all the branches,
the costs of memory are minimal. But I wonder if we shouldn't instead
make ShmemInitStruct() always return cacheline aligned addresses. That
will require some fiddling, but it might be a good idea nonetheless?

I think we should also consider some more reliable measures to have
BufferDescriptors cacheline sized, rather than relying on the happy
accident. Debugging alignment issues isn't fun, too much of a guessing
game...

Thoughts?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

keep-BufferDescriptors-cacheline-aligned.patchtext/x-patch; charset=us-asciiDownload+6-0
In reply to: Andres Freund (#1)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Sun, Feb 2, 2014 at 7:13 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Just as reference, we're talking about a performance degradation from
475963.613865 tps to 197744.913556 in a pgbench -S -cj64 just by setting
max_connections to 90, from 91...

That's pretty terrible.

So, I looked into this, and I am fairly certain it's because of the
(mis-)alignment of the buffer descriptors. With certain max_connections
settings InitBufferPool() happens to get 64byte aligned addresses, with
others not. I checked the alignment with gdb to confirm that.

I find your diagnosis to be quite plausible.

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

I think that is explained here:

http://www.agner.org/optimize/blog/read.php?i=142&amp;v=t

With Sandy Bridge, "Misaligned memory operands [are] handled efficiently".

Now all this hinges on the fact that by a mere accident
BufferDescriptors are 64byte in size:

Are they 64 bytes in size on REL9_*_STABLE? How about on win64? I
think we're reasonably disciplined here already, but long is 32-bits
in length even on win64. Looks like it would probably be okay, but as
you say, it doesn't seem like something to leave to chance.

We could polish up the attached patch and apply it to all the branches,
the costs of memory are minimal. But I wonder if we shouldn't instead
make ShmemInitStruct() always return cacheline aligned addresses. That
will require some fiddling, but it might be a good idea nonetheless?

What fiddling are you thinking of?

I think we should also consider some more reliable measures to have
BufferDescriptors cacheline sized, rather than relying on the happy
accident. Debugging alignment issues isn't fun, too much of a guessing
game...

+1. Maybe make code that isn't appropriately aligned fail to compile?

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#2)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-03 15:17:13 -0800, Peter Geoghegan wrote:

On Sun, Feb 2, 2014 at 7:13 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Just as reference, we're talking about a performance degradation from
475963.613865 tps to 197744.913556 in a pgbench -S -cj64 just by setting
max_connections to 90, from 91...

That's pretty terrible.

Yea, I was scared myself.

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

I think that is explained here:

http://www.agner.org/optimize/blog/read.php?i=142&amp;v=t

With Sandy Bridge, "Misaligned memory operands [are] handled efficiently".

No, I don't think so. Those improvements afair refer to unaligned
accesses as in accessing a 4 byte variable at address % 4 != 0.

Now all this hinges on the fact that by a mere accident
BufferDescriptors are 64byte in size:

Are they 64 bytes in size on REL9_*_STABLE? How about on win64? I
think we're reasonably disciplined here already, but long is 32-bits
in length even on win64. Looks like it would probably be okay, but as
you say, it doesn't seem like something to leave to chance.

I haven't checked any branches yet, but I don't remember any recent
changes to sbufdesc. And I think we're lucky enough that all the used
types are the same width across common 64bit OSs.
But I absolutely agree we shouldn't leave this to chance.

We could polish up the attached patch and apply it to all the branches,
the costs of memory are minimal. But I wonder if we shouldn't instead
make ShmemInitStruct() always return cacheline aligned addresses. That
will require some fiddling, but it might be a good idea nonetheless?

What fiddling are you thinking of?

Basically always doing a TYPEALIGN(CACHELINE_SIZE, addr) before
returning from ShmemAlloc() (and thereby ShmemInitStruct). After I'd
written the email I came across an interesting bit of code:

void *
ShmemAlloc(Size size)
{
...
/* extra alignment for large requests, since they are probably buffers */
if (size >= BLCKSZ)
newStart = BUFFERALIGN(newStart);

/*
* Preferred alignment for disk I/O buffers. On some CPUs, copies between
* user space and kernel space are significantly faster if the user buffer
* is aligned on a larger-than-MAXALIGN boundary. Ideally this should be
* a platform-dependent value, but for now we just hard-wire it.
*/
#define ALIGNOF_BUFFER 32

#define BUFFERALIGN(LEN) TYPEALIGN(ALIGNOF_BUFFER, (LEN))

So, we're already trying to make large allocations (which we'll be using
here) try to fit to a 32 byte alignment. Which unfortunately isn't
enough here, but it explains why max_connections has to be changed by
exactly 1 to achieve adequate performance...
So, the absolutely easiest thing would be to just change ALIGNOF_BUFFER
to 64. The current value was accurate in 2003 (common cacheline size
back then), but it really isn't anymore today. Anything relevant is
64byte.

But I really think we should also remove the BLCKSZ limit. There's
relatively few granular/dynamic usages of ShmemAlloc(), basically just
shared memory hashes. And I'd be rather unsurprised if aligning all
allocations for hashes resulted in a noticeable speedup. We have
frightening bottlenecks in those today.

I think we should also consider some more reliable measures to have
BufferDescriptors cacheline sized, rather than relying on the happy
accident. Debugging alignment issues isn't fun, too much of a guessing
game...

+1. Maybe make code that isn't appropriately aligned fail to compile?

I was wondering about using some union trickery to make sure the array
only consists out of properly aligned types. That'd require some changes
but luckily code directly accessing the BufferDescriptors array isn't
too far spread.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#3)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-04 00:38:19 +0100, Andres Freund wrote:

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

I think that is explained here:

http://www.agner.org/optimize/blog/read.php?i=142&amp;v=t

With Sandy Bridge, "Misaligned memory operands [are] handled efficiently".

No, I don't think so. Those improvements afair refer to unaligned
accesses as in accessing a 4 byte variable at address % 4 != 0.

So, Christian did some benchmarking on the intel machine, and his
results were also lower than mine, and I've since confirmed that it's
also possible to reproduce the alignment problems on the intel machine.

Which imo means fixing this got more important...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Andres Freund (#4)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Tue, Feb 4, 2014 at 4:21 AM, Andres Freund <andres@2ndquadrant.com> wrote:

Which imo means fixing this got more important...

I strongly agree.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Andres Freund (#3)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Mon, Feb 3, 2014 at 3:38 PM, Andres Freund <andres@2ndquadrant.com> wrote:

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

What fiddling are you thinking of?

Basically always doing a TYPEALIGN(CACHELINE_SIZE, addr) before
returning from ShmemAlloc() (and thereby ShmemInitStruct).

There is something you have not drawn explicit attention to that is
very interesting. If we take REL9_3_STABLE tip to be representative
(built with full -O2 optimization, no assertions just debugging
symbols), setting max_connections to 91 from 90 does not have the
effect of making the BufferDescriptors array aligned; it has the
effect of making it *misaligned*. You reported that 91 was much better
than 90. I think that the problem actually occurs when the array *is*
aligned!

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#6)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-04 16:24:02 -0800, Peter Geoghegan wrote:

On Mon, Feb 3, 2014 at 3:38 PM, Andres Freund <andres@2ndquadrant.com> wrote:

A quick hack (attached) making BufferDescriptor 64byte aligned indeed
restored performance across all max_connections settings. It's not
surprising that a misaligned buffer descriptor causes problems -
there'll be plenty of false sharing of the spinlocks otherwise. Curious
that the the intel machine isn't hurt much by this.

What fiddling are you thinking of?

Basically always doing a TYPEALIGN(CACHELINE_SIZE, addr) before
returning from ShmemAlloc() (and thereby ShmemInitStruct).

There is something you have not drawn explicit attention to that is
very interesting. If we take REL9_3_STABLE tip to be representative
(built with full -O2 optimization, no assertions just debugging
symbols), setting max_connections to 91 from 90 does not have the
effect of making the BufferDescriptors array aligned; it has the
effect of making it *misaligned*. You reported that 91 was much better
than 90. I think that the problem actually occurs when the array *is*
aligned!

I don't think you can learn much from the alignment in 9.3
vs. HEAD. Loads has changed since, most prominently and recently
Robert's LWLock work. That certainly has changed allocation patterns.
It will also depend on some other parameters, e.g. changing
max_wal_senders, max_background_workers will also change the
offset. It's not that 91 is intrinsically better, it just happened to
give a aligned BufferDescriptors array when the other parameters weren't
changed at the same time.

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict

I don't think that's applicable here. What's described there is relevant
for access patterns that are larger multiple of the cacheline size - but
our's is exactly cacheline sized. What can happen in such scenarios is
that all your accesses map to the same set of cachelines, so instead of
using most of the cache, you end up using only 8 or so (8 is a common
size of set associative caches these days).
Theoretically we could see something like that for shared_buffers
itself, but I *think* our accesses are too far spread around in them for
that to be a significant issue.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#7)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-02-04 16:24:02 -0800, Peter Geoghegan wrote:

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict

I don't think that's applicable here.

Maybe, or maybe not, but I think it does say that we should be very wary
of proposals to force data structure alignment without any testing of the
consequences.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#8)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-05 09:57:11 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2014-02-04 16:24:02 -0800, Peter Geoghegan wrote:

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict

I don't think that's applicable here.

Maybe, or maybe not, but I think it does say that we should be very wary
of proposals to force data structure alignment without any testing of the
consequences.

I agree it needs testing, but what the page is talking about really,
really doesn't have *anything* to do with this. What the author is
talking about is not storing things *page* aligned (I.e. 4k or a
multiple). The problem is that the beginning of each such page falls
into the same cacheline set and thus doesn't utilize the entire L1/L2/L3
but only the single set they map into.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Wed, Feb 5, 2014 at 3:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maybe, or maybe not, but I think it does say that we should be very wary
of proposals to force data structure alignment without any testing of the
consequences.

Sure. see for instance
http://igoro.com/archive/gallery-of-processor-cache-effects/

From what I understand what Andres is suggesting is ensuring that each
BufferDescriptor is sitting entirely in one cache line. That seems
very unlikely to be worse than having a BufferDescriptor spanning two
cache lines and being on the same cache line as the adjacent
BufferDescriptors. But this all depends on knowing the length of the
cache lines. I see a lot of confusion online over whether cache lines
are 64 bytes, 128 bytes, or other length even just on Intel
architectures, let alone others.

I wonder if there are any generic tools to optimize array/structure
layouts based on cachegrind profiling or something like that. Then we
wouldn't need to know the oddities ourselves and optimize manually. We
could maybe even do it on the build farm and select the right profile
at build time by matching build target information.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#10)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-05 16:14:01 +0100, Greg Stark wrote:

I see a lot of confusion online over whether cache lines
are 64 bytes, 128 bytes, or other length even just on Intel
architectures, let alone others.

All current x86 processors use 64. But even if it were bigger/smaller,
they will be either 32, or 128. Neither benefits from touching more
cachelines than necessary. E.g. in the 128 case, we could still touch
two with the current code.
The effects referred to upthread only affect code with larger multiples
of the cacheline size. Not what we have here.

I wonder if there are any generic tools to optimize array/structure
layouts based on cachegrind profiling or something like that. Then we
wouldn't need to know the oddities ourselves and optimize manually. We
could maybe even do it on the build farm and select the right profile
at build time by matching build target information.

There's profiling tools (e.g. perf's -e
stalled-cycles-(frontent|backend)), but I don't think there's more than
that.
And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it
just wasn't updated in the last 10 years.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

Andres Freund <andres@2ndquadrant.com> writes:

And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it
just wasn't updated in the last 10 years.

No, ALIGNOF_BUFFER is there because we read something that said that I/O
transfers between userspace and kernel disk cache would be faster with
aligned buffers. There's been no particular thought given to alignment
of other data structures, AFAIR.

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#12)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-05 11:23:29 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it
just wasn't updated in the last 10 years.

No, ALIGNOF_BUFFER is there because we read something that said that I/O
transfers between userspace and kernel disk cache would be faster with
aligned buffers. There's been no particular thought given to alignment
of other data structures, AFAIR.

But it's not aligned anymore on at last half a decade's hardware, and
it's what we already align *all* bigger ShmemAlloc() values with. And
BufferDescriptors surely counts as larger in its entirety.

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#13)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Wed, Feb 5, 2014 at 11:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2014-02-05 11:23:29 -0500, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

And I think somebody already thought about it (c.f. ALIGNOF_BUFFER), it
just wasn't updated in the last 10 years.

No, ALIGNOF_BUFFER is there because we read something that said that I/O
transfers between userspace and kernel disk cache would be faster with
aligned buffers. There's been no particular thought given to alignment
of other data structures, AFAIR.

But it's not aligned anymore on at last half a decade's hardware, and
it's what we already align *all* bigger ShmemAlloc() values with. And
BufferDescriptors surely counts as larger in its entirety.

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
a lot to recommend it. But that doesn't mean it doesn't need testing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Andres Freund (#11)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Wed, Feb 5, 2014 at 7:21 AM, Andres Freund <andres@2ndquadrant.com> wrote:

All current x86 processors use 64. But even if it were bigger/smaller,
they will be either 32, or 128. Neither benefits from touching more
cachelines than necessary. E.g. in the 128 case, we could still touch
two with the current code.

That's true, but I believe that a hardware optimization called
Adjacent Cache Line Prefetch is widely supported by recent
microarchitectures, and is sometimes enabled by default. I'm not
suggesting that that would necessarily radically alter the outcome,
but it is a consideration.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Peter Geoghegan (#6)
Re: Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Tue, Feb 4, 2014 at 4:24 PM, Peter Geoghegan <pg@heroku.com> wrote:

There is something you have not drawn explicit attention to that is
very interesting. If we take REL9_3_STABLE tip to be representative
(built with full -O2 optimization, no assertions just debugging
symbols), setting max_connections to 91 from 90 does not have the
effect of making the BufferDescriptors array aligned; it has the
effect of making it *misaligned*.

I spoke too soon; the effect is indeed reversed on master (i.e. "bad"
max_connection settings are misaligned, and vice-versa).

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#14)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-02-05 12:36:42 -0500, Robert Haas wrote:

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
a lot to recommend it.

Good.

I wonder if we shouldn't move that bit of logic:
if (size >= BUFSIZ)
newStart = BUFFERALIGN(newStart);
out of ShmemAlloc() and instead have a ShmemAllocAligned() and
ShmemInitStructAligned() that does it. So we can sensibly can control it
per struct.

But that doesn't mean it doesn't need testing.

I feel the need here, to say that I never said it doesn't need testing
and never thought it didn't...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#17)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote:

On 2014-02-05 12:36:42 -0500, Robert Haas wrote:

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
a lot to recommend it.

Good.

I wonder if we shouldn't move that bit of logic:
if (size >= BUFSIZ)
newStart = BUFFERALIGN(newStart);
out of ShmemAlloc() and instead have a ShmemAllocAligned() and
ShmemInitStructAligned() that does it. So we can sensibly can control it
per struct.

But that doesn't mean it doesn't need testing.

I feel the need here, to say that I never said it doesn't need testing
and never thought it didn't...

Where are we on this?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#18)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote:

On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote:

On 2014-02-05 12:36:42 -0500, Robert Haas wrote:

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
a lot to recommend it.

Good.

I wonder if we shouldn't move that bit of logic:
if (size >= BUFSIZ)
newStart = BUFFERALIGN(newStart);
out of ShmemAlloc() and instead have a ShmemAllocAligned() and
ShmemInitStructAligned() that does it. So we can sensibly can control it
per struct.

But that doesn't mean it doesn't need testing.

I feel the need here, to say that I never said it doesn't need testing
and never thought it didn't...

Where are we on this?

It needs somebody with time to evaluate possible performance regressions
- I personally won't have time to look into this in detail before pgcon.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#19)
Re: Misaligned BufferDescriptors causing major performance problems on AMD

On Thu, Apr 17, 2014 at 11:23:24AM +0200, Andres Freund wrote:

On 2014-04-16 19:18:02 -0400, Bruce Momjian wrote:

On Thu, Feb 6, 2014 at 09:40:32AM +0100, Andres Freund wrote:

On 2014-02-05 12:36:42 -0500, Robert Haas wrote:

It may well be that your proposal is spot on. But I'd like to see some
data-structure-by-data-structure measurements, rather than assuming that
alignment must be a good thing.

I am fine with just aligning BufferDescriptors properly. That has
clearly shown massive improvements.

I thought your previous idea of increasing BUFFERALIGN to 64 bytes had
a lot to recommend it.

Good.

I wonder if we shouldn't move that bit of logic:
if (size >= BUFSIZ)
newStart = BUFFERALIGN(newStart);
out of ShmemAlloc() and instead have a ShmemAllocAligned() and
ShmemInitStructAligned() that does it. So we can sensibly can control it
per struct.

But that doesn't mean it doesn't need testing.

I feel the need here, to say that I never said it doesn't need testing
and never thought it didn't...

Where are we on this?

It needs somebody with time to evaluate possible performance regressions
- I personally won't have time to look into this in detail before pgcon.

Again, has anyone made any headway on this? Is it a TODO item?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ Everyone has their own god. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#19)
#22Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#21)
#23Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#22)
#24Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#24)
#26Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#26)
#28Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#27)
#29Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#28)
#30Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#29)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#29)
#32Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#32)
#34Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#26)
#35Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#32)
#36Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#34)
#37Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#36)
#38Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#37)
#39Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#38)
#40Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#40)
#42Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#41)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#42)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#43)
#45Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#43)
#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#45)
#47Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#46)
#48Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#47)
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#48)
#50Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#49)
#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#50)
#52Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#51)