Question on alignment

Started by Antonin Houska, almost 7 years ago, 7 messages
#1 Antonin Houska
ah@cybertec.at

In copydir.c:copy_file() I read

/* Use palloc to ensure we get a maxaligned buffer */
buffer = palloc(COPY_BUF_SIZE);

No data type wider than a single byte is used to access the data in the
buffer, and neither read() nor write() should require any specific alignment.
Can someone please explain why alignment matters here?

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

#2 Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Antonin Houska (#1)
Re: Question on alignment

On 01/04/2019 11:01, Antonin Houska wrote:

> In copydir.c:copy_file() I read
>
> /* Use palloc to ensure we get a maxaligned buffer */
> buffer = palloc(COPY_BUF_SIZE);
>
> No data type wider than a single byte is used to access the data in the
> buffer, and neither read() nor write() should require any specific alignment.
> Can someone please explain why alignment matters here?

An aligned buffer can allow optimizations in the kernel, when it copies
the data. So it's not strictly required, but potentially makes the
read() and write() faster.

- Heikki

#3 Antonin Houska
ah@cybertec.at
In reply to: Heikki Linnakangas (#2)
Re: Question on alignment

Heikki Linnakangas <hlinnaka@iki.fi> wrote:

> On 01/04/2019 11:01, Antonin Houska wrote:
>
> > In copydir.c:copy_file() I read
> >
> > /* Use palloc to ensure we get a maxaligned buffer */
> > buffer = palloc(COPY_BUF_SIZE);
> >
> > No data type wider than a single byte is used to access the data in the
> > buffer, and neither read() nor write() should require any specific alignment.
> > Can someone please explain why alignment matters here?
>
> An aligned buffer can allow optimizations in the kernel, when it copies the
> data. So it's not strictly required, but potentially makes the read() and
> write() faster.

Thanks. Your response reminds me of buffer alignment:

/*
 * Preferred alignment for disk I/O buffers. On some CPUs, copies between
 * user space and kernel space are significantly faster if the user buffer
 * is aligned on a larger-than-MAXALIGN boundary. Ideally this should be
 * a platform-dependent value, but for now we just hard-wire it.
 */
#define ALIGNOF_BUFFER 32

Is this what you mean? Since palloc() only ensures MAXIMUM_ALIGNOF, that
wouldn't help here anyway.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
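
The difference between MAXALIGN and ALIGNOF_BUFFER comes down to how many low bits of the buffer address are zero. The following is a minimal C sketch of how one might check that for any pointer. It is not PostgreSQL code: the `sketch_` helpers are hypothetical names, and the value 8 for MAXIMUM_ALIGNOF is an assumption (the real value is computed by configure for each platform).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed values for illustration only; MAXIMUM_ALIGNOF is platform-dependent. */
#define SKETCH_MAXIMUM_ALIGNOF 8
#define SKETCH_ALIGNOF_BUFFER  32

/* Returns 1 if ptr's address is a multiple of align (align must be a power of two). */
static int
sketch_is_aligned(const void *ptr, size_t align)
{
    return ((uintptr_t) ptr & (align - 1)) == 0;
}

/* Returns the largest power of two, capped at 4096, that divides ptr's address. */
static size_t
sketch_pointer_alignment(const void *ptr)
{
    size_t      align = 1;

    while (align < 4096 && sketch_is_aligned(ptr, align * 2))
        align *= 2;
    return align;
}
```

A general-purpose allocator will normally satisfy `sketch_is_aligned(p, SKETCH_MAXIMUM_ALIGNOF)`, while `sketch_is_aligned(p, SKETCH_ALIGNOF_BUFFER)` may or may not hold, depending on the allocator and on where the chunk happens to land.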

#4 Antonin Houska
ah@cybertec.at
In reply to: Antonin Houska (#3)
Re: Question on alignment

Antonin Houska <ah@cybertec.at> wrote:

> Since palloc() only ensures MAXIMUM_ALIGNOF, that wouldn't help here anyway.

After some more search I'm not sure about that. The following comment
indicates that MAXALIGN helps too:

/*
 * Use this, not "char buf[BLCKSZ]", to declare a field or local variable
 * holding a page buffer, if that page might be accessed as a page and not
 * just a string of bytes. Otherwise the variable might be under-aligned,
 * causing problems on alignment-picky hardware. (In some places, we use
 * this to declare buffers even though we only pass them to read() and
 * write(), because copying to/from aligned buffers is usually faster than
 * using unaligned buffers.) We include both "double" and "int64" in the
 * union to ensure that the compiler knows the value must be MAXALIGN'ed
 * (cf. configure's computation of MAXIMUM_ALIGNOF).
 */
typedef union PGAlignedBlock
{
    char        data[BLCKSZ];
    double      force_align_d;
    int64       force_align_i64;
} PGAlignedBlock;

--
Antonin Houska
Web: https://www.cybertec-postgresql.com
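
The effect of the union trick described in that comment can be seen by placing a single byte in front of each kind of buffer and looking at the resulting offset. This is a standalone sketch rather than PostgreSQL code: `SketchAlignedBlock` mirrors the quoted union under a hypothetical name, `int64_t` stands in for `int64`, and the block size of 8192 is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BLCKSZ 8192      /* assumed block size */

/* Same shape as the quoted union: the wider members only force alignment. */
typedef union SketchAlignedBlock
{
    char        data[SKETCH_BLCKSZ];
    double      force_align_d;
    int64_t     force_align_i64;
} SketchAlignedBlock;

/* One leading byte makes any alignment padding visible via offsetof(). */
struct sketch_plain
{
    char        tag;
    char        data[SKETCH_BLCKSZ];
};

struct sketch_aligned
{
    char        tag;
    SketchAlignedBlock block;
};

/* Offset of a plain char array placed after one byte. */
static size_t
sketch_plain_offset(void)
{
    return offsetof(struct sketch_plain, data);
}

/* Offset of the union placed after one byte; padding is inserted before it. */
static size_t
sketch_aligned_offset(void)
{
    return offsetof(struct sketch_aligned, block);
}
```

With a plain `char` array the compiler is free to place the buffer at an odd address, whereas the union is pushed out to the alignment of its strictest member. The exact offset depends on the platform ABI.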

#5 Michael Paquier
michael@paquier.xyz
In reply to: Antonin Houska (#4)
Re: Question on alignment

On Mon, Apr 01, 2019 at 02:38:30PM +0200, Antonin Houska wrote:

> After some more search I'm not sure about that. The following comment
> indicates that MAXALIGN helps too:

The performance argument is true, but the reason PGAlignedBlock was
introduced is explained here:
/messages/by-id/1535618100.1286.3.camel@credativ.de
--
Michael

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Antonin Houska (#4)
Re: Question on alignment

Antonin Houska <ah@cybertec.at> writes:

> Antonin Houska <ah@cybertec.at> wrote:
>
> > Since palloc() only ensures MAXIMUM_ALIGNOF, that wouldn't help here anyway.
>
> After some more search I'm not sure about that. The following comment
> indicates that MAXALIGN helps too:

Well, there is more than one thing going on here, and more than one
level of potential optimization. On just about any hardware I know,
misalignment below the machine's natural word width is going to cost
cycles in memcpy (or whatever equivalent the kernel is using). Intel
CPUs tend to throw many many transistors at minimizing such costs, but
that still doesn't make it zero. On some hardware, you can get further
speedups with alignment to a bigger-than-word-width boundary, allowing
memcpy to use specialized instructions (SSE2 stuff on Intel, IIRC).
But there's a point of diminishing returns there, plus it takes extra
work and more wasted space to arrange for anything to have extra
alignment. So we generally only bother with ALIGNOF_BUFFER for shared
buffers.

regards, tom lane
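
The "extra work and more wasted space" can be illustrated with the usual over-allocate-and-round-up approach. The following is only a sketch under stated assumptions: it uses plain `malloc()` rather than any PostgreSQL allocator, the `Sketch`/`sketch_` names are hypothetical, and 32 is taken from the ALIGNOF_BUFFER definition quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_ALIGNOF_BUFFER 32    /* value quoted earlier in the thread */

typedef struct SketchAlignedBuf
{
    void       *raw;        /* pointer returned by malloc(); pass this to free() */
    char       *aligned;    /* first address >= raw on the requested boundary */
    size_t      wasted;     /* bytes skipped between raw and aligned */
} SketchAlignedBuf;

/* Allocates size usable bytes on a SKETCH_ALIGNOF_BUFFER boundary; returns 0 on success. */
static int
sketch_alloc_aligned(SketchAlignedBuf *buf, size_t size)
{
    uintptr_t   start;
    uintptr_t   rounded;

    /* Request extra room so that rounding up cannot run past the allocation. */
    buf->raw = malloc(size + SKETCH_ALIGNOF_BUFFER);
    if (buf->raw == NULL)
        return -1;

    /* Round the address up to the next multiple of the alignment. */
    start = (uintptr_t) buf->raw;
    rounded = (start + SKETCH_ALIGNOF_BUFFER - 1) &
        ~((uintptr_t) SKETCH_ALIGNOF_BUFFER - 1);

    buf->wasted = (size_t) (rounded - start);
    buf->aligned = (char *) buf->raw + buf->wasted;
    return 0;
}

/* Releases the allocation through the original pointer, not the aligned one. */
static void
sketch_free_aligned(SketchAlignedBuf *buf)
{
    free(buf->raw);
    buf->raw = NULL;
    buf->aligned = NULL;
}
```

Every such buffer carries padding and needs the original pointer kept around for freeing. That overhead is negligible for one large, long-lived region but harder to justify for every small temporary buffer.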

#7 Antonin Houska
ah@cybertec.at
In reply to: Tom Lane (#6)
Re: Question on alignment

Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Antonin Houska <ah@cybertec.at> writes:
>
> > Antonin Houska <ah@cybertec.at> wrote:
> >
> > > Since palloc() only ensures MAXIMUM_ALIGNOF, that wouldn't help here anyway.
> >
> > After some more search I'm not sure about that. The following comment
> > indicates that MAXALIGN helps too:
>
> Well, there is more than one thing going on here, and more than one
> level of potential optimization. On just about any hardware I know,
> misalignment below the machine's natural word width is going to cost
> cycles in memcpy (or whatever equivalent the kernel is using). Intel
> CPUs tend to throw many many transistors at minimizing such costs, but
> that still doesn't make it zero. On some hardware, you can get further
> speedups with alignment to a bigger-than-word-width boundary, allowing
> memcpy to use specialized instructions (SSE2 stuff on Intel, IIRC).
> But there's a point of diminishing returns there, plus it takes extra
> work and more wasted space to arrange for anything to have extra
> alignment.

Thanks for this summary.

> So we generally only bother with ALIGNOF_BUFFER for shared buffers.

OK, I'll consider this a (reasonable) convention.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com