Direct I/O

Started by Thomas Munro over 3 years ago · 113 messages · hackers
#1 Thomas Munro
thomas.munro@gmail.com

Hi,

Here is a patch to allow PostgreSQL to use $SUBJECT. It is from the
AIO patch-set[1]. It adds three new settings, defaulting to off:

io_data_direct = whether to use O_DIRECT for main data files
io_wal_direct = ... for WAL
io_wal_init_direct = ... for WAL-file initialisation

O_DIRECT asks the kernel to avoid caching file data as much as
possible. Here's a fun quote about it[2]:

"The exact semantics of Direct I/O (O_DIRECT) are not well specified.
It is not a part of POSIX, or SUS, or any other formal standards
specification. The exact meaning of O_DIRECT has historically been
negotiated in non-public discussions between powerful enterprise
database companies and proprietary Unix systems, and its behaviour has
generally been passed down as oral lore rather than as a formal set of
requirements and specifications."

It gives the kernel the opportunity to move data directly between
PostgreSQL's user space buffers and the storage hardware using DMA
hardware, that is, without CPU involvement or copying. Not all
storage stacks can do that, for various reasons, but even if not, the
caching policy should ideally still use temporary buffers and avoid
polluting the page cache.

These settings currently destroy performance, and are not intended to
be used by end-users, yet! That's why we filed them under
DEVELOPER_OPTIONS. You don't get automatic read-ahead, concurrency,
clustering or (of course) buffering from the kernel. The idea is that
later parts of the AIO patch-set will introduce mechanisms to replace
what the kernel is doing for us today, and then more, since we ought
to be even better at predicting our own future I/O than it, so that
we'll finish up ahead. Even with all that, you wouldn't want to turn
it on by default because the default shared_buffers would be
insufficient for any real system, and there are portability problems.

Examples of slowness:

* every 8KB sequential read or write becomes a full round trip to the
storage, one at a time

* data that is written to WAL and then read back in by WAL sender will
incur a full I/O round trip (that's probably not really an AIO problem,
that's something we should probably address by using shared memory
instead of files, as noted as a TODO item in the source code)

Memory alignment patches:

Direct I/O generally needs to be done to/from VM page-aligned
addresses, but only "standard" 4KB pages, even when larger VM pages
are in use (if there is an exotic system where that isn't true, it
won't work). We need to deal with buffers on the stack, the heap and
in shmem. For the stack, see patch 0001. For the heap and shared
memory, see patch 0002, but David Rowley is going to propose that part
separately, as MemoryContext API adjustments are a specialised enough
topic to deserve another thread; here I include a copy as a
dependency. The main direct I/O patch is 0003.

Assorted portability notes:

I expect this to "work" (that is, successfully destroy performance) on
typical developer systems running at least Linux, macOS, Windows and
FreeBSD. By work, I mean: not be rejected by PostgreSQL, not be
rejected by the kernel, and influence kernel cache behaviour on common
filesystems. It might be rejected with ENOTSUP, EINVAL etc on some
more exotic filesystems and OSes. Of currently supported OSes, only
OpenBSD and Solaris don't have O_DIRECT at all, and we'll reject the
GUCs. For macOS and Windows we internally translate our own
PG_O_DIRECT flag to the correct flags/calls (committed a while
back[3]).

On Windows, scatter/gather is available only with direct I/O, so a
true pwritev would in theory be possible, but that has some more
complications and is left for later patches (probably using native
interfaces, not disguising as POSIX).

There may be systems on which 8KB offset alignment will not work at
all or not work well, and that's expected. For example, BTRFS, ZFS,
JFS "big file", UFS etc allow larger-than-8KB blocks/records, and an
8KB write will have to trigger a read-before-write. Note that
offset/length alignment requirements (blocks) are independent of
buffer alignment requirements (memory pages, 4KB).

The behaviour and cache coherency of files that have open descriptors
using both direct and non-direct flags may be complicated and vary
between systems. The patch currently lets you change the GUCs at
runtime so backends can disagree: that should probably not be allowed,
but is like that now for experimentation. More study is required.

If someone has a compiler that we don't know how to do
pg_attribute_aligned() for, then we can't make correctly aligned stack
buffers, so in that case direct I/O is disabled, but I don't know of
such a system (maybe aCC, but we dropped it). That's why smgr code
can only assert that pointers are IO-aligned if PG_O_DIRECT != 0, and
why PG_O_DIRECT is forced to 0 if there is no pg_attribute_aligned()
macro, disabling the GUCs.

This seems to be an independent enough piece to get into the tree on
its own, with the proviso that it's not actually useful yet other than
for experimentation. Thoughts?

These patches have been hacked on at various times by Andres Freund,
David Rowley and me.

[1]: https://wiki.postgresql.org/wiki/AIO
[2]: https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO%27s_Semantics
[3]: /messages/by-id/CA+hUKG+ADiyyHe0cun2wfT+SVnFVqNYPxoO6J9zcZkVO7+NGig@mail.gmail.com

Attachments:

0001-Align-PGAlignedBlock-to-expected-page-size.patch (text/x-patch, +20 −9)
0002-XXX-palloc_io_aligned-not-for-review-here.patch (text/x-patch, +141 −31)
0003-Add-direct-I-O-settings-developer-only.patch (text/x-patch, +190 −36)
#2 Justin Pryzby
pryzby@telsasoft.com
In reply to: Thomas Munro (#1)
Re: Direct I/O

On Tue, Nov 01, 2022 at 08:36:18PM +1300, Thomas Munro wrote:

Hi,

Here is a patch to allow PostgreSQL to use $SUBJECT. It is from the
AIO patch-set[1]. It adds three new settings, defaulting to off:

io_data_direct = whether to use O_DIRECT for main data files
io_wal_direct = ... for WAL
io_wal_init_direct = ... for WAL-file initialisation

You added 3 booleans, but I wonder if it's better to add a string GUC
which is parsed for comma separated strings. (By "better", I mean
reducing the number of new GUCs - which is less important for developer
GUCs anyway.)

DIO is slower, but not so much that it can't run under CI. I suggest
adding an 099 commit to enable the feature during development.

Note that this fails under linux with fsanitize=align:
../src/backend/storage/file/buffile.c:117:17: runtime error: member access within misaligned address 0x561a4a8e40f8 for type 'struct BufFile', which requires 4096 byte alignment

--
Justin

#3 Thomas Munro
thomas.munro@gmail.com
In reply to: Justin Pryzby (#2)
Re: Direct I/O

On Wed, Nov 2, 2022 at 2:33 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Nov 01, 2022 at 08:36:18PM +1300, Thomas Munro wrote:

io_data_direct = whether to use O_DIRECT for main data files
io_wal_direct = ... for WAL
io_wal_init_direct = ... for WAL-file initialisation

You added 3 booleans, but I wonder if it's better to add a string GUC
which is parsed for comma separated strings. (By "better", I mean
reducing the number of new GUCs - which is less important for developer
GUCs anyway.)

Interesting idea. So "direct_io = data, wal, wal_init", or maybe that
should be spelled io_direct. ("Direct I/O" is a common term of art,
but we also have some more io_XXX GUCs in later patches, so it's hard
to choose...)

DIO is slower, but not so much that it can't run under CI. I suggest
adding an 099 commit to enable the feature during development.

Good idea, will do.

Note that this fails under linux with fsanitize=align:
../src/backend/storage/file/buffile.c:117:17: runtime error: member access within misaligned address 0x561a4a8e40f8 for type 'struct BufFile', which requires 4096 byte alignment

Oh, so BufFile is palloc'd and contains one of these. BufFile is not
even using direct I/O, but by these rules it would need to be
palloc_io_align'd. I will think about what to do about that...

#4 Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#3)
Re: Direct I/O

Hi,

On 2022-11-02 09:44:30 +1300, Thomas Munro wrote:

On Wed, Nov 2, 2022 at 2:33 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Nov 01, 2022 at 08:36:18PM +1300, Thomas Munro wrote:

io_data_direct = whether to use O_DIRECT for main data files
io_wal_direct = ... for WAL
io_wal_init_direct = ... for WAL-file initialisation

You added 3 booleans, but I wonder if it's better to add a string GUC
which is parsed for comma separated strings.

In the past more complicated GUCs have not been well received, but it does
seem like a nice way to reduce the amount of redundant stuff.

Perhaps we could use the guc assignment hook to transform the input value into
a bitmask?

(By "better", I mean reducing the number of new GUCs - which is less
important for developer GUCs anyway.)

FWIW, if / once we get to actual AIO, at least some of these would stop being
developer-only GUCs. There's substantial performance benefits in using DIO
with AIO. Buffered IO requires the CPU to copy the data from the userspace
into the kernelspace. But DIO can use DMA for that, freeing the CPU to do more
useful work. Buffered IO tops out much much earlier than AIO + DIO, and
unfortunately tops out at much lower speeds on server CPUs.

DIO is slower, but not so much that it can't run under CI. I suggest
adding an 099 commit to enable the feature during development.

Good idea, will do.

Might be worth additionally having a short tap test that does some basic
stuff with DIO and leave that enabled? I think it'd be good to have
check-world exercise DIO on dev machines, to reduce the likelihood of finding
problems only in CI, which is somewhat painful.

Note that this fails under linux with fsanitize=align:
../src/backend/storage/file/buffile.c:117:17: runtime error: member access within misaligned address 0x561a4a8e40f8 for type 'struct BufFile', which requires 4096 byte alignment

Oh, so BufFile is palloc'd and contains one of these. BufFile is not
even using direct I/O, but by these rules it would need to be
palloc_io_align'd. I will think about what to do about that...

It might be worth having two different versions of the struct, so we don't
impose unnecessarily high alignment everywhere?

Greetings,

Andres Freund

#5 Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#4)
Re: Direct I/O

Hi,

On 2022-11-01 15:54:02 -0700, Andres Freund wrote:

On 2022-11-02 09:44:30 +1300, Thomas Munro wrote:

Oh, so BufFile is palloc'd and contains one of these. BufFile is not
even using direct I/O, but by these rules it would need to be
palloc_io_align'd. I will think about what to do about that...

It might be worth having two different versions of the struct, so we don't
impose unnecessarily high alignment everywhere?

Although it might actually be worth aligning fully everywhere - there's a
noticeable performance difference for buffered read IO.

I benchmarked this on my workstation and laptop.

I mmap'ed a buffer with 2 MiB alignment, MAP_ANONYMOUS | MAP_HUGETLB, and then
measured performance of reading 8192 bytes into the buffer at different
offsets. Each time I copied 16GiB in total. Within a program invocation I
benchmarked each offset 4 times, threw away the worst measurement, and
averaged the rest. Then used the best of three program invocations.

workstation with dual xeon Gold 5215:

  offset    turbo on    turbo off
            (GiB/s)     (GiB/s)
  0         18.358      13.528
  8         15.361      11.472
  9         15.330      11.418
  32        17.583      13.097
  512       17.707      13.229
  513       15.890      11.852
  4096      18.176      13.568
  8192      18.088      13.566
  2MiB      18.658      13.496

laptop with i9-9880H:

  offset    turbo on    turbo off
            (GiB/s)     (GiB/s)
  0         33.589      17.160
  8         28.045      14.301
  9         27.582      14.318
  32        31.797      16.711
  512       32.215      16.810
  513       28.864      14.932
  4096      32.503      17.266
  8192      32.871      17.277
  2MiB      32.657      17.262

Seems pretty clear that using 4096 byte alignment is worth it.

Greetings,

Andres Freund

#6 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Thomas Munro (#1)
Re: Direct I/O

On 11/1/22 2:36 AM, Thomas Munro wrote:

Hi,

Here is a patch to allow PostgreSQL to use $SUBJECT. It is from the

This is exciting to see! There are two other items to add to the TODO list
before this would be ready for production:

1) work_mem. This is a significant impediment to scaling shared buffers
the way you'd want to.

2) Clock sweep. Specifically, currently the only thing that drives
usage_count is individual backends running the clock hand. On large
systems with 75% of memory going to shared_buffers, that becomes a very
significant problem, especially when the backend running the clock sweep
is doing so in order to perform an operation like a b-tree page split. I
suspect it shouldn't be too hard to deal with this issue by just having
bgwriter or another bgworker proactively ensuring some reasonable number
of buffers with usage_count=0 exist.

One other thing to be aware of: overflowing an SLRU becomes a massive
problem if there isn't a filesystem backing the SLRU. Obviously only an
issue if you try and apply DIO to SLRU files.

#7 Andres Freund
andres@anarazel.de
In reply to: Jim Nasby (#6)
Re: Direct I/O

Hi,

On 2022-11-04 14:47:31 -0500, Jim Nasby wrote:

On 11/1/22 2:36 AM, Thomas Munro wrote:

Here is a patch to allow PostgreSQL to use $SUBJECT. It is from the

This is exciting to see! There are two other items to add to the TODO list
before this would be ready for production:

1) work_mem. This is a significant impediment to scaling shared buffers the
way you'd want to.

I don't really think that's closely enough related to tackle together. Yes,
it'd be easier to set a large s_b if we had better work_mem management, but
it's a completely distinct problem, and in a lot of cases you could use DIO
without tackling the work_mem issue.

2) Clock sweep. Specifically, currently the only thing that drives
usage_count is individual backends running the clock hand. On large systems
with 75% of memory going to shared_buffers, that becomes a very significant
problem, especially when the backend running the clock sweep is doing so in
order to perform an operation like a b-tree page split. I suspect it
shouldn't be too hard to deal with this issue by just having bgwriter or
another bgworker proactively ensuring some reasonable number of buffers with
usage_count=0 exist.

I agree this isn't great, but I don't think the replacement efficiency is that
big a problem. Replacing the wrong buffers is a bigger issue.

I've run tests with s_b=768GB (IIRC) without it showing up as a major
issue. If you have an extreme replacement rate at such a large s_b you have a
lot of other problems.

I don't want to discourage anybody from tackling the clock replacement issues,
the contrary, but AIO+DIO can show significant wins without those
changes. It's already a humongous project...

One other thing to be aware of: overflowing an SLRU becomes a massive
problem if there isn't a filesystem backing the SLRU. Obviously only an
issue if you try and apply DIO to SLRU files.

Which would be a very bad idea for now.... Thomas does have a patch for moving
them into the main buffer pool.

Greetings,

Andres Freund

#8 John Naylor
john.naylor@enterprisedb.com
In reply to: Thomas Munro (#1)
Re: Direct I/O

On Tue, Nov 1, 2022 at 2:37 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Memory alignment patches:

Direct I/O generally needs to be done to/from VM page-aligned
addresses, but only "standard" 4KB pages, even when larger VM pages
are in use (if there is an exotic system where that isn't true, it
won't work). We need to deal with buffers on the stack, the heap and
in shmem. For the stack, see patch 0001. For the heap and shared
memory, see patch 0002, but David Rowley is going to propose that part
separately, as MemoryContext API adjustments are a specialised enough
topic to deserve another thread; here I include a copy as a
dependency. The main direct I/O patch is 0003.

One thing to note: Currently, a request to aset above 8kB must go into a
dedicated block. Not sure if it's a coincidence that that matches the
default PG page size, but if allocating pages on the heap is hot enough,
maybe we should consider raising that limit. Although then, aligned-to-4kB
requests would result in 16kB chunks requested unless a different allocator
was used.

--
John Naylor
EDB: http://www.enterprisedb.com

#9 Andres Freund
andres@anarazel.de
In reply to: John Naylor (#8)
Re: Direct I/O

Hi,

On 2022-11-10 14:26:20 +0700, John Naylor wrote:

On Tue, Nov 1, 2022 at 2:37 PM Thomas Munro <thomas.munro@gmail.com> wrote:

Memory alignment patches:

Direct I/O generally needs to be done to/from VM page-aligned
addresses, but only "standard" 4KB pages, even when larger VM pages
are in use (if there is an exotic system where that isn't true, it
won't work). We need to deal with buffers on the stack, the heap and
in shmem. For the stack, see patch 0001. For the heap and shared
memory, see patch 0002, but David Rowley is going to propose that part
separately, as MemoryContext API adjustments are a specialised enough
topic to deserve another thread; here I include a copy as a
dependency. The main direct I/O patch is 0003.

One thing to note: Currently, a request to aset above 8kB must go into a
dedicated block. Not sure if it's a coincidence that that matches the
default PG page size, but if allocating pages on the heap is hot enough,
maybe we should consider raising that limit. Although then, aligned-to-4kB
requests would result in 16kB chunks requested unless a different allocator
was used.

With one exception, there's only a small number of places that allocate pages
dynamically and we only do it for a small number of buffers. So I don't think
we should worry too much about this for now.

The one exception to this: GetLocalBufferStorage(). But it already batches
memory allocations by increasing sizes, so I think we're good as well.

Greetings,

Andres Freund

#10 Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#4)
Re: Direct I/O

On Wed, Nov 2, 2022 at 11:54 AM Andres Freund <andres@anarazel.de> wrote:

On 2022-11-02 09:44:30 +1300, Thomas Munro wrote:

On Wed, Nov 2, 2022 at 2:33 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Tue, Nov 01, 2022 at 08:36:18PM +1300, Thomas Munro wrote:

io_data_direct = whether to use O_DIRECT for main data files
io_wal_direct = ... for WAL
io_wal_init_direct = ... for WAL-file initialisation

You added 3 booleans, but I wonder if it's better to add a string GUC
which is parsed for comma separated strings.

Done as io_direct=data,wal,wal_init. Thanks Justin, this is better.
I resisted the urge to invent a meaning for 'on' and 'off', mainly
because it's not clear what values 'on' should enable and it'd be
strange to have off without on, so for now an empty string means off.
I suppose the meaning of this string could evolve over time: the names
of forks, etc.

Perhaps we could use the guc assignment hook to transform the input value into
a bitmask?

Makes sense. The only tricky question was where to store the GUC. I
went for fd.c for now, but it doesn't seem quite right...

DIO is slower, but not so much that it can't run under CI. I suggest to
add an 099 commit to enable the feature during development.

Good idea, will do.

Done. The tests take 2-3x as long depending on the OS.

Might be worth additionally having a short tap test that does some basic
stuff with DIO and leave that enabled? I think it'd be good to have
check-world exercise DIO on dev machines, to reduce the likelihood of finding
problems only in CI, which is somewhat painful.

Done.

Note that this fails under linux with fsanitize=align:
../src/backend/storage/file/buffile.c:117:17: runtime error: member access within misaligned address 0x561a4a8e40f8 for type 'struct BufFile', which requires 4096 byte alignment

Oh, so BufFile is palloc'd and contains one of these. BufFile is not
even using direct I/O, but by these rules it would need to be
palloc_io_align'd. I will think about what to do about that...

It might be worth having two different versions of the struct, so we don't
impose unnecessarily high alignment everywhere?

Done. I now have PGAlignedBlock (unchanged) and PGIOAlignedBlock.
You have to use the latter for SMgr, because I added alignment
assertions there. We might as well use it for any other I/O such as
frontend code too for a chance of a small performance boost as you
showed. For now I have not used PGIOAlignedBlock for BufFile, even
though it would be a great candidate for a potential speedup, only
because I am afraid of adding padding to every BufFile in scenarios
where we allocate many (could be avoided, a subject for separate
research).

V2 comprises:

0001 -- David's palloc_aligned() patch
https://commitfest.postgresql.org/41/3999/
0002 -- I/O-align almost all buffers used for I/O
0003 -- Add the GUCs
0004 -- Throwaway hack to make cfbot turn the GUCs on

Attachments:

v2-0001-Add-allocator-support-for-larger-allocation-align.patch (text/x-patch, +263 −14)
v2-0002-Introduce-PG_IO_ALIGN_SIZE-and-align-all-I-O-buff.patch (text/x-patch, +114 −50)
v2-0003-Add-io_direct-setting-developer-only.patch (text/x-patch, +239 −35)
v2-0004-XXX-turn-on-direct-I-O-by-default-just-for-CI.patch (text/x-patch, +1 −2)
#11 Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#10)
Re: Direct I/O

On Wed, Dec 14, 2022 at 5:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:

0001 -- David's palloc_aligned() patch https://commitfest.postgresql.org/41/3999/
0002 -- I/O-align almost all buffers used for I/O
0003 -- Add the GUCs
0004 -- Throwaway hack to make cfbot turn the GUCs on

David pushed the first as commit 439f6175, so here is a rebase of the
rest. I also fixed a couple of thinkos in the handling of systems
where we don't know how to do direct I/O. In one place I had #ifdef
PG_O_DIRECT, but that's always defined, it's just that it's 0 on
Solaris and OpenBSD, and the check to reject the GUC wasn't quite
right on such systems.

Attachments:

v3-0001-Introduce-PG_IO_ALIGN_SIZE-and-align-all-I-O-buff.patch (text/x-patch, +114 −50)
v3-0002-Add-io_direct-setting-developer-only.patch (text/x-patch, +238 −35)
v3-0003-XXX-turn-on-direct-I-O-by-default-just-for-CI.patch (text/x-patch, +1 −2)
#12 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Thomas Munro (#11)
Re: Direct I/O

On Thu, Dec 22, 2022 at 7:34 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Wed, Dec 14, 2022 at 5:48 PM Thomas Munro <thomas.munro@gmail.com> wrote:

0001 -- David's palloc_aligned() patch https://commitfest.postgresql.org/41/3999/
0002 -- I/O-align almost all buffers used for I/O
0003 -- Add the GUCs
0004 -- Throwaway hack to make cfbot turn the GUCs on

David pushed the first as commit 439f6175, so here is a rebase of the
rest. I also fixed a couple of thinkos in the handling of systems
where we don't know how to do direct I/O. In one place I had #ifdef
PG_O_DIRECT, but that's always defined, it's just that it's 0 on
Solaris and OpenBSD, and the check to reject the GUC wasn't quite
right on such systems.

Thanks. I have some comments on
v3-0002-Add-io_direct-setting-developer-only.patch:

1. I think we don't need to overwrite the io_direct_string in
check_io_direct so that show_io_direct can be avoided.
2. check_io_direct can leak the flags memory - when io_direct is not
supported or for an invalid list syntax or an invalid option is
specified.

I have addressed my review comments as a delta patch on top of v3-0002
and added it here as v1-0001-Review-comments-io_direct-GUC.txt.

Some comments on the tests added:

1. Is there a way to know if Direct IO for WAL and data has been
picked up programmatically? IOW, can we know if the OS page cache is
bypassed? I know an external extension pgfincore which can help here,
but nothing in the core exists AFAICS.
+is('10000', $node->safe_psql('postgres', 'select count(*) from t1'),
"read back from shared");
+is('10000', $node->safe_psql('postgres', 'select * from t2count'),
"read back from local");
+$node->stop('immediate');
2. Can we combine these two append_conf to a single statement?
+$node->append_conf('io_direct', 'data,wal,wal_init');
+$node->append_conf('shared_buffers', '64kB'); # tiny to force I/O

3. A nitpick: Can we split these queries multi-line instead of in a single line?
+$node->safe_psql('postgres', 'begin; create temporary table t2 as
select 1 as i from generate_series(1, 10000); update t2 set i = i;
insert into t2count select count(*) from t2; commit;');

4. I don't think we need to stop the node before the test ends, no?
+$node->stop;
+
+done_testing();

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Review-comments-io_direct-GUC.txt (text/plain, +25 −38)
#13 Thomas Munro
thomas.munro@gmail.com
In reply to: Bharath Rupireddy (#12)
Re: Direct I/O

On Wed, Jan 25, 2023 at 8:57 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

Thanks. I have some comments on
v3-0002-Add-io_direct-setting-developer-only.patch:

1. I think we don't need to overwrite the io_direct_string in
check_io_direct so that show_io_direct can be avoided.

Thanks for looking at this, and sorry for the late response. Yeah, agreed.

2. check_io_direct can leak the flags memory - when io_direct is not
supported or for an invalid list syntax or an invalid option is
specified.

I have addressed my review comments as a delta patch on top of v3-0002
and added it here as v1-0001-Review-comments-io_direct-GUC.txt.

Thanks. Your way is nicer. I merged your patch and added you as a co-author.

Some comments on the tests added:

1. Is there a way to know if Direct IO for WAL and data has been
picked up programmatically? IOW, can we know if the OS page cache is
bypassed? I know an external extension pgfincore which can help here,
but nothing in the core exists AFAICS.

Right, that extension can tell you how many pages are in the kernel
page cache which is quite interesting for this. I also once hacked up
something primitive to see *which* pages are in kernel cache, so I
could join that against pg_buffercache to measure double buffering,
when I was studying the "smile" shape where pgbench TPS goes down and
then back up again as you increase shared_buffers if the working set
is nearly as big as physical memory (code available in a link from
[1]).

Yeah, I agree it might be nice for human investigators to put
something like that in contrib/pg_buffercache, but I'm not sure you
could rely on it enough for an automated test, though, 'cause it
probably won't work on some file systems and the tests would probably
fail for random transient reasons (for example: some systems won't
kick pages out of kernel cache if they were already there, just
because we decided to open the file with O_DIRECT). (I got curious
about why mincore() wasn't standardised along with mmap() and all that
jazz; it seems the BSD and later Sun people who invented all those
interfaces didn't think that one was quite good enough[2], but every
(?) Unixoid OS copied it anyway, with variations... Apparently the
Windows thing is called VirtualQuery()).

2. Can we combine these two append_conf to a single statement?
+$node->append_conf('io_direct', 'data,wal,wal_init');
+$node->append_conf('shared_buffers', '64kB'); # tiny to force I/O

OK, sure, done. And also oops, that was completely wrong and not
working the way I had it in that version...

3. A nitpick: Can we split these queries multi-line instead of in a single line?
+$node->safe_psql('postgres', 'begin; create temporary table t2 as
select 1 as i from generate_series(1, 10000); update t2 set i = i;
insert into t2count select count(*) from t2; commit;');

OK.

4. I don't think we need to stop the node before the test ends, no?
+$node->stop;
+
+done_testing();

Sure, but why not?

Otherwise, I rebased, and made a couple more changes:

I found a line of the manual about wal_sync_method that needed to be removed:

- The <literal>open_</literal>* options also use
<literal>O_DIRECT</literal> if available.

In fact that sentence didn't correctly document the behaviour in
released branches (wal_level=minimal is also required for that, so
probably very few people ever used it). I think we should adjust that
misleading sentence in back-branches, separately from this patch set.

I also updated the commit message to highlight the only expected
user-visible change for this, namely the loss of the above incorrectly
documented obscure special case, which is replaced by the less obscure
new setting io_direct=wal, if someone still wants that behaviour.

Also a few minor comment changes.

[1]: https://twitter.com/MengTangmu/status/994770040745615361
[2]: http://kos.enix.org/pub/gingell8.pdf

Attachments:

v4-0001-Introduce-PG_IO_ALIGN_SIZE-and-align-all-I-O-buff.patch (text/x-patch, +112 −49)
v4-0002-Add-io_direct-setting-developer-only.patch (text/x-patch, +233 −36)
v4-0003-XXX-turn-on-direct-I-O-by-default-just-for-CI.patch (text/x-patch, +1 −2)
#14 Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#13)
Re: Direct I/O

I did some testing with non-default block sizes, and found a few minor
things that needed adjustment. The short version is that I blocked
some configurations that won't work or would break an assertion.
After a bit more copy-editing on docs and comments and a round of
automated indenting, I have now pushed this. I will now watch the
build farm. I tested on quite a few OSes that I have access to, but
this is obviously a very OS-sensitive kind of a thing.

The adjustments were:

1. If you set your BLCKSZ or XLOG_BLCKSZ smaller than
PG_IO_ALIGN_SIZE, you shouldn't be allowed to turn on direct I/O for
the relevant operations, because such undersized direct I/Os will fail
on common systems.

FATAL: invalid value for parameter "io_direct": "wal"
DETAIL: io_direct is not supported for WAL because XLOG_BLCKSZ is too small

FATAL: invalid value for parameter "io_direct": "data"
DETAIL: io_direct is not supported for data because BLCKSZ is too small

In fact some systems would be OK with it, if the true requirement is
512 rather than 4096, but (1) tiny blocks are a niche build option that
doesn't even pass the regression tests, (2) it's hard and totally
unportable to find out the true requirement at runtime, and (3) the
conservative choice of 4096 has the additional benefit of matching
memory pages. So I think a conservative compile-time number is a good
starting position.

2. Previously I had changed the WAL buffer alignment to be the larger
of PG_IO_ALIGN_SIZE and XLOG_BLCKSZ, but in light of the above
thinking I reverted that part: there's no point in aligning the address
of the buffer when the size is too small for direct I/O, and now that
combination is blocked off at the GUC level, so no change is needed
here.

3. I updated the md.c alignment assertions to allow for tiny blocks.
The point of these assertions is to fail if any new code does I/O from
badly aligned buffers even with io_direct turned off (i.e. how most
people hack), because that will fail with io_direct turned on. The
change is that I skip the assertion if you're using BLCKSZ <
PG_IO_ALIGN_SIZE. Such buffers wouldn't work if used for direct I/O,
but that's OK: the GUC won't allow it.

4. I made the language to explain where PG_IO_ALIGN_SIZE really comes
from a little vaguer because it's complex.

#15Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#14)
Re: Direct I/O

On Sat, Apr 8, 2023 at 4:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:

After a bit more copy-editing on docs and comments and a round of
automated indenting, I have now pushed this. I will now watch the
build farm. I tested on quite a few OSes that I have access to, but
this is obviously a very OS-sensitive kind of a thing.

Hmm. I see a strange "invalid page" failure on Andrew's machine crake
in 004_io_direct.pl. Let's see what else comes in.

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#14)
Re: Direct I/O

Thomas Munro <thomas.munro@gmail.com> writes:

I did some testing with non-default block sizes, and found a few minor
things that needed adjustment.

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2023-04-08%2004%3A42%3A04

This seems like another thing that should not have been pushed mere
hours before feature freeze.

regards, tom lane

#17Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#15)
Re: Direct I/O

Hi,

On 2023-04-08 16:59:20 +1200, Thomas Munro wrote:

On Sat, Apr 8, 2023 at 4:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:

After a bit more copy-editing on docs and comments and a round of
automated indenting, I have now pushed this. I will now watch the
build farm. I tested on quite a few OSes that I have access to, but
this is obviously a very OS-sensitive kind of a thing.

Hmm. I see a strange "invalid page" failure on Andrew's machine crake
in 004_io_direct.pl. Let's see what else comes in.

There were some failures in CI (e.g. [1], and perhaps also bf, didn't yet
check) about "no unpinned buffers available". I was worried for a moment
that this could actually be related to the bulk extension patch.

But it looks like it's older - and not caused by direct_io support (except by
way of the test existing). I reproduced the issue locally by setting s_b even
lower, to 16, and making the ERROR a PANIC.

#4 0x00005624dfe90336 in errfinish (filename=0x5624df6867c0 "../../../../home/andres/src/postgresql/src/backend/storage/buffer/freelist.c", lineno=353,
funcname=0x5624df686900 <__func__.6> "StrategyGetBuffer") at ../../../../home/andres/src/postgresql/src/backend/utils/error/elog.c:604
#5 0x00005624dfc71dbe in StrategyGetBuffer (strategy=0x0, buf_state=0x7ffd4182137c, from_ring=0x7ffd4182137b)
at ../../../../home/andres/src/postgresql/src/backend/storage/buffer/freelist.c:353
#6 0x00005624dfc6a922 in GetVictimBuffer (strategy=0x0, io_context=IOCONTEXT_NORMAL)
at ../../../../home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:1601
#7 0x00005624dfc6a29f in BufferAlloc (smgr=0x5624e1ff27f8, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=16, strategy=0x0, foundPtr=0x7ffd418214a3,
io_context=IOCONTEXT_NORMAL) at ../../../../home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:1290
#8 0x00005624dfc69c0c in ReadBuffer_common (smgr=0x5624e1ff27f8, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=16, mode=RBM_NORMAL, strategy=0x0,
hit=0x7ffd4182156b) at ../../../../home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:1056
#9 0x00005624dfc69335 in ReadBufferExtended (reln=0x5624e1ee09f0, forkNum=MAIN_FORKNUM, blockNum=16, mode=RBM_NORMAL, strategy=0x0)
at ../../../../home/andres/src/postgresql/src/backend/storage/buffer/bufmgr.c:776
#10 0x00005624df8eb78a in log_newpage_range (rel=0x5624e1ee09f0, forknum=MAIN_FORKNUM, startblk=0, endblk=45, page_std=false)
at ../../../../home/andres/src/postgresql/src/backend/access/transam/xloginsert.c:1290
#11 0x00005624df9567e7 in smgrDoPendingSyncs (isCommit=true, isParallelWorker=false)
at ../../../../home/andres/src/postgresql/src/backend/catalog/storage.c:837
#12 0x00005624df8d1dd2 in CommitTransaction () at ../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:2225
#13 0x00005624df8d2da2 in CommitTransactionCommand () at ../../../../home/andres/src/postgresql/src/backend/access/transam/xact.c:3060
#14 0x00005624dfcbe0a1 in finish_xact_command () at ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:2779
#15 0x00005624dfcbb867 in exec_simple_query (query_string=0x5624e1eacd98 "create table t1 as select 1 as i from generate_series(1, 10000)")
at ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:1299
#16 0x00005624dfcc09c4 in PostgresMain (dbname=0x5624e1ee40e8 "postgres", username=0x5624e1e6c5f8 "andres")
at ../../../../home/andres/src/postgresql/src/backend/tcop/postgres.c:4623
#17 0x00005624dfbecc03 in BackendRun (port=0x5624e1ed8250) at ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4461
#18 0x00005624dfbec48e in BackendStartup (port=0x5624e1ed8250) at ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:4189
#19 0x00005624dfbe8541 in ServerLoop () at ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1779
#20 0x00005624dfbe7e56 in PostmasterMain (argc=4, argv=0x5624e1e6a520) at ../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1463
#21 0x00005624dfad538b in main (argc=4, argv=0x5624e1e6a520) at ../../../../home/andres/src/postgresql/src/backend/main/main.c:200

If you look at log_newpage_range(), it's not surprising that we get this error
- it pins up to 32 buffers at once.

Afaics log_newpage_range() originates in 9155580fd5fc, but this caller is from
c6b92041d385.

It doesn't really seem OK to me to unconditionally pin 32 buffers. For the
relation extension patch I introduced LimitAdditionalPins() to deal with this
concern. Perhaps it needs to be exposed and log_newpage_range() should use
it?

Do we care about fixing this in the backbranches? Probably not, given there
haven't been user complaints?

Greetings,

Andres Freund

[1]: https://cirrus-ci.com/task/4519721039560704

#18Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#15)
Re: Direct I/O

On Sat, Apr 8, 2023 at 4:59 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Sat, Apr 8, 2023 at 4:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:

After a bit more copy-editing on docs and comments and a round of
automated indenting, I have now pushed this. I will now watch the
build farm. I tested on quite a few OSes that I have access to, but
this is obviously a very OS-sensitive kind of a thing.

Hmm. I see a strange "invalid page" failure on Andrew's machine crake
in 004_io_direct.pl. Let's see what else comes in.

No particular ideas about what happened there yet. It *looks* like we
wrote out a page, and then read it back in very soon afterwards, all
via the usual locked bufmgr/smgr pathways, and it failed basic page
validation. The reader was a parallel worker, because of the
debug_parallel_mode setting on that box. The page number looks
reasonable (I guess it's reading a page created by the UPDATE full of
new tuples, but I don't know which process wrote it). It's also not
immediately obvious how this could be connected to the 32-pinned-buffers
problem (all of that happened in the CREATE TABLE process, which had
already ended before the UPDATE and SELECT backends even started).

Andrew, what file system and type of disk is that machine using?

#19Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#17)
Re: Direct I/O

Hi,

On 2023-04-07 23:04:08 -0700, Andres Freund wrote:

There were some failures in CI (e.g. [1], and perhaps also bf, didn't yet
check) about "no unpinned buffers available". I was worried for a moment
that this could actually be related to the bulk extension patch.

But it looks like it's older - and not caused by direct_io support (except by
way of the test existing). I reproduced the issue locally by setting s_b even
lower, to 16, and making the ERROR a PANIC.

[backtrace]

If you look at log_newpage_range(), it's not surprising that we get this error
- it pins up to 32 buffers at once.

Afaics log_newpage_range() originates in 9155580fd5fc, but this caller is from
c6b92041d385.

It doesn't really seem OK to me to unconditionally pin 32 buffers. For the
relation extension patch I introduced LimitAdditionalPins() to deal with this
concern. Perhaps it needs to be exposed and log_newpage_range() should use
it?

Do we care about fixing this in the backbranches? Probably not, given there
haven't been user complaints?

Here's a quick prototype of this approach. If we expose LimitAdditionalPins(),
we'd probably want to add "Buffer" to the name, and pass it a relation, so
that it can hand off to LimitAdditionalLocalPins() when appropriate? The
callsite in question doesn't need it, but ...

Without the limiting of pins the modified 004_io_direct.pl fails 100% of the
time for me.

Presumably the reason it fails occasionally with 256kB of shared buffers
(i.e. NBuffers=32) is that autovacuum or checkpointer briefly pins a single
buffer. As log_newpage_range() thinks it can just pin 32 buffers
unconditionally, it fails in that case.

Greetings,

Andres Freund

Attachments:

limit-pins.diff (text/x-diff, +11 -6)
#20Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#19)
Re: Direct I/O

Hi,

Given the frequency of failures on this in the buildfarm, I propose the
temporary workaround of using wal_level=replica, which avoids the use of
the over-eager log_newpage_range().

Greetings,

Andres Freund
