Relation bulk write facility

Started by Heikki Linnakangas over 2 years ago, 54 messages
#1 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com

Several places bypass the buffer manager and use direct smgrextend()
calls to populate a new relation: Index AM build methods, rewriteheap.c
and RelationCopyStorage(). There's a fair amount of duplicated code to
WAL-log the pages, calculate checksums, call smgrextend(), and finally
call smgrimmedsync() if needed. The duplication is tedious and
error-prone. For example, if we want to optimize by WAL-logging multiple
pages in one record, that needs to be implemented in each AM separately.
Currently only the sorted GiST index build does that, but it would be equally
beneficial in all of those places.

And I believe we got the smgrimmedsync() logic slightly wrong in a
number of places [1]. And it's not great for latency; we could let the
checkpointer do the fsyncing lazily, like Robert mentioned in the same
thread.

The attached patch centralizes that pattern to a new bulk writing
facility, and changes all those AMs to use it. The facility buffers 32
pages, WAL-logs them in one record, and calculates checksums. You could
imagine a lot of further optimizations, like writing those 32 pages in
one vectored pwritev() call [2], and not skipping the buffer manager
when the relation is small. But the scope of this initial version is
mostly to refactor the existing code.

One new optimization included here is to let the checkpointer do the
fsyncing if possible. That gives a big speedup when e.g. restoring a
schema-only dump with lots of relations.
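
For illustration, the buffering pattern the facility centralizes can be
sketched in standalone C. This is not the patch's code: the names, the batch
size handling, and the flush logic are stand-ins, and the real facility also
WAL-logs the whole batch in one record, sets each page's checksum, and calls
smgrextend():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ      8192
#define MAX_BUFFERED 32			/* pages written out per batch */

typedef struct PendingPage
{
	int			blkno;
	char		data[PAGE_SZ];
} PendingPage;

typedef struct BulkState
{
	PendingPage pending[MAX_BUFFERED];
	int			npending;
	int			pages_written;	/* stands in for the smgrextend() calls */
} BulkState;

static int
blkno_cmp(const void *a, const void *b)
{
	return ((const PendingPage *) a)->blkno - ((const PendingPage *) b)->blkno;
}

/*
 * Flush the batch: sort by block number so the writes happen in sequential
 * order.  The real facility would emit one WAL record covering all the
 * buffered pages, checksum them, and smgrextend() them here.
 */
static void
bulk_flush(BulkState *st)
{
	qsort(st->pending, st->npending, sizeof(PendingPage), blkno_cmp);
	st->pages_written += st->npending;
	st->npending = 0;
}

/* Queue one page; flush automatically when the batch is full. */
static void
bulk_write(BulkState *st, int blkno, const char *page)
{
	if (st->npending == MAX_BUFFERED)
		bulk_flush(st);
	st->pending[st->npending].blkno = blkno;
	memcpy(st->pending[st->npending].data, page, PAGE_SZ);
	st->npending++;
}

/* Demo helper: write npages pages and return how many reached "disk". */
static int
bulk_demo(int npages)
{
	static BulkState st;
	char		page[PAGE_SZ] = {0};

	memset(&st, 0, sizeof(st));
	for (int blkno = 0; blkno < npages; blkno++)
		bulk_write(&st, blkno, page);
	bulk_flush(&st);			/* flush the final partial batch */
	return st.pages_written;
}
```

The batch size of 32 in the patch comes from XLR_MAX_BLOCK_ID, the maximum
number of blocks one WAL record can reference.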

[1]: /messages/by-id/58effc10-c160-b4a6-4eb7-384e95e6f9e3@iki.fi

[2]: /messages/by-id/CA+hUKGJkOiOCa+mag4BF+zHo7qo=o9CFheB8=g6uT5TUm2gkvA@mail.gmail.com

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v1-0001-Introduce-a-new-bulk-loading-facility.patch (text/x-patch, +535 -306)
#2 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#1)
Re: Relation bulk write facility

On 19/09/2023 17:13, Heikki Linnakangas wrote:

The attached patch centralizes that pattern to a new bulk writing
facility, and changes all those AMs to use it.

Here's a new rebased version of the patch.

This includes fixes to the pageinspect regression test. They were
explained in the commit message, but I forgot to include the actual test
changes. That should fix the cfbot failures.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v2-0001-Introduce-a-new-bulk-loading-facility.patch (text/x-patch, +543 -314)
#3 Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#2)
Re: Relation bulk write facility

Hi,

On 2023-11-17 11:37:21 +0100, Heikki Linnakangas wrote:

The new facility makes it easier to optimize bulk loading, as the
logic for buffering, WAL-logging, and syncing the relation only needs
to be implemented once. It's also less error-prone: We have had a
number of bugs in how a relation is fsync'd - or not - at the end of a
bulk loading operation. By centralizing that logic to one place, we
only need to write it correctly once.

One thing I'd like to use the centralized handling for is to track such
writes in pg_stat_io. I don't mean as part of the initial patch, just that
that's another reason I like the facility.

The new facility is faster for small relations: Instead of calling
smgrimmedsync(), we register the fsync to happen at next checkpoint,
which avoids the fsync latency. That can make a big difference if you
are e.g. restoring a schema-only dump with lots of relations.

I think this would be a huge win for people running their application tests
against postgres.

+	bulkw = bulkw_start_smgr(dst, forkNum, use_wal);
+
nblocks = smgrnblocks(src, forkNum);

for (blkno = 0; blkno < nblocks; blkno++)
{
+ Page page;
+
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();

-		smgrread(src, forkNum, blkno, buf.data);
+		page = bulkw_alloc_buf(bulkw);
+		smgrread(src, forkNum, blkno, page);
if (!PageIsVerifiedExtended(page, blkno,
PIV_LOG_WARNING | PIV_REPORT_STAT))
@@ -511,30 +514,9 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* page this is, so we have to log the full page including any unused
* space.
*/
-		if (use_wal)
-			log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
-
-		PageSetChecksumInplace(page, blkno);
-
-		/*
-		 * Now write the page.  We say skipFsync = true because there's no
-		 * need for smgr to schedule an fsync for this write; we'll do it
-		 * ourselves below.
-		 */
-		smgrextend(dst, forkNum, blkno, buf.data, true);
+		bulkw_write(bulkw, blkno, page, false);

I wonder if bulkw_alloc_buf() is a good name - if you naively read this
change, it looks like it'll just leak memory. It also might be taken to be
valid until freed, which I don't think is the case?

+/*-------------------------------------------------------------------------
+ *
+ * bulk_write.c
+ *	  Efficiently and reliably populate a new relation
+ *
+ * The assumption is that no other backends access the relation while we are
+ * loading it, so we can take some shortcuts.  Alternatively, you can use the
+ * buffer manager as usual, if performance is not critical, but you must not
+ * mix operations through the buffer manager and the bulk loading interface at
+ * the same time.

From "Alternatively" onward this is somewhat confusing.

+ * We bypass the buffer manager to avoid the locking overhead, and call
+ * smgrextend() directly.  A downside is that the pages will need to be
+ * re-read into shared buffers on first use after the build finishes.  That's
+ * usually a good tradeoff for large relations, and for small relations, the
+ * overhead isn't very significant compared to creating the relation in the
+ * first place.

FWIW, I doubt the "isn't very significant" bit is really true.

+ * One tricky point is that because we bypass the buffer manager, we need to
+ * register the relation for fsyncing at the next checkpoint ourselves, and
+ * make sure that the relation is correctly fsync by us or the checkpointer
+ * even if a checkpoint happens concurrently.

"fsync'ed" or such? Otherwise this reads awkwardly for me.

+ *
+ *
+ * Portions Copyright (c) 1996-2023, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/storage/smgr/bulk_write.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/xloginsert.h"
+#include "access/xlogrecord.h"
+#include "storage/bufmgr.h"
+#include "storage/bufpage.h"
+#include "storage/bulk_write.h"
+#include "storage/proc.h"
+#include "storage/smgr.h"
+#include "utils/rel.h"
+
+#define MAX_BUFFERED_PAGES XLR_MAX_BLOCK_ID
+
+typedef struct BulkWriteBuffer
+{
+	Page		page;
+	BlockNumber blkno;
+	bool		page_std;
+	int16		order;
+} BulkWriteBuffer;
+

The name makes it sound like this struct itself contains a buffer - but it's
just pointing to one. *BufferRef or such maybe?

I was wondering how you dealt with the alignment of buffers given the struct
definition, which is what led me to look at this...

+/*
+ * Bulk writer state for one relation fork.
+ */
+typedef struct BulkWriteState
+{
+	/* Information about the target relation we're writing */
+	SMgrRelation smgr;

Isn't there a danger of this becoming a dangling pointer? At least until
/messages/by-id/CA+hUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
is merged?

+	ForkNumber	forknum;
+	bool		use_wal;
+
+	/* We keep several pages buffered, and WAL-log them in batches */
+	int			nbuffered;
+	BulkWriteBuffer buffers[MAX_BUFFERED_PAGES];
+
+	/* Current size of the relation */
+	BlockNumber pages_written;
+
+	/* The RedoRecPtr at the time that the bulk operation started */
+	XLogRecPtr	start_RedoRecPtr;
+
+	Page		zeropage;		/* workspace for filling zeroes */

We really should just have one such page in shared memory somewhere... For WAL
writes as well.

But until then - why do you allocate the page? Seems like we could just use a
static global variable?

+/*
+ * Write all buffered pages to disk.
+ */
+static void
+bulkw_flush(BulkWriteState *bulkw)
+{
+	int			nbuffered = bulkw->nbuffered;
+	BulkWriteBuffer *buffers = bulkw->buffers;
+
+	if (nbuffered == 0)
+		return;
+
+	if (nbuffered > 1)
+	{
+		int			o;
+
+		qsort(buffers, nbuffered, sizeof(BulkWriteBuffer), buffer_cmp);
+
+		/*
+		 * Eliminate duplicates, keeping the last write of each block.
+		 * (buffer_cmp uses 'order' as the last sort key)
+		 */

Huh - which use cases would actually cause duplicate writes?

Greetings,

Andres Freund

#4 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andres Freund (#3)
Re: Relation bulk write facility

On 19/11/2023 02:04, Andres Freund wrote:

On 2023-11-17 11:37:21 +0100, Heikki Linnakangas wrote:

The new facility makes it easier to optimize bulk loading, as the
logic for buffering, WAL-logging, and syncing the relation only needs
to be implemented once. It's also less error-prone: We have had a
number of bugs in how a relation is fsync'd - or not - at the end of a
bulk loading operation. By centralizing that logic to one place, we
only need to write it correctly once.

One thing I'd like to use the centralized handling for is to track such
writes in pg_stat_io. I don't mean as part of the initial patch, just that
that's another reason I like the facility.

Oh I didn't realize they're not counted at the moment.

+	bulkw = bulkw_start_smgr(dst, forkNum, use_wal);
+
nblocks = smgrnblocks(src, forkNum);

for (blkno = 0; blkno < nblocks; blkno++)
{
+ Page page;
+
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();

-		smgrread(src, forkNum, blkno, buf.data);
+		page = bulkw_alloc_buf(bulkw);
+		smgrread(src, forkNum, blkno, page);
if (!PageIsVerifiedExtended(page, blkno,
PIV_LOG_WARNING | PIV_REPORT_STAT))
@@ -511,30 +514,9 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* page this is, so we have to log the full page including any unused
* space.
*/
-		if (use_wal)
-			log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
-
-		PageSetChecksumInplace(page, blkno);
-
-		/*
-		 * Now write the page.  We say skipFsync = true because there's no
-		 * need for smgr to schedule an fsync for this write; we'll do it
-		 * ourselves below.
-		 */
-		smgrextend(dst, forkNum, blkno, buf.data, true);
+		bulkw_write(bulkw, blkno, page, false);

I wonder if bulkw_alloc_buf() is a good name - if you naively read this
change, it looks like it'll just leak memory. It also might be taken to be
valid until freed, which I don't think is the case?

Yeah, I'm not very happy with this interface. The model is that you get
a buffer to write to by calling bulkw_alloc_buf(). Later, you hand it
over to bulkw_write(), which takes ownership of it and frees it later.
There is no other function to free it, although currently the buffer is
just palloc'd so you could call pfree on it.

However, I'd like to not expose that detail to the callers. I'm
imagining that in the future we might optimize further, by having a
larger e.g. 1 MB buffer, and carve the 8kB blocks from that. Then
opportunistically, if you fill the buffers sequentially, bulk_write.c
could do one smgrextend() to write the whole 1 MB chunk.
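
To make that concrete, carving fixed 8 kB blocks out of one larger chunk
could look roughly like this (a hypothetical sketch, none of these names are
in the patch):

```c
#include <stddef.h>

#define BLCKSZ		 8192
#define CHUNK_BLOCKS 128			/* 128 x 8 kB = 1 MB */

typedef struct BufChunk
{
	char		data[CHUNK_BLOCKS * BLCKSZ];
	int			next_free;			/* next unused block in the chunk */
} BufChunk;

/*
 * Hand out the next 8 kB block from the chunk, or NULL when it is full.
 * Consecutive calls return adjacent blocks, so a caller that fills them
 * in order could write the whole chunk with a single smgrextend().
 */
static char *
chunk_get_block(BufChunk *chunk)
{
	if (chunk->next_free >= CHUNK_BLOCKS)
		return NULL;
	return chunk->data + (size_t) chunk->next_free++ * BLCKSZ;
}

/* Demo: two consecutive blocks are adjacent in memory. */
static int
chunk_blocks_are_contiguous(void)
{
	static BufChunk chunk;
	char	   *a = chunk_get_block(&chunk);
	char	   *b = chunk_get_block(&chunk);

	return a != NULL && b == a + BLCKSZ;
}

/* Demo: the chunk yields exactly CHUNK_BLOCKS blocks before running out. */
static int
chunk_capacity(void)
{
	static BufChunk chunk;
	int			n = 0;

	while (chunk_get_block(&chunk) != NULL)
		n++;
	return n;
}
```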

I renamed it to bulkw_get_buf() now, and made it return a new
BulkWriteBuffer typedef instead of a plain Page. The point of the new
typedef is to distinguish a buffer returned by bulkw_get_buf() from a
Page or char[BLCKSZ] that you might palloc on your own. That indeed
revealed some latent bugs in gistbuild.c where I had mixed up buffers
returned by bulkw_alloc_buf() and palloc'd buffers.

(The previous version of this patch called a different struct
BulkWriteBuffer, but I renamed that to PendingWrite; see below. Don't be
confused!)

I think this helps a little, but I'm still not very happy with it. I'll
give it some more thought after sleeping over it, but in the meanwhile,
I'm all ears for suggestions.

+/*-------------------------------------------------------------------------
+ *
+ * bulk_write.c
+ *	  Efficiently and reliably populate a new relation
+ *
+ * The assumption is that no other backends access the relation while we are
+ * loading it, so we can take some shortcuts.  Alternatively, you can use the
+ * buffer manager as usual, if performance is not critical, but you must not
+ * mix operations through the buffer manager and the bulk loading interface at
+ * the same time.

From "Alternatively" onward this is somewhat confusing.

Rewrote that to just "Do not mix operations through the regular buffer
manager and the bulk loading interface!"

+ * One tricky point is that because we bypass the buffer manager, we need to
+ * register the relation for fsyncing at the next checkpoint ourselves, and
+ * make sure that the relation is correctly fsync by us or the checkpointer
+ * even if a checkpoint happens concurrently.

"fsync'ed" or such? Otherwise this reads awkwardly for me.

Yep, fixed.

+typedef struct BulkWriteBuffer
+{
+	Page		page;
+	BlockNumber blkno;
+	bool		page_std;
+	int16		order;
+} BulkWriteBuffer;
+

The name makes it sound like this struct itself contains a buffer - but it's
just pointing to one. *BufferRef or such maybe?

I was wondering how you dealt with the alignment of buffers given the struct
definition, which is what led me to look at this...

I renamed this to PendingWrite, and the field that holds these
"pending_writes". Think of it as a queue of writes that haven't been
performed yet.

+/*
+ * Bulk writer state for one relation fork.
+ */
+typedef struct BulkWriteState
+{
+	/* Information about the target relation we're writing */
+	SMgrRelation smgr;

Isn't there a danger of this becoming a dangling pointer? At least until
/messages/by-id/CA+hUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
is merged?

Good point. I just added a FIXME comment to remind about that, hoping
that that patch gets merged soon. If not, I'll come up with a different fix.

+	ForkNumber	forknum;
+	bool		use_wal;
+
+	/* We keep several pages buffered, and WAL-log them in batches */
+	int			nbuffered;
+	BulkWriteBuffer buffers[MAX_BUFFERED_PAGES];
+
+	/* Current size of the relation */
+	BlockNumber pages_written;
+
+	/* The RedoRecPtr at the time that the bulk operation started */
+	XLogRecPtr	start_RedoRecPtr;
+
+	Page		zeropage;		/* workspace for filling zeroes */

We really should just have one such page in shared memory somewhere... For WAL
writes as well.

But until then - why do you allocate the page? Seems like we could just use a
static global variable?

I made it a static global variable for now. (The palloc way was copied
over from nbtsort.c)

+/*
+ * Write all buffered pages to disk.
+ */
+static void
+bulkw_flush(BulkWriteState *bulkw)
+{
+	int			nbuffered = bulkw->nbuffered;
+	BulkWriteBuffer *buffers = bulkw->buffers;
+
+	if (nbuffered == 0)
+		return;
+
+	if (nbuffered > 1)
+	{
+		int			o;
+
+		qsort(buffers, nbuffered, sizeof(BulkWriteBuffer), buffer_cmp);
+
+		/*
+		 * Eliminate duplicates, keeping the last write of each block.
+		 * (buffer_cmp uses 'order' as the last sort key)
+		 */

Huh - which use cases would actually cause duplicate writes?

Hmm, nothing anymore, I guess. Many AMs used to write zero pages as a
placeholder and come back to fill them in later, but bulk_write.c now
handles that itself.

Removed that, and replaced it with an assertion in buffer_cmp()
that there are no duplicates.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v3-0001-Introduce-a-new-bulk-loading-facility.patch (text/x-patch, +556 -338)
#5 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#4)
Re: Relation bulk write facility

Melanie just reminded me about an older thread about this same thing:
/messages/by-id/CAAKRu_ZQEpk6Q1WtNLgfXBdCmdU5xN_w0boVO6faO_Ax+ckjig@mail.gmail.com.
I had completely forgotten about that.

Melanie's patches in that thread implemented the same optimization of
avoiding the fsync() if no checkpoint has happened during the index
build. My patch here also implements batching the WAL records of
multiple blocks, which was not part of those older patches. OTOH, those
patches included an additional optimization of not bypassing the shared
buffer cache if the index is small. That seems sensible too.

In this new patch, I subconsciously implemented an API close to what I
suggested at the end of that old thread.

So I'd like to continue this effort based on this new patch. We can add
the bypass-buffer-cache optimization later on top of this. With the new
API that this introduces, it should be an isolated change to the
implementation, with no changes required to the callers.

--
Heikki Linnakangas
Neon (https://neon.tech)

#6 vignesh C
vignesh21@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: Relation bulk write facility

On Sat, 25 Nov 2023 at 06:49, Heikki Linnakangas <hlinnaka@iki.fi> wrote:

On 19/11/2023 02:04, Andres Freund wrote:

On 2023-11-17 11:37:21 +0100, Heikki Linnakangas wrote:

The new facility makes it easier to optimize bulk loading, as the
logic for buffering, WAL-logging, and syncing the relation only needs
to be implemented once. It's also less error-prone: We have had a
number of bugs in how a relation is fsync'd - or not - at the end of a
bulk loading operation. By centralizing that logic to one place, we
only need to write it correctly once.

One thing I'd like to use the centralized handling for is to track such
writes in pg_stat_io. I don't mean as part of the initial patch, just that
that's another reason I like the facility.

Oh I didn't realize they're not counted at the moment.

+    bulkw = bulkw_start_smgr(dst, forkNum, use_wal);
+
nblocks = smgrnblocks(src, forkNum);

for (blkno = 0; blkno < nblocks; blkno++)
{
+ Page page;
+
/* If we got a cancel signal during the copy of the data, quit */
CHECK_FOR_INTERRUPTS();

-            smgrread(src, forkNum, blkno, buf.data);
+            page = bulkw_alloc_buf(bulkw);
+            smgrread(src, forkNum, blkno, page);
if (!PageIsVerifiedExtended(page, blkno,
PIV_LOG_WARNING | PIV_REPORT_STAT))
@@ -511,30 +514,9 @@ RelationCopyStorage(SMgrRelation src, SMgrRelation dst,
* page this is, so we have to log the full page including any unused
* space.
*/
-            if (use_wal)
-                    log_newpage(&dst->smgr_rlocator.locator, forkNum, blkno, page, false);
-
-            PageSetChecksumInplace(page, blkno);
-
-            /*
-             * Now write the page.  We say skipFsync = true because there's no
-             * need for smgr to schedule an fsync for this write; we'll do it
-             * ourselves below.
-             */
-            smgrextend(dst, forkNum, blkno, buf.data, true);
+            bulkw_write(bulkw, blkno, page, false);

I wonder if bulkw_alloc_buf() is a good name - if you naively read this
change, it looks like it'll just leak memory. It also might be taken to be
valid until freed, which I don't think is the case?

Yeah, I'm not very happy with this interface. The model is that you get
a buffer to write to by calling bulkw_alloc_buf(). Later, you hand it
over to bulkw_write(), which takes ownership of it and frees it later.
There is no other function to free it, although currently the buffer is
just palloc'd so you could call pfree on it.

However, I'd like to not expose that detail to the callers. I'm
imagining that in the future we might optimize further, by having a
larger e.g. 1 MB buffer, and carve the 8kB blocks from that. Then
opportunistically, if you fill the buffers sequentially, bulk_write.c
could do one smgrextend() to write the whole 1 MB chunk.

I renamed it to bulkw_get_buf() now, and made it return a new
BulkWriteBuffer typedef instead of a plain Page. The point of the new
typedef is to distinguish a buffer returned by bulkw_get_buf() from a
Page or char[BLCKSZ] that you might palloc on your own. That indeed
revealed some latent bugs in gistbuild.c where I had mixed up buffers
returned by bulkw_alloc_buf() and palloc'd buffers.

(The previous version of this patch called a different struct
BulkWriteBuffer, but I renamed that to PendingWrite; see below. Don't be
confused!)

I think this helps a little, but I'm still not very happy with it. I'll
give it some more thought after sleeping over it, but in the meanwhile,
I'm all ears for suggestions.

+/*-------------------------------------------------------------------------
+ *
+ * bulk_write.c
+ *    Efficiently and reliably populate a new relation
+ *
+ * The assumption is that no other backends access the relation while we are
+ * loading it, so we can take some shortcuts.  Alternatively, you can use the
+ * buffer manager as usual, if performance is not critical, but you must not
+ * mix operations through the buffer manager and the bulk loading interface at
+ * the same time.

From "Alternatively" onward this is somewhat confusing.

Rewrote that to just "Do not mix operations through the regular buffer
manager and the bulk loading interface!"

+ * One tricky point is that because we bypass the buffer manager, we need to
+ * register the relation for fsyncing at the next checkpoint ourselves, and
+ * make sure that the relation is correctly fsync by us or the checkpointer
+ * even if a checkpoint happens concurrently.

"fsync'ed" or such? Otherwise this reads awkwardly for me.

Yep, fixed.

+typedef struct BulkWriteBuffer
+{
+    Page            page;
+    BlockNumber blkno;
+    bool            page_std;
+    int16           order;
+} BulkWriteBuffer;
+

The name makes it sound like this struct itself contains a buffer - but it's
just pointing to one. *BufferRef or such maybe?

I was wondering how you dealt with the alignment of buffers given the struct
definition, which is what led me to look at this...

I renamed this to PendingWrite, and the field that holds these
"pending_writes". Think of it as a queue of writes that haven't been
performed yet.

+/*
+ * Bulk writer state for one relation fork.
+ */
+typedef struct BulkWriteState
+{
+    /* Information about the target relation we're writing */
+    SMgrRelation smgr;

Isn't there a danger of this becoming a dangling pointer? At least until
/messages/by-id/CA+hUKGJ8NTvqLHz6dqbQnt2c8XCki4r2QvXjBQcXpVwxTY_pvA@mail.gmail.com
is merged?

Good point. I just added a FIXME comment to remind about that, hoping
that that patch gets merged soon. If not, I'll come up with a different fix.

+    ForkNumber      forknum;
+    bool            use_wal;
+
+    /* We keep several pages buffered, and WAL-log them in batches */
+    int                     nbuffered;
+    BulkWriteBuffer buffers[MAX_BUFFERED_PAGES];
+
+    /* Current size of the relation */
+    BlockNumber pages_written;
+
+    /* The RedoRecPtr at the time that the bulk operation started */
+    XLogRecPtr      start_RedoRecPtr;
+
+    Page            zeropage;               /* workspace for filling zeroes */

We really should just have one such page in shared memory somewhere... For WAL
writes as well.

But until then - why do you allocate the page? Seems like we could just use a
static global variable?

I made it a static global variable for now. (The palloc way was copied
over from nbtsort.c)

+/*
+ * Write all buffered pages to disk.
+ */
+static void
+bulkw_flush(BulkWriteState *bulkw)
+{
+    int                     nbuffered = bulkw->nbuffered;
+    BulkWriteBuffer *buffers = bulkw->buffers;
+
+    if (nbuffered == 0)
+            return;
+
+    if (nbuffered > 1)
+    {
+            int                     o;
+
+            qsort(buffers, nbuffered, sizeof(BulkWriteBuffer), buffer_cmp);
+
+            /*
+             * Eliminate duplicates, keeping the last write of each block.
+             * (buffer_cmp uses 'order' as the last sort key)
+             */

Huh - which use cases would actually cause duplicate writes?

Hmm, nothing anymore, I guess. Many AMs used to write zero pages as a
placeholder and come back to fill them in later, but bulk_write.c now
handles that itself.

Removed that, and replaced it with an assertion in buffer_cmp()
that there are no duplicates.

There are a few compilation errors reported by CFBot at [1]; the patch needs
to be rebased:
[02:38:12.675] In file included from ../../../../src/include/postgres.h:45,
[02:38:12.675] from nbtsort.c:41:
[02:38:12.675] nbtsort.c: In function ‘_bt_load’:
[02:38:12.675] nbtsort.c:1309:57: error: ‘BTPageState’ has no member
named ‘btps_page’
[02:38:12.675] 1309 | Assert(dstate->maxpostingsize <=
BTMaxItemSize(state->btps_page) &&
[02:38:12.675] | ^~
[02:38:12.675] ../../../../src/include/c.h:864:9: note: in definition
of macro ‘Assert’
[02:38:12.675] 864 | if (!(condition)) \
[02:38:12.675] | ^~~~~~~~~
[02:38:12.675] ../../../../src/include/c.h:812:29: note: in expansion
of macro ‘TYPEALIGN_DOWN’
[02:38:12.675] 812 | #define MAXALIGN_DOWN(LEN)
TYPEALIGN_DOWN(MAXIMUM_ALIGNOF, (LEN))
[02:38:12.675] | ^~~~~~~~~~~~~~
[02:38:12.675] ../../../../src/include/access/nbtree.h:165:3: note: in
expansion of macro ‘MAXALIGN_DOWN’
[02:38:12.675] 165 | (MAXALIGN_DOWN((PageGetPageSize(page) - \

[1]: https://cirrus-ci.com/task/5299954164432896

Regards,
Vignesh

#7 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: vignesh C (#6)
Re: Relation bulk write facility

On 09/01/2024 08:50, vignesh C wrote:

There are a few compilation errors reported by CFBot at [1]; the patch needs
to be rebased:

Here you go.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v4-0001-Introduce-a-new-bulk-loading-facility.patch (text/x-patch, +553 -340)
#8 Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#4)
Re: Relation bulk write facility

On Fri, Nov 24, 2023 at 10:22 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Yeah, I'm not very happy with this interface. The model is that you get
a buffer to write to by calling bulkw_alloc_buf(). Later, you hand it
over to bulkw_write(), which takes ownership of it and frees it later.
There is no other function to free it, although currently the buffer is
just palloc'd so you could call pfree on it.

I think we should try to pick prefixes that are one or more words
rather than using word fragments. bulkw is an awkward prefix even for
people whose first language is English, and probably more awkward for
others.

--
Robert Haas
EDB: http://www.enterprisedb.com

#9 Peter Smith
smithpb2250@gmail.com
In reply to: Heikki Linnakangas (#7)
Re: Relation bulk write facility

2024-01 Commitfest.

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there was a CFbot test failure last time it was run [2]. Please have a
look and post an updated version if necessary.

======
[1]: https://commitfest.postgresql.org/46/4575/
[2]: https://cirrus-ci.com/task/4990764426461184

Kind Regards,
Peter Smith.

#10 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Peter Smith (#9)
Re: Relation bulk write facility

On 10/01/2024 18:17, Robert Haas wrote:

I think we should try to pick prefixes that are one or more words
rather than using word fragments. bulkw is an awkward prefix even for
people whose first language is English, and probably more awkward for
others.

Renamed the 'bulkw' variables to 'bulkstate', and the functions to have
an smgr_bulk_* prefix.

I was tempted to use just bulk_* as the prefix, but I'm afraid e.g.
bulk_write() is too generic.

On 22/01/2024 07:50, Peter Smith wrote:

Hi, This patch has a CF status of "Needs Review" [1], but it seems
there was a CFbot test failure last time it was run [2]. Please have a
look and post an updated version if necessary.

Fixed the headerscheck failure by adding appropriate #includes.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v5-0001-Introduce-a-new-bulk-loading-facility.patch (text/x-patch, +558 -340)
#11 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#10)
Re: Relation bulk write facility

Committed this. Thanks everyone!

--
Heikki Linnakangas
Neon (https://neon.tech)

#12 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Heikki Linnakangas (#11)
Re: Relation bulk write facility

On 23/02/2024 16:27, Heikki Linnakangas wrote:

Committed this. Thanks everyone!

Buildfarm animals 'sifaka' and 'longfin' are not happy, I will investigate..

--
Heikki Linnakangas
Neon (https://neon.tech)

#13 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#12)
Re: Relation bulk write facility

Heikki Linnakangas <hlinnaka@iki.fi> writes:

Buildfarm animals 'sifaka' and 'longfin' are not happy, I will investigate..

Those are mine, let me know if you need local investigation.

regards, tom lane

#14 Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#13)
Re: Relation bulk write facility

On 23/02/2024 17:15, Tom Lane wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

Buildfarm animals 'sifaka' and 'longfin' are not happy, I will investigate..

Those are mine, let me know if you need local investigation.

Thanks, the error message was clear enough:

bulk_write.c:78:3: error: redefinition of typedef 'BulkWriteState' is a C11 feature [-Werror,-Wtypedef-redefinition]
} BulkWriteState;
^
../../../../src/include/storage/bulk_write.h:20:31: note: previous definition is here
typedef struct BulkWriteState BulkWriteState;
^
1 error generated.

Fixed now, but I'm a bit surprised that neither other buildfarm members nor
cirrus CI caught that. I also tried to reproduce it locally by adding
-Wtypedef-redefinition, but my version of clang didn't produce any
warnings. Are there any extra compiler options on those animals or
something?
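
For reference, the construct behind that error boils down to something like
the following, with Foo standing in for BulkWriteState (a minimal
illustration, not the actual code): redefining a typedef is only legal as of
C11, so clang with -std=gnu99 -Wtypedef-redefinition rejects it while default
build settings accept it.

```c
/* Forward typedef, as storage/bulk_write.h does: */
typedef struct Foo Foo;

/*
 * Completing the struct with a second typedef name, as bulk_write.c did.
 * This typedef redefinition is a C11 feature; it compiles cleanly under
 * C11 and later, but not under -std=gnu99 with -Wtypedef-redefinition.
 */
typedef struct Foo
{
	int			x;
} Foo;

static int
foo_value(const Foo *f)
{
	return f->x;
}

/* Demo: the type works normally once defined. */
static int
demo(void)
{
	Foo			f = {42};

	return foo_value(&f);
}
```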

--
Heikki Linnakangas
Neon (https://neon.tech)

#15 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#14)
Re: Relation bulk write facility

Heikki Linnakangas <hlinnaka@iki.fi> writes:

Thanks, the error message was clear enough:

bulk_write.c:78:3: error: redefinition of typedef 'BulkWriteState' is a C11 feature [-Werror,-Wtypedef-redefinition]
} BulkWriteState;

Fixed now, but I'm a bit surprised other buildfarm members nor cirrus CI
caught that. I also tried to reproduce it locally by adding
-Wtypedef-redefinition, but my version of clang didn't produce any
warnings. Are there any extra compiler options on those animals or
something?

They use Apple's standard compiler (clang 15 or so), but

'CC' => 'ccache clang -std=gnu99',

so maybe the -std has something to do with it. I installed that
(or -std=gnu90 as appropriate to branch) on most of my build
setups back when we started the C99 push.

regards, tom lane

#16 Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#11)
Re: Relation bulk write facility

On Fri, Feb 23, 2024 at 04:27:34PM +0200, Heikki Linnakangas wrote:

Committed this. Thanks everyone!

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2024-02-24%2015%3A13%3A14 got:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 43188608

with this stack trace:
#5 0x10005cf0 in ExceptionalCondition (conditionName=0x1015d790 <XLogBeginInsert+80> "`", fileName=0x0, lineNumber=16780064) at assert.c:66
#6 0x102daba8 in mdextend (reln=0x1042628c <PageSetChecksumInplace+44>, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at md.c:472
#7 0x102d6760 in smgrextend (reln=0x306e6670, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at smgr.c:541
#8 0x104c8dac in smgr_bulk_flush (bulkstate=0x306e6000) at bulk_write.c:245
#9 0x107baf24 in _bt_blwritepage (wstate=0x100d0a14 <datum_image_eq@AF65_7+404>, buf=0x304f13b0, blkno=807631240) at nbtsort.c:638
#10 0x107bccd8 in _bt_buildadd (wstate=0x104c9384 <smgr_bulk_start_rel+132>, state=0x304eb190, itup=0xe10, truncextra=805686672) at nbtsort.c:984
#11 0x107bc86c in _bt_sort_dedup_finish_pending (wstate=0x3b6, state=0x19, dstate=0x3) at nbtsort.c:1036
#12 0x107bc188 in _bt_load (wstate=0x10, btspool=0x0, btspool2=0x0) at nbtsort.c:1331
#13 0x107bd4ec in _bt_leafbuild (btspool=0x101589fc <ProcessInvalidationMessages+188>, btspool2=0x0) at nbtsort.c:571
#14 0x107be028 in btbuild (heap=0x304d2a00, index=0x4e1f, indexInfo=0x3) at nbtsort.c:329
#15 0x1013538c in index_build (heapRelation=0x2, indexRelation=0x10bdc518 <getopt_long+2464664>, indexInfo=0x2, isreindex=10, parallel=false) at index.c:3047
#16 0x101389e0 in index_create (heapRelation=0x1001aac0 <palloc+192>, indexRelationName=0x20 <error: Cannot access memory at address 0x20>, indexRelationId=804393376, parentIndexRelid=805686672,
parentConstraintId=268544704, relFileNumber=805309688, indexInfo=0x3009a328, indexColNames=0x30237a20, accessMethodId=403, tableSpaceId=0, collationIds=0x304d29d8, opclassIds=0x304d29f8,
opclassOptions=0x304d2a18, coloptions=0x304d2a38, reloptions=0, flags=0, constr_flags=0, allow_system_table_mods=false, is_internal=false, constraintId=0x2ff211b4) at index.c:1260
#17 0x1050342c in DefineIndex (tableId=19994, stmt=0x2ff21370, indexRelationId=0, parentIndexId=0, parentConstraintId=0, total_parts=0, is_alter_table=false, check_rights=true, check_not_in_use=true,
skip_build=false, quiet=false) at indexcmds.c:1204
#18 0x104b4474 in ProcessUtilitySlow (pstate=<error reading variable>, pstmt=0x3009a408, queryString=0x30099730 "CREATE INDEX dupindexcols_i ON dupindexcols (f1, id, f1 text_pattern_ops);",

If there are other ways I should poke at it, let me know.

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#16)
Re: Relation bulk write facility

On Sun, Feb 25, 2024 at 6:24 AM Noah Misch <noah@leadboat.com> wrote:

On Fri, Feb 23, 2024 at 04:27:34PM +0200, Heikki Linnakangas wrote:

Committed this. Thanks everyone!

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2024-02-24%2015%3A13%3A14 got:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 43188608

with this stack trace:
#5 0x10005cf0 in ExceptionalCondition (conditionName=0x1015d790 <XLogBeginInsert+80> "`", fileName=0x0, lineNumber=16780064) at assert.c:66
#6 0x102daba8 in mdextend (reln=0x1042628c <PageSetChecksumInplace+44>, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at md.c:472
#7 0x102d6760 in smgrextend (reln=0x306e6670, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at smgr.c:541
#8 0x104c8dac in smgr_bulk_flush (bulkstate=0x306e6000) at bulk_write.c:245

So that's:

static const PGIOAlignedBlock zero_buffer = {{0}}; /* worth BLCKSZ */

...
smgrextend(bulkstate->smgr, bulkstate->forknum,
bulkstate->pages_written++,
&zero_buffer,
true);

... where PGIOAlignedBlock is:

typedef union PGIOAlignedBlock
{
#ifdef pg_attribute_aligned
pg_attribute_aligned(PG_IO_ALIGN_SIZE)
#endif
char data[BLCKSZ];
...

We see this happen with both xlc and gcc (new enough to know how to do
this). One idea would be that the AIX *linker* is unable to align it,
as that is the common tool-chain component here (and unlike stack and
heap objects, this scope is the linker's job). There is a
pre-existing example of a zero-buffer that is at file scope like that:
pg_prewarm.c. Perhaps it doesn't get tested?

Hmm.

#18Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#17)
Re: Relation bulk write facility

On Sun, Feb 25, 2024 at 07:52:16AM +1300, Thomas Munro wrote:

On Sun, Feb 25, 2024 at 6:24 AM Noah Misch <noah@leadboat.com> wrote:

On Fri, Feb 23, 2024 at 04:27:34PM +0200, Heikki Linnakangas wrote:

Committed this. Thanks everyone!

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2024-02-24%2015%3A13%3A14 got:
TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 43188608

with this stack trace:
#5 0x10005cf0 in ExceptionalCondition (conditionName=0x1015d790 <XLogBeginInsert+80> "`", fileName=0x0, lineNumber=16780064) at assert.c:66
#6 0x102daba8 in mdextend (reln=0x1042628c <PageSetChecksumInplace+44>, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at md.c:472
#7 0x102d6760 in smgrextend (reln=0x306e6670, forknum=812540744, blocknum=33, buffer=0x306e6000, skipFsync=812539904) at smgr.c:541
#8 0x104c8dac in smgr_bulk_flush (bulkstate=0x306e6000) at bulk_write.c:245

So that's:

static const PGIOAlignedBlock zero_buffer = {{0}}; /* worth BLCKSZ */

...
smgrextend(bulkstate->smgr, bulkstate->forknum,
bulkstate->pages_written++,
&zero_buffer,
true);

... where PGIOAlignedBlock is:

typedef union PGIOAlignedBlock
{
#ifdef pg_attribute_aligned
pg_attribute_aligned(PG_IO_ALIGN_SIZE)
#endif
char data[BLCKSZ];
...

We see this happen with both xlc and gcc (new enough to know how to do
this). One idea would be that the AIX *linker* is unable to align it,
as that is the common tool-chain component here (and unlike stack and
heap objects, this scope is the linker's job). There is a
pre-existing example of a zero-buffer that is at file scope like that:
pg_prewarm.c. Perhaps it doesn't get tested?

Hmm.

GCC docs do say "For some linkers, the maximum supported alignment may be very
very small.", but AIX "man LD" says "data sections are aligned on a boundary
so as to satisfy the alignment of all CSECTs in the sections". It also has -H
and -K flags to force some particular higher alignment.

On GNU/Linux x64, gcc correctly records alignment=2**12 for the associated
section (.rodata for bulk_write.o zero_buffer, .bss for pg_prewarm.o
blockbuffer). If I'm reading this right, neither AIX gcc nor xlc is marking
the section with sufficient alignment, in bulk_write.o or pg_prewarm.o:

$ /opt/cfarm/binutils-latest/bin/objdump --section-headers ~/farm/*/HEAD/pgsqlkeep.*/src/backend/storage/smgr/bulk_write.o

/home/nm/farm/gcc64/HEAD/pgsqlkeep.2024-02-24_00-03-22/src/backend/storage/smgr/bulk_write.o: file format aix5coff64-rs6000

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000277c 0000000000000000 0000000000000000 000000f0 2**2
CONTENTS, ALLOC, LOAD, RELOC, CODE
1 .data 000000e4 000000000000277c 000000000000277c 0000286c 2**3
CONTENTS, ALLOC, LOAD, RELOC, DATA
2 .debug 0001f7ea 0000000000000000 0000000000000000 00002950 2**3
CONTENTS

/home/nm/farm/xlc32/HEAD/pgsqlkeep.2024-02-24_15-12-23/src/backend/storage/smgr/bulk_write.o: file format aixcoff-rs6000

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000880 00000000 00000000 00000180 2**2
CONTENTS, ALLOC, LOAD, RELOC, CODE
1 .data 0000410c 00000880 00000880 00000a00 2**3
CONTENTS, ALLOC, LOAD, RELOC, DATA
2 .bss 00000000 0000498c 0000498c 00000000 2**3
ALLOC
3 .debug 00008448 00000000 00000000 00004b24 2**3
CONTENTS
4 .except 00000018 00000000 00000000 00004b0c 2**3
CONTENTS, LOAD

$ /opt/cfarm/binutils-latest/bin/objdump --section-headers ~/farm/*/HEAD/pgsqlkeep.*/contrib/pg_prewarm/pg_prewarm.o

/home/nm/farm/gcc32/HEAD/pgsqlkeep.2024-01-21_03-16-12/contrib/pg_prewarm/pg_prewarm.o: file format aixcoff-rs6000

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000a6c 00000000 00000000 000000b4 2**2
CONTENTS, ALLOC, LOAD, RELOC, CODE
1 .data 00000044 00000a6c 00000a6c 00000b20 2**3
CONTENTS, ALLOC, LOAD, RELOC, DATA
2 .bss 00002550 00000ab0 00000ab0 00000000 2**3
ALLOC
3 .debug 0001c50e 00000000 00000000 00000b64 2**3
CONTENTS

/home/nm/farm/gcc64/HEAD/pgsqlkeep.2024-02-15_17-13-04/contrib/pg_prewarm/pg_prewarm.o: file format aix5coff64-rs6000

Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000948 0000000000000000 0000000000000000 00000138 2**2
CONTENTS, ALLOC, LOAD, RELOC, CODE
1 .data 00000078 0000000000000948 0000000000000948 00000a80 2**3
CONTENTS, ALLOC, LOAD, RELOC, DATA
2 .bss 00002640 00000000000009c0 00000000000009c0 00000000 2**3
ALLOC
3 .debug 0001d887 0000000000000000 0000000000000000 00000af8 2**3
CONTENTS

#19Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#18)
Re: Relation bulk write facility

On Sun, Feb 25, 2024 at 8:50 AM Noah Misch <noah@leadboat.com> wrote:

On GNU/Linux x64, gcc correctly records alignment=2**12 for the associated
section (.rodata for bulk_write.o zero_buffer, .bss for pg_prewarm.o
blockbuffer). If I'm reading this right, neither AIX gcc nor xlc is marking
the section with sufficient alignment, in bulk_write.o or pg_prewarm.o:

Ah, that is a bit of a hazard that we should probably document.

I guess the ideas to fix this would be: use smgrzeroextend() instead
of this coding, and/or perhaps look at the coding of pg_pwrite_zeros()
(function-local static) for any other place that needs such a thing,
if it would be satisfied by function-local scope?

#20Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#19)
Re: Relation bulk write facility

On Sun, Feb 25, 2024 at 9:12 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Sun, Feb 25, 2024 at 8:50 AM Noah Misch <noah@leadboat.com> wrote:

On GNU/Linux x64, gcc correctly records alignment=2**12 for the associated
section (.rodata for bulk_write.o zero_buffer, .bss for pg_prewarm.o
blockbuffer). If I'm reading this right, neither AIX gcc nor xlc is marking
the section with sufficient alignment, in bulk_write.o or pg_prewarm.o:

Ah, that is a bit of a hazard that we should probably document.

I guess the ideas to fix this would be: use smgrzeroextend() instead
of this coding, and/or perhaps look at the coding of pg_pwrite_zeros()
(function-local static) for any other place that needs such a thing,
if it would be satisfied by function-local scope?

Erm, wait, how does that function-local static object work differently?

#21Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Noah Misch (#18)
#23Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andres Freund (#22)
#24Thomas Munro
thomas.munro@gmail.com
In reply to: Heikki Linnakangas (#23)
#25Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#24)
#26Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Thomas Munro (#25)
#27Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#27)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#28)
#30Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#29)
#31Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Michael Paquier (#30)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#31)
#33Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#32)
#34Thomas Munro
thomas.munro@gmail.com
In reply to: Heikki Linnakangas (#31)
#35Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#31)
#36Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#31)
#37Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#33)
#38Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Andres Freund (#36)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#38)
#40Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#40)
#42Michael Banck
michael.banck@credativ.de
In reply to: Andres Freund (#22)
#43Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Banck (#42)
#44Andres Freund
andres@anarazel.de
In reply to: Michael Banck (#42)
#45Michael Banck
michael.banck@credativ.de
In reply to: Andres Freund (#44)
#46Andres Freund
andres@anarazel.de
In reply to: Michael Banck (#45)
#47Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Banck (#45)
#48Phil Florent
philflorent@hotmail.com
In reply to: Andres Freund (#46)
#49Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#11)
#50Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Noah Misch (#49)
#51Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#50)
#52Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Noah Misch (#51)
#53Noah Misch
noah@leadboat.com
In reply to: Heikki Linnakangas (#52)
#54Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Noah Misch (#53)