New Table Access Methods for Multi and Single Inserts

Started by Bharath Rupireddy over 5 years ago · 89 messages · hackers
#1 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com

Hi,

Currently, for any component (such as COPY, CTAS [1], CREATE/REFRESH
Mat View [1], INSERT INTO SELECT [2]), multi insert logic such as
buffer slot allocation, maintenance, the decision to flush, and clean
up needs to be implemented outside the table_multi_insert() API. The
main problem is that this fails to take the underlying storage
engine's capabilities into consideration; for more details on this
point, refer to the discussion in the multi inserts in CTAS thread
[1]. This also creates a lot of duplicate code, which is error prone
and hard to maintain.

More importantly, in another thread [3] Andres Freund suggested
shaping the table insert APIs so that they look more like the 'scan'
APIs, i.e. insert_begin, insert, insert_end. The main advantage of
doing this (quoting from his statement in [3]) - "more importantly
it'd allow an AM to optimize operations across multiple inserts, which
is important for column stores."

I propose to introduce new table access methods for both multi and
single inserts based on the prototype suggested by Andres in [3]. The
main design goal of these new APIs is to give tableam developers the
flexibility to implement multi insert logic suited to the underlying
storage engine.

Below are the APIs. I suggest having a look at
v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch for details
of the new data structure and the API functionality. Note that the
XX_v2 names are temporary; we can change them later.

TableInsertState* table_insert_begin(initial_args);
void table_insert_v2(TableInsertState *state, TupleTableSlot *slot);
void table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot);
void table_multi_insert_flush(TableInsertState *state);
void table_insert_end(TableInsertState *state);
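To make the intended call pattern concrete, below is a toy,
self-contained model of the proposed lifecycle. Everything here
(ToySlot, ToyInsertState, the buffer size of 4) is made up for
illustration; the real TableInsertState/TupleTableSlot types and
thresholds come from the patch. The point of the shape is that the
caller only ever does begin/insert/end, while the "AM" owns the
buffering policy and decides when to flush:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the patch's TableInsertState and TupleTableSlot. */
typedef struct ToySlot { int datum; } ToySlot;

typedef struct ToyInsertState {
    ToySlot buffer[4];      /* analogue of mi_max_slots: flush threshold */
    int     nused;          /* buffered but not yet flushed */
    int     flushed_total;  /* tuples the "AM" has written out */
} ToyInsertState;

static ToyInsertState *toy_insert_begin(void)
{
    return calloc(1, sizeof(ToyInsertState));
}

static void toy_multi_insert_flush(ToyInsertState *state)
{
    /* A real AM would write the buffered tuples to storage here. */
    state->flushed_total += state->nused;
    state->nused = 0;
}

static void toy_multi_insert(ToyInsertState *state, const ToySlot *slot)
{
    state->buffer[state->nused++] = *slot;
    /* The AM, not the caller, decides when the buffer is full. */
    if (state->nused == 4)
        toy_multi_insert_flush(state);
}

static int toy_insert_end(ToyInsertState *state)
{
    toy_multi_insert_flush(state);      /* flush any remainder */
    int total = state->flushed_total;
    free(state);
    return total;
}
```

With this shape, COPY, CTAS, REFRESH MATERIALIZED VIEW, etc. would all
reduce to the same begin/insert/end loop instead of each reimplementing
slot buffering and flush decisions.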

I'm attaching a few patches (just to show that these APIs work, avoid
a lot of duplicate code, and make life easier). Better commenting can
be added later. If these APIs and patches look okay, we can even
consider using them in other places such as nodeModifyTable.c and
so on.

v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch --->
introduces new table access methods for multi and single inserts. It
also implements these APIs for the heap AM by rearranging the code
that currently lives outside it.
v1-0002-CTAS-and-REFRESH-Mat-View-With-New-Multi-Insert-Table-AM.patch
---> adds new multi insert table access methods to CREATE TABLE AS,
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW.
v1-0003-ATRewriteTable-With-New-Single-Insert-Table-AM.patch ---> adds
new single insert table access method to ALTER TABLE rewrite table
code.
v1-0004-COPY-With-New-Multi-and-Single-Insert-Table-AM.patch ---> adds
new single and multi insert table access method to COPY code.

Thoughts?

Many thanks to Robert, Vignesh and Dilip for offlist discussion.

[1]: /messages/by-id/4eee0730-f6ec-e72d-3477-561643f4b327@swarm64.com
[2]: /messages/by-id/20201124020020.GK24052@telsasoft.com
[3]: /messages/by-id/20200924024128.kyk3r5g7dnu3fxxx@alap3.anarazel.de

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch (+471 -2)
v1-0003-ATRewriteTable-With-New-Single-Insert-Table-AM.patch (+12 -17)
v1-0004-COPY-With-New-Multi-and-Single-Insert-Table-AM.patch (+163 -321)
v1-0002-CTAS-and-REFRESH-Mat-View-With-New-Multi-Insert-Table-AM.patch (+59 -42)
#2 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#1)
Re: New Table Access Methods for Multi and Single Inserts

On Tue, Dec 8, 2020 at 6:27 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:

Hi,

Currently, for any component (such as COPY, CTAS[1], CREATE/REFRESH
Mat View[1], INSERT INTO SELECTs[2]) multi insert logic such as buffer
slots allocation, maintenance, decision to flush and clean up, need to
be implemented outside the table_multi_insert() API. The main problem
is that it fails to take into consideration the underlying storage
engine capabilities, for more details of this point refer to a
discussion in multi inserts in CTAS thread[1]. This also creates a lot
of duplicate code which is more error prone and not maintainable.

More importantly, in another thread [3] @Andres Freund suggested to
have table insert APIs in such a way that they look more like 'scan'
APIs i.e. insert_begin, insert, insert_end. The main advantages doing
this are(quoting from his statement in [3]) - "more importantly it'd
allow an AM to optimize operations across multiple inserts, which is
important for column stores."

I propose to introduce new table access methods for both multi and
single inserts based on the prototype suggested by Andres in [3]. Main
design goal of these new APIs is to give flexibility to tableam
developers in implementing multi insert logic dependent on the
underlying storage engine.

Below are the APIs. I suggest to have a look at
v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch for details
of the new data structure and the API functionality. Note that
temporarily I used XX_v2, we can change it later.

TableInsertState* table_insert_begin(initial_args);
void table_insert_v2(TableInsertState *state, TupleTableSlot *slot);
void table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot);
void table_multi_insert_flush(TableInsertState *state);
void table_insert_end(TableInsertState *state);

I'm attaching a few patches(just to show that these APIs work, avoids
a lot of duplicate code and makes life easier). Better commenting can
be added later. If these APIs and patches look okay, we can even
consider replacing them in other places such as nodeModifyTable.c and
so on.

v1-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch --->
introduces new table access methods for multi and single inserts. Also
implements/rearranges the outside code for heap am into these new
APIs.
v1-0002-CTAS-and-REFRESH-Mat-View-With-New-Multi-Insert-Table-AM.patch
---> adds new multi insert table access methods to CREATE TABLE AS,
CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW.
v1-0003-ATRewriteTable-With-New-Single-Insert-Table-AM.patch ---> adds
new single insert table access method to ALTER TABLE rewrite table
code.
v1-0004-COPY-With-New-Multi-and-Single-Insert-Table-AM.patch ---> adds
new single and multi insert table access method to COPY code.

Thoughts?

[1]: /messages/by-id/4eee0730-f6ec-e72d-3477-561643f4b327@swarm64.com
[2]: /messages/by-id/20201124020020.GK24052@telsasoft.com
[3]: /messages/by-id/20200924024128.kyk3r5g7dnu3fxxx@alap3.anarazel.de

Added this to commitfest to get it reviewed further.

https://commitfest.postgresql.org/31/2871/

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#3 Justin Pryzby
pryzby@telsasoft.com
In reply to: Bharath Rupireddy (#1)
Re: New Table Access Methods for Multi and Single Inserts

Typos:

+ *  1) Specify is_multi as true, then multi insert state is allcoated.
=> allocated
+ * dropped, short-lived memory context is delted and mistate is freed up.
=> deleted
+ * 2) Currently, GetTupleSize() handles the existing heap, buffer, minmal and
=> minimal
+       /* Mulit insert state if requested, otherwise NULL. */
=> multi
+ * Buffer the input slots and insert the tuples from the buffered slots at a
=> *one* at a time ?
+ * Compute the size of the tuple only if mi_max_size i.e. the total tuple size
=> I guess you mean max_size

This variable could use a better name:
+CopyMulitInsertFlushBuffers(List **mirri, ..
mirri is fine for a local variable, like an element of a structure/array or a
loop variable, but not for a function parameter which is a "List" of arbitrary
pointers.

I think this comment needs to be updated (again) for the removal of the Info
structure.
- * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
+ * multi insert buffer items stored in CopyMultiInsertInfo's

I think the COPY patch should be 0002 (or maybe merged into 0001).
There's some superfluous whitespace (and other) changes there which make the
patch unnecessarily long.

You made the v2 insert interface a requirement for all table AMs.
Should it be optional, and fall back to simple inserts if not implemented ?

For CTAS, I think we need to consider Paul's idea here.
/messages/by-id/26C14A63-CCE5-4B46-975A-57C1784B3690@vmware.com
Conceivably, tableam should support something like that for arbitrary AMs
("insert into a new table for which we have exclusive lock"). I think that AM
method should also be optional. It should be possible to implement a minimal
AM without implementing every available optimization, which may not apply to
all AMs, anyway.

--
Justin

#4 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#3)
Re: New Table Access Methods for Multi and Single Inserts

Thanks a lot for taking a look at the patches.

On Thu, Dec 17, 2020 at 10:35 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Typos:

+ *  1) Specify is_multi as true, then multi insert state is allcoated.
=> allocated
+ * dropped, short-lived memory context is delted and mistate is freed up.
=> deleted
+ * 2) Currently, GetTupleSize() handles the existing heap, buffer, minmal and
=> minimal
+       /* Mulit insert state if requested, otherwise NULL. */
=> multi
+ * Buffer the input slots and insert the tuples from the buffered slots at a
=> *one* at a time ?
+ * Compute the size of the tuple only if mi_max_size i.e. the total tuple size
=> I guess you mean max_size

This variable could use a better name:
+CopyMulitInsertFlushBuffers(List **mirri, ..
mirri is fine for a local variable, like an element of a structure/array or a
loop variable, but not for a function parameter which is a "List" of arbitrary
pointers.

I think this comment needs to be updated (again) for the removal of the Info
structure.
- * CopyMultiInsertBuffer items stored in CopyMultiInsertInfo's
+ * multi insert buffer items stored in CopyMultiInsertInfo's

There's some superfluous whitespace (and other) changes there which make the
patch unnecessarily long.

I will correct them and post the next version of the patch set. Before
that, I would like to have the discussion and thoughts on the APIs and
their usefulness.

I think the COPY patch should be 0002 (or maybe merged into 0001).

I can make it as a 0002 patch.

You made the v2 insert interface a requirement for all table AMs.
Should it be optional, and fall back to simple inserts if not implemented ?

I tried to implement the APIs mentioned by Andres here in [1]. I just
used v2 table am APIs in existing table_insert places to show that it
works. Having said that, if you notice, I moved the bulk insert
allocation and deallocation to the new APIs table_insert_begin() and
table_insert_end() respectively, which make them tableam specific.
Currently, the bulk insert state is outside and independent of
tableam. I think we should not make bulk insert state allocation and
deallocation tableam specific. Thoughts?

[1]: /messages/by-id/CALj2ACX0u=QvB7GHLEqeVYwvs2eQS7=-cEuem7ZaF=p+qZ0ikA@mail.gmail.com

For CTAS, I think we need to consider Paul's idea here.
/messages/by-id/26C14A63-CCE5-4B46-975A-57C1784B3690@vmware.com

IMO, if we were to allow those raw insert APIs to perform parallel
inserts, then we would be reimplementing the existing table_insert or
table_multi_insert API, with some sort of shared memory for
coordinating among workers and so on, maybe in some other way. Yes,
we could avoid all the existing locking and shared buffers with those
raw insert APIs, but I feel we can already get that with the existing
insert APIs for unlogged tables and bulk insert state. To me, the raw
insert APIs, once implemented for parallel inserts, would look like
the existing insert APIs for unlogged tables with bulk insert
state. Thoughts?

Please have a look at [1] for the detailed comment.

[1]: /messages/by-id/CALj2ACX0u=QvB7GHLEqeVYwvs2eQS7=-cEuem7ZaF=p+qZ0ikA@mail.gmail.com

Conceivably, tableam should support something like that for arbitrary AMs
("insert into a new table for which we have exclusive lock"). I think that AM
method should also be optional. It should be possible to implement a minimal
AM without implementing every available optimization, which may not apply to
all AMs, anyway.

I could not understand this point well. Maybe more thoughts help me here.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#5 Justin Pryzby
pryzby@telsasoft.com
In reply to: Bharath Rupireddy (#4)
Re: New Table Access Methods for Multi and Single Inserts

On Thu, Dec 17, 2020 at 04:35:33PM +0530, Bharath Rupireddy wrote:

You made the v2 insert interface a requirement for all table AMs.
Should it be optional, and fall back to simple inserts if not implemented ?

I tried to implement the APIs mentioned by Andreas here in [1]. I just
used v2 table am APIs in existing table_insert places to show that it
works. Having said that, if you notice, I moved the bulk insert
allocation and deallocation to the new APIs table_insert_begin() and
table_insert_end() respectively, which make them tableam specific.

I mean I think it should be optional for a tableam to support the optimized
insert routines. Here, you've made it a requirement.

+       Assert(routine->tuple_insert_begin != NULL);
+       Assert(routine->tuple_insert_v2 != NULL);
+       Assert(routine->multi_insert_v2 != NULL);
+       Assert(routine->multi_insert_flush != NULL);
+       Assert(routine->tuple_insert_end != NULL);
+static inline void
+table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
+{
+       state->rel->rd_tableam->multi_insert_v2(state, slot);
+}

If multi_insert_v2 == NULL, I think table_multi_insert_v2() would just call
table_insert_v2(), and begin/flush/end would do nothing. If
table_multi_insert_v2!=NULL, then you should assert that the other routines are
provided.
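That fallback can be sketched with a toy, self-contained callback
table. The names below mirror the discussion (insert_v2,
multi_insert_v2) but are not the patch's actual definitions; ToyState
and the counters exist only to make the dispatch observable:

```c
#include <assert.h>
#include <stddef.h>

/* Toy analogue of the tableam callback table. */
typedef struct ToyState { int singles; int multis; } ToyState;
typedef void (*toy_insert_fn)(ToyState *state, int datum);

typedef struct ToyAmRoutine {
    toy_insert_fn insert_v2;        /* required */
    toy_insert_fn multi_insert_v2;  /* optional: NULL means unsupported */
} ToyAmRoutine;

static void toy_simple_insert(ToyState *s, int datum)
{
    (void) datum;
    s->singles++;               /* stands in for a plain single insert */
}

static void toy_buffered_insert(ToyState *s, int datum)
{
    (void) datum;
    s->multis++;                /* stands in for an optimized multi insert */
}

/* Fallback wrapper per the suggestion above: use the optimized path when
 * the AM provides one, otherwise degrade gracefully to single inserts. */
static void toy_table_multi_insert(const ToyAmRoutine *am,
                                   ToyState *state, int datum)
{
    if (am->multi_insert_v2 != NULL)
        am->multi_insert_v2(state, datum);
    else
        am->insert_v2(state, datum);
}
```

With this arrangement a minimal AM only has to supply insert_v2, and
callers never need to know whether the multi-insert optimization
exists.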

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accommodate extensions? I'm thinking of
reloptions as a loose analogy.

--
Justin

#6 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#5)
Re: New Table Access Methods for Multi and Single Inserts

On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Dec 17, 2020 at 04:35:33PM +0530, Bharath Rupireddy wrote:

You made the v2 insert interface a requirement for all table AMs.
Should it be optional, and fall back to simple inserts if not implemented ?

I tried to implement the APIs mentioned by Andreas here in [1]. I just
used v2 table am APIs in existing table_insert places to show that it
works. Having said that, if you notice, I moved the bulk insert
allocation and deallocation to the new APIs table_insert_begin() and
table_insert_end() respectively, which make them tableam specific.

I mean I think it should be optional for a tableam to support the optimized
insert routines. Here, you've made it a requirement.

+       Assert(routine->tuple_insert_begin != NULL);
+       Assert(routine->tuple_insert_v2 != NULL);
+       Assert(routine->multi_insert_v2 != NULL);
+       Assert(routine->multi_insert_flush != NULL);
+       Assert(routine->tuple_insert_end != NULL);

+1 to make them optional. I will change.

+static inline void
+table_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
+{
+       state->rel->rd_tableam->multi_insert_v2(state, slot);
+}

If multi_insert_v2 == NULL, I think table_multi_insert_v2() would just call
table_insert_v2(), and begin/flush/end would do nothing. If
table_multi_insert_v2!=NULL, then you should assert that the other routines are
provided.

What should happen if both multi_insert_v2 and insert_v2 are NULL?
Should we error out from table_insert_v2()?

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accomodate extensions? I'm thinking of
reloptions as a loose analogy.

I could not think of other tableam attributes now. But +1 to have that
kind of flexible structure for TableInsertState. So, it can have
tableam type and attributes within the union for each type.

I moved the bulk insert allocation and deallocation to the new APIs table_insert_begin()
and table_insert_end() respectively, which make them tableam specific.
Currently, the bulk insert state is outside and independent of
tableam. I think we should not make bulk insert state allocation and
deallocation tableam specific.

Any thoughts on the above point?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#7 Justin Pryzby
pryzby@telsasoft.com
In reply to: Bharath Rupireddy (#6)
Re: New Table Access Methods for Multi and Single Inserts

On Fri, Dec 18, 2020 at 07:39:14AM +0530, Bharath Rupireddy wrote:

On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accomodate extensions? I'm thinking of
reloptions as a loose analogy.

I could not think of other tableam attributes now. But +1 to have that
kind of flexible structure for TableInsertState. So, it can have
tableam type and attributes within the union for each type.

Right now you have heap_insert_begin(), and I asked if it was really
heap-specific. Right now, it populates a struct based on a static list of
arguments, which are what heap uses.

If you were to implement a burp_insert_begin(), how would it differ from
heap's? With the current API, they'd (have to) be the same, which means either
that it should apply to all AMs (or have a "default" implementation that can be
overridden by an AM), or that this API assumes that other AMs will want to do
exactly what heap does, and fails to allow other AMs to implement optimizations
for bulk inserts as claimed.

I don't think using a "union" solves the problem, since it can only accommodate
core AMs, and not extensions, so I suggested something like reloptions, which
have a "namespace" prefix (and core has toast.*, like ALTER TABLE t SET
toast.autovacuum_enabled).
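One loose way to read the reloptions analogy is a flat list of
namespace-prefixed options that any AM, including an extension's, can
populate. The sketch below is illustrative only; the option names
("heap.mi_max_slots", "columnar.column_threshold") are invented
examples, not anything the patch defines:

```c
#include <assert.h>
#include <string.h>

/* A namespace-prefixed AM option, loosely analogous to a reloption. */
typedef struct AmOption {
    const char *name;   /* "namespace.option", e.g. "heap.mi_max_slots" */
    int         value;
} AmOption;

/* Look up an option by its fully qualified name; unknown names fall
 * back to a default, so a minimal AM can simply supply no options. */
static int get_am_option(const AmOption *opts, int nopts,
                         const char *name, int defval)
{
    for (int i = 0; i < nopts; i++)
        if (strcmp(opts[i].name, name) == 0)
            return opts[i].value;
    return defval;
}
```

Because lookup is by string, a hypothetical extension AM can introduce
its own namespace without any change to core, which is the property
the union-based layout lacks.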

--
Justin

#8 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#7)
Re: New Table Access Methods for Multi and Single Inserts

On Fri, Dec 18, 2020 at 11:24 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Fri, Dec 18, 2020 at 07:39:14AM +0530, Bharath Rupireddy wrote:

On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accomodate extensions? I'm thinking of
reloptions as a loose analogy.

I could not think of other tableam attributes now. But +1 to have that
kind of flexible structure for TableInsertState. So, it can have
tableam type and attributes within the union for each type.

Right now you have heap_insert_begin(), and I asked if it was really
heap-specific. Right now, it populates a struct based on a static list of
arguments, which are what heap uses.

If you were to implement a burp_insert_begin(), how would it differ from
heap's? With the current API, they'd (have to) be the same, which means either
that it should apply to all AMs (or have a "default" implementation that can be
overridden by an AM), or that this API assumes that other AMs will want to do
exactly what heap does, and fails to allow other AMs to implement optimizations
for bulk inserts as claimed.

I don't think using a "union" solves the problem, since it can only accommodate
core AMs, and not extensions, so I suggested something like reloptions, which
have a "namespace" prefix (and core has toast.*, like ALTER TABLE t SET
toast.autovacuum_enabled).

IIUC, your suggestion is to make the heap options such as
alloc_bistate (whether bulk insert state is required), mi_max_slots
(the maximum number of buffered slots/tuples) and mi_max_size (the
maximum total tuple size of the buffered slots) reloptions with some
default values in reloptions.c under the RELOPT_KIND_HEAP category, so
that they can be modified by users on a per-table basis. Likewise,
other tableam options can be added by tableam developers. This way,
the APIs will become more generic. Tableam developers need to add
reloptions of their choice and use them in the new API
implementations.

Let me know if I am missing anything from what you have in your mind.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#9 Justin Pryzby
pryzby@telsasoft.com
In reply to: Justin Pryzby (#7)
Re: New Table Access Methods for Multi and Single Inserts

On Fri, Dec 18, 2020 at 11:54:39AM -0600, Justin Pryzby wrote:

On Fri, Dec 18, 2020 at 07:39:14AM +0530, Bharath Rupireddy wrote:

On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accomodate extensions? I'm thinking of
reloptions as a loose analogy.

I could not think of other tableam attributes now. But +1 to have that
kind of flexible structure for TableInsertState. So, it can have
tableam type and attributes within the union for each type.

Right now you have heap_insert_begin(), and I asked if it was really
heap-specific. Right now, it populates a struct based on a static list of
arguments, which are what heap uses.

If you were to implement a burp_insert_begin(), how would it differ from
heap's? With the current API, they'd (have to) be the same, which means either
that it should apply to all AMs (or have a "default" implementation that can be
overridden by an AM), or that this API assumes that other AMs will want to do
exactly what heap does, and fails to allow other AMs to implement optimizations
for bulk inserts as claimed.

I don't think using a "union" solves the problem, since it can only accommodate
core AMs, and not extensions, so I suggested something like reloptions, which
have a "namespace" prefix (and core has toast.*, like ALTER TABLE t SET
toast.autovacuum_enabled).

I think you'd want to handle things like:

- a compressed AM wants to specify a threshold for a tuple's *compressed* size
(maybe in addition to the uncompressed size);
- a "columnar" AM wants to specify a threshold size for a column, rather
than for each tuple;

I'm not proposing to handle those specific parameters, but rather pointing out
that your implementation doesn't allow handling AM-specific considerations,
which I think was the goal.

The TableInsertState structure would need to store those, and then the AM's
multi_insert_v2 routine would need to make use of them.

It feels a bit like we'd introduce the idea of an "AM option", except that it
wouldn't be user-facing (or maybe some of them would be?). Maybe I've
misunderstood though, so other opinions are welcome.

--
Justin

#10 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#9)
Re: New Table Access Methods for Multi and Single Inserts

On Mon, Dec 21, 2020 at 1:17 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Fri, Dec 18, 2020 at 11:54:39AM -0600, Justin Pryzby wrote:

On Fri, Dec 18, 2020 at 07:39:14AM +0530, Bharath Rupireddy wrote:

On Fri, Dec 18, 2020 at 2:14 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

Are you thinking that TableInsertState would eventually have additional
attributes which would apply to other tableams, but not to heap ? Is
heap_insert_begin() really specific to heap ? It's allocating and populating a
structure based on its arguments, but those same arguments would be passed to
every other AM's insert_begin routine, too. Do you need a more flexible data
structure, something that would also accomodate extensions? I'm thinking of
reloptions as a loose analogy.

I could not think of other tableam attributes now. But +1 to have that
kind of flexible structure for TableInsertState. So, it can have
tableam type and attributes within the union for each type.

Right now you have heap_insert_begin(), and I asked if it was really
heap-specific. Right now, it populates a struct based on a static list of
arguments, which are what heap uses.

If you were to implement a burp_insert_begin(), how would it differ from
heap's? With the current API, they'd (have to) be the same, which means either
that it should apply to all AMs (or have a "default" implementation that can be
overridden by an AM), or that this API assumes that other AMs will want to do
exactly what heap does, and fails to allow other AMs to implement optimizations
for bulk inserts as claimed.

I don't think using a "union" solves the problem, since it can only accommodate
core AMs, and not extensions, so I suggested something like reloptions, which
have a "namespace" prefix (and core has toast.*, like ALTER TABLE t SET
toast.autovacuum_enabled).

I think you'd want to handle things like:

- a compressed AM wants to specify a threshold for a tuple's *compressed* size
(maybe in addition to the uncompressed size);
- a "columnar" AM wants to specify a threshold size for a column, rather
than for each tuple;

I'm not proposing to handle those specific parameters, but rather pointing out
that your implementation doesn't allow handling AM-specific considerations,
which I think was the goal.

The TableInsertState structure would need to store those, and then the AM's
multi_insert_v2 routine would need to make use of them.

It feels a bit like we'd introduce the idea of an "AM option", except that it
wouldn't be user-facing (or maybe some of them would be?). Maybe I've
misunderstood though, so other opinions are welcome.

Attaching a v2 patch for the new table AMs.

This patch has following changes:

1) Made the TableInsertState structure generic by having a void
pointer for the multi insert state, and defined the heap-specific
multi insert state information in heapam.h. This way each AM can have
its own multi insert state structure and dereference the void pointer
using that structure inside the respective AM implementation.
2) Earlier, in the v1 patch, the bulk insert state
allocation/deallocation was moved to the AM level, but I see nothing
AM-specific in doing so and think it should be independent of the AM.
So I'm now doing that in table_insert_begin() and table_insert_end().
Because of this, I had to move the BulkInsert function declarations
from heapam.h to tableam.h.
3) Corrected the typos and tried to adjust indentation of the code.
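The void-pointer layout in change 1) can be sketched as follows. This
is a toy model: ToyInsertState, ToyHeapMultiInsertState, and the 1000
default are invented for illustration, standing in for the patch's
TableInsertState and the heap-specific state in heapam.h:

```c
#include <assert.h>
#include <stdlib.h>

/* Generic state: the mistate field is opaque; only the owning AM
 * knows its real type and dereferences it. */
typedef struct ToyInsertState {
    int   options;
    void *mistate;   /* AM-specific multi insert state, or NULL */
} ToyInsertState;

/* What a heap-like AM might keep behind the void pointer. */
typedef struct ToyHeapMultiInsertState {
    int mi_max_slots;
    int nslots;
} ToyHeapMultiInsertState;

static ToyInsertState *toy_heap_insert_begin(int multi)
{
    ToyInsertState *state = calloc(1, sizeof(ToyInsertState));
    if (multi) {
        ToyHeapMultiInsertState *mi =
            calloc(1, sizeof(ToyHeapMultiInsertState));
        mi->mi_max_slots = 1000;
        /* Another AM would hang its own struct here instead. */
        state->mistate = mi;
    }
    return state;
}

static int toy_heap_max_slots(ToyInsertState *state)
{
    /* Only heap code casts mistate back to the heap-specific type. */
    ToyHeapMultiInsertState *mi = state->mistate;
    return mi ? mi->mi_max_slots : 1;
}
```

Compared with a union of per-AM members, this keeps the generic struct
closed while still letting extension AMs attach arbitrary state.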

Note that I have not yet made the multi_insert_v2 API optional as
suggested earlier. I will think more on this and update.

I'm not posting the updated 0002 to 0004 patches, I plan to do so
after a couple of reviews happen on the design of the APIs in 0001.

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

v2-0001-New-Table-AMs-for-Multi-and-Single-Inserts.patch (+442 -7)
#11 Justin Pryzby
pryzby@telsasoft.com
In reply to: Bharath Rupireddy (#10)
Re: New Table Access Methods for Multi and Single Inserts

On Thu, Dec 24, 2020 at 05:48:42AM +0530, Bharath Rupireddy wrote:

I'm not posting the updated 0002 to 0004 patches, I plan to do so
after a couple of reviews happen on the design of the APIs in 0001.

Thoughts?

Are you familiar with this work ?

https://commitfest.postgresql.org/31/2717/
Reloptions for table access methods

It seems like that can be relevant for your patch, and I think some of what
your patch needs might be provided by AM opts.

It's difficult to generalize AMs when we have only one, but your use-case might
be a concrete example which would help to answer some questions on the other
thread.

@Jeff: https://commitfest.postgresql.org/31/2871/

--
Justin

#12 Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Justin Pryzby (#11)
Re: New Table Access Methods for Multi and Single Inserts

On Fri, Dec 25, 2020 at 8:10 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Dec 24, 2020 at 05:48:42AM +0530, Bharath Rupireddy wrote:

I'm not posting the updated 0002 to 0004 patches, I plan to do so
after a couple of reviews happen on the design of the APIs in 0001.

Thoughts?

Are you familiar with this work ?

https://commitfest.postgresql.org/31/2717/
Reloptions for table access methods

It seems like that can be relevant for your patch, and I think some of what
your patch needs might be provided by AM opts.

It's difficult to generalize AMs when we have only one, but your use-case might
be a concrete example which would help to answer some questions on the other
thread.

@Jeff: https://commitfest.postgresql.org/31/2871/

Note that I have not gone through the entire thread at [1]. From some
initial study, that patch is proposing to allow different table AMs to
have custom rel options.

The v2 patch that I sent upthread [2] for the new table AMs has the
heap AM multi insert code moved inside the new heap AM implementation,
and I don't see any need for rel options there. In case any other AM
wants control over its multi insert API implementation via rel
options, I think the proposal at [1] can be useful.

IIUC, there's no dependency as such between the new table AM patch and
the rel options thread [1]. If I'm right, can the new table AM patch
[2] be reviewed further?

Thoughts?

[1]: https://commitfest.postgresql.org/31/2717/
[2]: /messages/by-id/CALj2ACWMnZZCu=G0PJkEeYYicKeuJ-X=SU19i6vQ1+=uXz8u0Q@mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#13Luc Vlaming
luc@swarm64.com
In reply to: Bharath Rupireddy (#12)
Re: New Table Access Methods for Multi and Single Inserts

On 28-12-2020 13:48, Bharath Rupireddy wrote:

On Fri, Dec 25, 2020 at 8:10 AM Justin Pryzby <pryzby@telsasoft.com> wrote:

On Thu, Dec 24, 2020 at 05:48:42AM +0530, Bharath Rupireddy wrote:

I'm not posting the updated 0002 to 0004 patches, I plan to do so
after a couple of reviews happen on the design of the APIs in 0001.

Thoughts?

Are you familiar with this work ?

https://commitfest.postgresql.org/31/2717/
Reloptions for table access methods

It seems like that can be relevant for your patch, and I think some of what
your patch needs might be provided by AM opts.

It's difficult to generalize AMs when we have only one, but your use-case might
be a concrete example which would help to answer some questions on the other
thread.

@Jeff: https://commitfest.postgresql.org/31/2871/

Note that I have not gone through the entire thread at [1]. On some
initial study, that patch is proposing to allow different table AMs to
have custom rel options.

The v2 patch that I sent upthread [2] for the new table AMs has the heap
AM multi insert code moved inside the new heap AM implementation, and I
don't see any need for rel options. In case any other AMs want
control over their multi insert API implementation via rel
options, I think the proposal at [1] can be useful.

Thoughts?

[1] - https://commitfest.postgresql.org/31/2717/
[2] - /messages/by-id/CALj2ACWMnZZCu=G0PJkEeYYicKeuJ-X=SU19i6vQ1+=uXz8u0Q@mail.gmail.com

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Hi,

IIUC, there's no dependency or anything as such for the new table AM
patch with the rel options thread [1]. If I'm right, can this new
table AM patch [2] be reviewed further?

To me this seems good enough. Reason is that I anticipate that there
would not necessarily be per-table options for now but rather global
options, if any. Moreover, if we want to make these kinds of tradeoffs
user-controllable I would argue this should be done in a different
patch-set either way. Reason is that there are parameters in heap
already that are computed / hardcoded as well (see e.g.
RelationAddExtraBlocks).

===

As to the patches themselves:

I think the API is a huge step forward! I assume that we want to have a
single-insert API like heap_insert_v2 so that we can encode the
knowledge that there will just be a single insert coming and likely a
commit afterwards?

Reason I'm asking is that I quite liked the heap_insert_begin parameter
is_multi, which could even be turned into an "expected_rowcount" of the
number of rows expected to be committed in the transaction (e.g. single,
several, thousands/stream).
If we were to make the API based on expected rowcounts, the whole
heap_insert_v2, heap_insert and heap_multi_insert could be turned into a
single function heap_insert, as the knowledge about buffering of the
slots is then already stored in the TableInsertState, creating an API like:

// expectedRows: -1 = streaming, otherwise expected rowcount.
TableInsertState* heap_insert_begin(Relation rel, CommandId cid, int
options, int expectedRows);
heap_insert(TableInsertState *state, TupleTableSlot *slot);

Do you think that's a good idea?
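To make the shape of that calling pattern concrete, here is a compilable toy sketch. Everything in it is invented for illustration (the stand-in Relation/TupleTableSlot structs, the toy_* names, the threshold values); it only models how an expectedRows-driven begin/insert/end sequence could let the AM, not the caller, pick single vs. buffered inserts:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins for PostgreSQL's Relation and TupleTableSlot. */
typedef struct Relation { const char *name; } Relation;
typedef struct TupleTableSlot { int value; } TupleTableSlot;

enum { MULTI_THRESHOLD = 2, MAX_BUFFERED = 4 };    /* illustrative only */

typedef struct TableInsertState
{
    Relation       *rel;
    int             expectedRows;   /* -1 = streaming/unknown */
    bool            use_multi;      /* decided by the AM, not the caller */
    TupleTableSlot  buffered[MAX_BUFFERED];
    int             cur_slots;      /* tuples currently buffered */
    int             rows_written;   /* rows handed to storage so far */
} TableInsertState;

static void
toy_multi_insert_flush(TableInsertState *state)
{
    /* a real AM would form and write a batch of tuples here */
    state->rows_written += state->cur_slots;
    state->cur_slots = 0;
}

static TableInsertState *
toy_insert_begin(Relation *rel, int expectedRows)
{
    TableInsertState *state = calloc(1, sizeof(TableInsertState));

    state->rel = rel;
    state->expectedRows = expectedRows;
    /* the AM picks the strategy from the caller's estimate */
    state->use_multi = (expectedRows < 0 || expectedRows >= MULTI_THRESHOLD);
    return state;
}

static void
toy_insert(TableInsertState *state, TupleTableSlot *slot)
{
    if (!state->use_multi)
    {
        state->rows_written++;                      /* plain single insert */
        return;
    }
    state->buffered[state->cur_slots++] = *slot;    /* copy into the buffer */
    if (state->cur_slots >= MAX_BUFFERED)
        toy_multi_insert_flush(state);
}

static void
toy_insert_end(TableInsertState *state)
{
    toy_multi_insert_flush(state);      /* end flushes any leftovers itself */
    free(state);
}
```

With this shape, COPY/CTAS/etc. would always call the same three functions and never have to decide single vs. multi themselves.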

Two smaller things I'm wondering:
- the clear_mi_slots; why is this not in the HeapMultiInsertState? the
slots themselves are declared there? Also, the boolean itself is
somewhat problematic I think because it would only work if you specified
is_multi=true which would depend on the actual tableam to implement this
then in a way that copy/ctas/etc can also use the slot properly, which I
think would severely limit their freedom to store the slots more
efficiently? Also, why do we want to do ExecClearTuple() anyway? Isn't
it good enough that the next call to ExecCopySlot will effectively clear
it out?
- flushed -> why is this a stored boolean? isn't this indirectly encoded
by cur_slots/cur_size == 0?

For patches 02-04 I quickly skimmed through them as I assume we first
want the API agreed upon. Generally they look nice and like a big step
forward. What I'm just wondering about is the usage of the
implementation details like mistate->slots[X]. It makes a lot of sense
to do so but also makes for a difficult compromise, because now the
tableam has to guarantee a copy of the slot, and hopefully even one in a
somewhat efficient form.

Kind regards,
Luc

#14Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Luc Vlaming (#13)
Re: New Table Access Methods for Multi and Single Inserts

On Mon, Jan 4, 2021 at 1:29 PM Luc Vlaming <luc@swarm64.com> wrote:

table AM patch [2] be reviewed further?

As to the patches themselves:

I think the API is a huge step forward! I assume that we want to have a
single-insert API like heap_insert_v2 so that we can encode the
knowledge that there will just be a single insert coming and likely a
commit afterwards?

Reason I'm asking is that I quite liked the heap_insert_begin parameter
is_multi, which could even be turned into an "expected_rowcount" of the
number of rows expected to be committed in the transaction (e.g. single,
several, thousands/stream).
If we were to make the API based on expected rowcounts, the whole
heap_insert_v2, heap_insert and heap_multi_insert could be turned into a
single function heap_insert, as the knowledge about buffering of the
slots is then already stored in the TableInsertState, creating an API

like:

// expectedRows: -1 = streaming, otherwise expected rowcount.
TableInsertState* heap_insert_begin(Relation rel, CommandId cid, int
options, int expectedRows);
heap_insert(TableInsertState *state, TupleTableSlot *slot);

Do you think that's a good idea?

IIUC, your suggestion is to use expectedRows and move the multi insert
implementation heap_multi_insert_v2 to heap_insert_v2. If that's correct,
heap_insert_v2 will look something like this:

heap_insert_v2()
{
    if (single_insert)
        // do single insertion work; the code in the existing heap_insert_v2 comes here
    else
        // do multi insertion work; the code in the existing heap_multi_insert_v2 comes here
}

I don't see any problem in combining single and multi insert APIs into one.
Having said that, will the APIs be cleaner then? Isn't it going to be
confusing if a single heap_insert_v2 API does both jobs? With the
existing separate APIs, for single insertion, the sequence of the API can
be like begin, insert_v2, end and for multi inserts it's like begin,
multi_insert_v2, flush, end. I prefer to have a separate multi insert API
so that it will make the code look readable.

Thoughts?

Two smaller things I'm wondering:
- the clear_mi_slots; why is this not in the HeapMultiInsertState? the
slots themselves are declared there?

Firstly, we need to have the buffered slots sometimes (please have a look at
the comments in TableInsertState structure) outside the multi_insert API.
And we need to have cleared the previously flushed slots before we start
buffering in heap_multi_insert_v2(). I can remove the clear_mi_slots flag
altogether and do as follows: I will not set mistate->cur_slots to 0 in
heap_multi_insert_flush after the flush, I will only set state->flushed to
true. In heap_multi_insert_v2,

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
TupleTableSlot *batchslot;
HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
Size sz;

Assert(mistate && mistate->slots);

/*
 * If the slots were flushed previously then clear them off before using
 * them again.
 */
if (state->flushed)
{
    int i;

    for (i = 0; i < mistate->cur_slots; i++)
        ExecClearTuple(mistate->slots[i]);

    mistate->cur_slots = 0;
    state->flushed = false;
}

if (mistate->slots[mistate->cur_slots] == NULL)
mistate->slots[mistate->cur_slots] =
table_slot_create(state->rel, NULL);

batchslot = mistate->slots[mistate->cur_slots];

ExecCopySlot(batchslot, slot);

Thoughts?

Also, why do we want to do ExecClearTuple() anyway? Isn't
it good enough that the next call to ExecCopySlot will effectively clear
it out?

For virtual, heap, and minimal tuple slots, yes, ExecCopySlot clears the
slot before copying. But, for buffer heap slots, the
tts_buffer_heap_copyslot does not always clear the destination slot, see
below. If we fall into else condition, we might get some issues. And also
note that, once the slot is cleared in ExecClearTuple, it will not be
cleared again in ExecCopySlot because TTS_SHOULDFREE(slot) will be false.
That is why, let's have ExecClearTuple as is.

/*
* If the source slot is of a different kind, or is a buffer slot that
has
* been materialized / is virtual, make a new copy of the tuple.
Otherwise
* make a new reference to the in-buffer tuple.
*/
if (dstslot->tts_ops != srcslot->tts_ops ||
TTS_SHOULDFREE(srcslot) ||
!bsrcslot->base.tuple)
{
MemoryContext oldContext;

ExecClearTuple(dstslot);
}
else
{
Assert(BufferIsValid(bsrcslot->buffer));

tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
bsrcslot->buffer, false);

- flushed -> why is this a stored boolean? isn't this indirectly encoded
by cur_slots/cur_size == 0?

Note that cur_slots is in HeapMultiInsertState and outside of the new APIs
i.e. in TableInsertState, mistate is a void pointer, and we can't really
access the cur_slots. I mean, we can access but we need to be dereferencing
using the tableam kind. Instead of doing all of that, to keep the API
cleaner, I chose to have a boolean in the TableInsertState which we can see
and use outside of the new APIs. Hope that's fine.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#15Jeff Davis
pgsql@j-davis.com
In reply to: Luc Vlaming (#13)
Re: New Table Access Methods for Multi and Single Inserts

On Mon, 2021-01-04 at 08:59 +0100, Luc Vlaming wrote:

Reason I'm asking is that I quite liked the heap_insert_begin
parameter
is_multi, which could even be turned into an "expected_rowcount" of
the
number of rows expected to be committed in the transaction (e.g.
single,
several, thousands/stream).

Do you mean "written by the statement" instead of "committed in the
transaction"? It doesn't look like the TableInsertState state will
survive across statement boundaries.

Though that is an important question to consider. If the premise is
that a given custom AM may be much more efficient at bulk inserts than
retail inserts (which is reasonable), then it makes sense to handle the
case of a transaction with many single-tuple inserts. But keeping
insert state across statement boundaries also raises a few potential
problems.

Regards,
Jeff Davis

#16Luc Vlaming
luc@swarm64.com
In reply to: Jeff Davis (#15)
Re: New Table Access Methods for Multi and Single Inserts

On 05-01-2021 22:28, Jeff Davis wrote:

On Mon, 2021-01-04 at 08:59 +0100, Luc Vlaming wrote:

Reason I'm asking is that I quite liked the heap_insert_begin
parameter
is_multi, which could even be turned into an "expected_rowcount" of
the
number of rows expected to be committed in the transaction (e.g.
single,
several, thousands/stream).

Do you mean "written by the statement" instead of "committed in the
transaction"? It doesn't look like the TableInsertState state will
survive across statement boundaries.

Though that is an important question to consider. If the premise is
that a given custom AM may be much more efficient at bulk inserts than
retail inserts (which is reasonable), then it makes sense to handle the
case of a transaction with many single-tuple inserts. But keeping
insert state across statement boundaries also raises a few potential
problems.

Regards,
Jeff Davis

I did actually mean until the end of the transaction. I know this is
currently not possible with the current design but I think it would be
cool to start going that way (even if slightly). Creating some more
freedom on how a tableam optimizes inserts, when one syncs to disk, etc
would be good imo. It would allow one to create e.g. a tableam that
would not have as a high overhead when doing single statement inserts.

Kind regards,
Luc

#17Luc Vlaming
luc@swarm64.com
In reply to: Bharath Rupireddy (#14)
Re: New Table Access Methods for Multi and Single Inserts

On 05-01-2021 11:06, Bharath Rupireddy wrote:

On Mon, Jan 4, 2021 at 1:29 PM Luc Vlaming <luc@swarm64.com
<mailto:luc@swarm64.com>> wrote:

 > table AM patch [2] be reviewed further?
As to the patches themselves:

I think the API is a huge step forward! I assume that we want to have a
single-insert API like heap_insert_v2 so that we can encode the
knowledge that there will just be a single insert coming and likely a
commit afterwards?

Reason I'm asking is that I quite liked the heap_insert_begin parameter
is_multi, which could even be turned into an "expected_rowcount" of the
number of rows expected to be committed in the transaction (e.g. single,
several, thousands/stream).
If we were to make the API based on expected rowcounts, the whole
heap_insert_v2, heap_insert and heap_multi_insert could be turned into a
single function heap_insert, as the knowledge about buffering of the
slots is then already stored in the TableInsertState, creating an API

like:

// expectedRows: -1 = streaming, otherwise expected rowcount.
TableInsertState* heap_insert_begin(Relation rel, CommandId cid, int
options, int expectedRows);
heap_insert(TableInsertState *state, TupleTableSlot *slot);

Do you think that's a good idea?

IIUC, your suggestion is to use expectedRows and move the multi insert
implementation heap_multi_insert_v2 to heap_insert_v2. If that's
correct, heap_insert_v2 will look something like this:

heap_insert_v2()
{
    if (single_insert)
        // do single insertion work; the code in the existing heap_insert_v2 comes here
    else
        // do multi insertion work; the code in the existing heap_multi_insert_v2 comes here
}

I don't see any problem in combining single and multi insert APIs into
one. Having said that, will the APIs be cleaner then? Isn't it going to
be confusing if a single heap_insert_v2 API does both jobs? With
the existing separate APIs, for single insertion, the sequence of the
API can be like begin, insert_v2, end and for multi inserts it's like
begin, multi_insert_v2, flush, end. I prefer to have a separate multi
insert API so that it will make the code look readable.

Thoughts?

The main reason for me for wanting a single API is that I would like the
decision of using single or multi inserts to move to inside the tableam.
For e.g. a heap insert we might want to put the threshold at e.g. 100
rows so that the overhead of buffering the tuples is actually
compensated. For other tableam this logic might also be quite different,
and I think therefore that it shouldn't be e.g. COPY or CTAS deciding
whether or not multi inserts should be used. Because otherwise the thing
we'll get is that there will be tableams that will ignore this flag and
do their own thing anyway. I'd rather have an API that gives all
necessary information to the tableam and then make the tableam do "the
right thing".

Another reason I'm suggesting this API is that I would expect that the
begin is called in a different place in the code for the (multiple)
inserts than the actual insert statement.
To me conceptually the begin and end are like e.g. the executor begin
and end: you prepare the inserts with the knowledge you have at that
point. I assumed (wrongly?) that during the start of the statement one
knows best how many rows are coming; and then the actual insertion of
the row doesn't have to deal anymore with multi/single inserts, choosing
when to buffer or not, because that information has already been given
during the initial phase. One of the reasons this is appealing to me is
that e.g. in [1] there was discussion on when to switch to a multi
insert state, and imo this should be up to the tableam.

Two smaller things I'm wondering:
- the clear_mi_slots; why is this not in the HeapMultiInsertState? the
slots themselves are declared there?

Firstly, we need to have the buffered slots sometimes (please have a look
at the comments in TableInsertState structure) outside the multi_insert
API. And we need to have cleared the previously flushed slots before we
start buffering in heap_multi_insert_v2(). I can remove the
clear_mi_slots flag altogether and do as follows: I will not set
mistate->cur_slots to 0 in heap_multi_insert_flush after the flush, I
will only set state->flushed to true. In heap_multi_insert_v2,

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
    TupleTableSlot  *batchslot;
    HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
    Size sz;

    Assert(mistate && mistate->slots);

    /* if the slots are flushed previously then clear them off before
       using them again. */
    if (state->flushed)
    {
        int i;

        for (i = 0; i < mistate->cur_slots; i++)
            ExecClearTuple(mistate->slots[i]);

        mistate->cur_slots = 0;
        state->flushed = false;
    }

    if (mistate->slots[mistate->cur_slots] == NULL)
        mistate->slots[mistate->cur_slots] =
                                    table_slot_create(state->rel, NULL);

    batchslot = mistate->slots[mistate->cur_slots];

    ExecCopySlot(batchslot, slot);

Thoughts?

From what I can see you can just keep the v2-0001 patch and:
- remove the flushed variable altogether. mistate->cur_slots == 0
encodes this already and the variable is never actually checked on.
- call ExecClearTuple just before ExecCopySlot()

Which would make the code something like:

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
TupleTableSlot *batchslot;
HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
Size sz;

Assert(mistate && mistate->slots);

if (mistate->slots[mistate->cur_slots] == NULL)
mistate->slots[mistate->cur_slots] =
table_slot_create(state->rel, NULL);

batchslot = mistate->slots[mistate->cur_slots];

ExecClearTuple(batchslot);
ExecCopySlot(batchslot, slot);

/*
* Calculate the tuple size after the original slot is copied, because the
* copied slot type and the tuple size may change.
*/
sz = GetTupleSize(batchslot, mistate->max_size);

Assert(sz > 0);

mistate->cur_slots++;
mistate->cur_size += sz;

if (mistate->cur_slots >= mistate->max_slots ||
mistate->cur_size >= mistate->max_size)
heap_multi_insert_flush(state);
}

void
heap_multi_insert_flush(TableInsertState *state)
{
HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
MemoryContext oldcontext;

Assert(mistate && mistate->slots && mistate->cur_slots >= 0 &&
mistate->context);

if (mistate->cur_slots == 0)
return;

oldcontext = MemoryContextSwitchTo(mistate->context);

heap_multi_insert(state->rel, mistate->slots, mistate->cur_slots,
state->cid, state->options, state->bistate);

MemoryContextReset(mistate->context);
MemoryContextSwitchTo(oldcontext);

/*
* Do not clear the slots always. Sometimes callers may want the slots for
* index insertions or after row trigger executions in which case they have
* to clear the tuples before using for the next insert batch.
*/
if (state->clear_mi_slots)
{
int i;

for (i = 0; i < mistate->cur_slots; i++)
ExecClearTuple(mistate->slots[i]);
}

mistate->cur_slots = 0;
mistate->cur_size = 0;
}

Also, why do we want to do ExecClearTuple() anyway? Isn't
it good enough that the next call to ExecCopySlot will effectively clear
it out?

For virtual, heap, and minimal tuple slots, yes, ExecCopySlot clears the
slot before copying. But, for buffer heap slots, the
tts_buffer_heap_copyslot does not always clear the destination slot, see
below. If we fall into else condition, we might get some issues. And
also note that, once the slot is cleared in ExecClearTuple, it will not
be cleared again in ExecCopySlot because TTS_SHOULDFREE(slot) will be
false. That is why, let's have ExecClearTuple as is.

I had no idea the buffer heap slot doesn't unconditionally clear out the
slot :( So yes, let's call it unconditionally ourselves. See also
suggestion above.

    /*
     * If the source slot is of a different kind, or is a buffer slot
that has
     * been materialized / is virtual, make a new copy of the tuple.
Otherwise
     * make a new reference to the in-buffer tuple.
     */
    if (dstslot->tts_ops != srcslot->tts_ops ||
        TTS_SHOULDFREE(srcslot) ||
        !bsrcslot->base.tuple)
    {
        MemoryContext oldContext;

        ExecClearTuple(dstslot);
    }
    else
    {
        Assert(BufferIsValid(bsrcslot->buffer));

        tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
                                    bsrcslot->buffer, false);

- flushed -> why is this a stored boolean? isn't this indirectly encoded
by cur_slots/cur_size == 0?

Note that cur_slots is in HeapMultiInsertState and outside of the new
APIs i.e. in TableInsertState, mistate is a void pointer, and we can't
really access the cur_slots. I mean, we can access but we need to be
dereferencing using the tableam kind. Instead of doing all of that, to
keep the API cleaner, I chose to have a boolean in the TableInsertState
which we can see and use outside of the new APIs. Hope that's fine.

So you mean the flushed variable is actually there to tell the user of
the API that they are supposed to call flush before end? Why can't the
end call flush itself then? I guess I completely misunderstood the
purpose of table_multi_insert_flush being public. I had assumed it is
there so that the usage site can indicate that now would be a good time to
flush, e.g. because of a statement ending or something. I had not
understood this is a requirement that its always required to do
table_multi_insert_flush + table_insert_end.
IMHO I would hide this from the caller, given that you would only really
call flush yourself immediately before calling end, or are
there other cases where one would be required to explicitly call flush?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Kind regards,
Luc

#18Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Luc Vlaming (#17)
Re: New Table Access Methods for Multi and Single Inserts

On Wed, Jan 6, 2021 at 12:56 PM Luc Vlaming <luc@swarm64.com> wrote:

The main reason for me for wanting a single API is that I would like the
decision of using single or multi inserts to move to inside the tableam.
For e.g. a heap insert we might want to put the threshold at e.g. 100
rows so that the overhead of buffering the tuples is actually
compensated. For other tableam this logic might also be quite different,
and I think therefore that it shouldn't be e.g. COPY or CTAS deciding
whether or not multi inserts should be used. Because otherwise the thing
we'll get is that there will be tableams that will ignore this flag and
do their own thing anyway. I'd rather have an API that gives all
necessary information to the tableam and then make the tableam do "the
right thing".

Another reason I'm suggesting this API is that I would expect that the
begin is called in a different place in the code for the (multiple)
inserts than the actual insert statement.
To me conceptually the begin and end are like e.g. the executor begin
and end: you prepare the inserts with the knowledge you have at that
point. I assumed (wrongly?) that during the start of the statement one
knows best how many rows are coming; and then the actual insertion of
the row doesn't have to deal anymore with multi/single inserts, choosing
when to buffer or not, because that information has already been given
during the initial phase. One of the reasons this is appealing to me is
that e.g. in [1] there was discussion on when to switch to a multi
insert state, and imo this should be up to the tableam.

Agree that whether to go with the multi or single inserts should be
completely left to tableam implementation, we, as callers of those API
just need to inform whether we expect single or multiple rows, and it
should be left to tableam implementation whether to actually go with
buffering or single inserts. ISTM that it's an elegant way of making
the API generic and abstracting everything from the callers. What I
wonder is how can we know in advance the expected row count that we
need to pass in to heap_insert_begin()? IIUC, we cannot estimate the
upcoming rows in COPY, Insert Into Select, or Refresh Mat View or some
other insert queries? Of course, we can look at the planner's
estimated row count for the selects in COPY, Insert Into Select or
Refresh Mat View after the planning, but to me that's not something we
can depend on and pass in the row count to the insert APIs.

When we don't know the expected row count, why can't we (as callers of
the APIs) tell the APIs something like, "I'm intending to perform
multi inserts, so if possible and if you have a mechanism to buffer
the slots, do it, otherwise insert the tuples one by one, or else do
whatever you want to do with the tuples I give you". So, in case of
COPY we can ask the API for multi inserts and call heap_insert_begin()
and heap_insert_v2().

Given the above explanation, I still feel bool is_multi would suffice.

Thoughts?

On dynamically switching from single to multi inserts, this can be
done by heap_insert_v2 itself. The way I think it's possible is that,
say we have some threshold row count 1000(can be a macro) after
inserting those many tuples, heap_insert_v2 can switch to buffering
mode.

Thoughts?
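That threshold idea can be modeled in a few lines of self-contained C. The SWITCH_THRESHOLD and BUF_CAP constants and all names here are invented for illustration; in the patch this logic would presumably live inside heap_insert_v2 itself:

```c
#include <assert.h>
#include <stdbool.h>

enum { SWITCH_THRESHOLD = 3, BUF_CAP = 4 };   /* illustrative values only */

typedef struct InsertState
{
    int  rows_seen;      /* rows passed to insert so far */
    bool buffering;      /* switched to multi-insert mode? */
    int  buffered;       /* tuples currently held in the buffer */
    int  single_writes;  /* rows written one by one */
    int  batch_writes;   /* rows written via flushed batches */
} InsertState;

static void
toy_flush(InsertState *s)
{
    s->batch_writes += s->buffered;
    s->buffered = 0;
}

/* One insert entry point; the switch to buffering happens internally. */
static void
toy_insert_row(InsertState *s)
{
    s->rows_seen++;
    if (!s->buffering && s->rows_seen > SWITCH_THRESHOLD)
        s->buffering = true;        /* seen enough rows: start buffering */

    if (s->buffering)
    {
        s->buffered++;
        if (s->buffered >= BUF_CAP)
            toy_flush(s);
    }
    else
        s->single_writes++;
}
```

The point of the sketch is that the caller never changes: the first few rows go through the single-insert path, and once the threshold is crossed the same entry point starts buffering and flushing batches.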

Which would make the code something like:

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
TupleTableSlot *batchslot;
HeapMultiInsertState *mistate = (HeapMultiInsertState *)state->mistate;
Size sz;

Assert(mistate && mistate->slots);

if (mistate->slots[mistate->cur_slots] == NULL)
mistate->slots[mistate->cur_slots] =
table_slot_create(state->rel, NULL);

batchslot = mistate->slots[mistate->cur_slots];

ExecClearTuple(batchslot);
ExecCopySlot(batchslot, slot);

/*
* Calculate the tuple size after the original slot is copied, because the
* copied slot type and the tuple size may change.
*/
sz = GetTupleSize(batchslot, mistate->max_size);

Assert(sz > 0);

mistate->cur_slots++;
mistate->cur_size += sz;

if (mistate->cur_slots >= mistate->max_slots ||
mistate->cur_size >= mistate->max_size)
heap_multi_insert_flush(state);
}

I think clearing tuples before copying the slot as you suggested may
work without the need of the clear_mi_slots flag.

Also, why do we want to do ExecClearTuple() anyway? Isn't
it good enough that the next call to ExecCopySlot will effectively clear
it out?

For virtual, heap, and minimal tuple slots, yes, ExecCopySlot clears the
slot before copying. But, for buffer heap slots, the
tts_buffer_heap_copyslot does not always clear the destination slot, see
below. If we fall into else condition, we might get some issues. And
also note that, once the slot is cleared in ExecClearTuple, it will not
be cleared again in ExecCopySlot because TTS_SHOULDFREE(slot) will be
false. That is why, let's have ExecClearTuple as is.

I had no idea the buffer heap slot doesn't unconditionally clear out the
slot :( So yes, let's call it unconditionally ourselves. See also
suggestion above.

Yeah, we will clear the tuple slot before copy to be on the safer side.

/*
* If the source slot is of a different kind, or is a buffer slot
that has
* been materialized / is virtual, make a new copy of the tuple.
Otherwise
* make a new reference to the in-buffer tuple.
*/
if (dstslot->tts_ops != srcslot->tts_ops ||
TTS_SHOULDFREE(srcslot) ||
!bsrcslot->base.tuple)
{
MemoryContext oldContext;

ExecClearTuple(dstslot);
}
else
{
Assert(BufferIsValid(bsrcslot->buffer));

tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
bsrcslot->buffer, false);

- flushed -> why is this a stored boolean? isn't this indirectly encoded
by cur_slots/cur_size == 0?

Note that cur_slots is in HeapMultiInsertState and outside of the new
APIs i.e. in TableInsertState, mistate is a void pointer, and we can't
really access the cur_slots. I mean, we can access but we need to be
dereferencing using the tableam kind. Instead of doing all of that, to
keep the API cleaner, I chose to have a boolean in the TableInsertState
which we can see and use outside of the new APIs. Hope that's fine.

So you mean the flushed variable is actually there to tell the user of
the API that they are supposed to call flush before end? Why can't the
end call flush itself then? I guess I completely misunderstood the
purpose of table_multi_insert_flush being public. I had assumed it is
there to from the usage site indicate that now would be a good time to
flush, e.g. because of a statement ending or something. I had not
understood this is a requirement that its always required to do
table_multi_insert_flush + table_insert_end.
IMHO I would hide this from the callee, given that you would only really
call flush yourself when you immediately after would call end, or are
there other cases where one would be required to explicitly call flush?

We need to know outside the multi_insert API whether the buffered
slots in case of multi inserts are flushed. Reason is that if we have
indexes or after row triggers, currently we call ExecInsertIndexTuples
or ExecARInsertTriggers on the buffered slots outside the API in a
loop after the flush.

If we agree on removing the heap_multi_insert_v2 API and embedding that
logic inside heap_insert_v2, then we can do this - pass the required
information and the functions ExecInsertIndexTuples and
ExecARInsertTriggers as callbacks so that, whether or not
heap_insert_v2 chooses single or multi inserts, it can call back these
functions with the required information after the flush. We can
add the callbacks and the required information into TableInsertState. But
I'm not quite sure we would want to make ExecInsertIndexTuples and
ExecARInsertTriggers callbacks that way.

If we don't want to go with the callback way, then at least we need to
know whether or not heap_insert_v2 has chosen multi inserts, and if so,
the buffered slots array, the number of currently buffered slots, and
whether they are flushed or not in the TableInsertState. Then,
eventually, we might need all the HeapMultiInsertState info in the
TableInsertState.

Thoughts?
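One way to sketch the callback variant in self-contained form (all names here are hypothetical; in the real patch the callbacks would presumably wrap ExecInsertIndexTuples / ExecARInsertTriggers and receive real slots rather than plain ints):

```c
#include <assert.h>

#define TOY_BUF_CAP 4    /* illustrative batch size */

/* Hypothetical post-flush hook, e.g. index insertion or AR triggers. */
typedef void (*FlushCallback)(int slot_values[], int nslots, void *arg);

typedef struct MultiInsertState
{
    int           buffered[TOY_BUF_CAP];
    int           cur_slots;
    FlushCallback after_flush;   /* run on each flushed batch */
    void         *callback_arg;
} MultiInsertState;

static void
toy_multi_insert_flush(MultiInsertState *s)
{
    if (s->cur_slots == 0)
        return;
    /* ... write the batch to storage here ... */
    if (s->after_flush)
        s->after_flush(s->buffered, s->cur_slots, s->callback_arg);
    s->cur_slots = 0;
}

static void
toy_multi_insert(MultiInsertState *s, int value)
{
    s->buffered[s->cur_slots++] = value;
    if (s->cur_slots >= TOY_BUF_CAP)
        toy_multi_insert_flush(s);
}

/* Sample callback: count the tuples that would get index entries. */
static void
count_indexed(int slot_values[], int nslots, void *arg)
{
    (void) slot_values;
    *(int *) arg += nslots;
}
```

With the hook inside the flush, the caller never needs to know whether or when a flush happened, which would remove the need to expose the flushed boolean at all.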

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

#19Luc Vlaming
luc@swarm64.com
In reply to: Bharath Rupireddy (#18)
Re: New Table Access Methods for Multi and Single Inserts

On 06-01-2021 14:06, Bharath Rupireddy wrote:

On Wed, Jan 6, 2021 at 12:56 PM Luc Vlaming <luc@swarm64.com> wrote:

The main reason for me for wanting a single API is that I would like the
decision of using single or multi inserts to move to inside the tableam.
For e.g. a heap insert we might want to put the threshold at e.g. 100
rows so that the overhead of buffering the tuples is actually
compensated. For other tableam this logic might also be quite different,
and I think therefore that it shouldn't be e.g. COPY or CTAS deciding
whether or not multi inserts should be used. Because otherwise the thing
we'll get is that there will be tableams that will ignore this flag and
do their own thing anyway. I'd rather have an API that gives all
necessary information to the tableam and then make the tableam do "the
right thing".

Another reason I'm suggesting this API is that I would expect that the
begin is called in a different place in the code for the (multiple)
inserts than the actual insert statement.
To me conceptually the begin and end are like e.g. the executor begin
and end: you prepare the inserts with the knowledge you have at that
point. I assumed (wrongly?) that during the start of the statement one
knows best how many rows are coming; and then the actual insertion of
the row doesn't have to deal anymore with multi/single inserts, choosing
when to buffer or not, because that information has already been given
during the initial phase. One of the reasons this is appealing to me is
that e.g. in [1] there was discussion on when to switch to a multi
insert state, and imo this should be up to the tableam.

Agree that whether to go with multi or single inserts should be
completely left to the tableam implementation; we, as callers of those
APIs, just need to inform it whether we expect single or multiple
rows, and it should be left to the tableam implementation whether to
actually go with buffering or single inserts. ISTM that it's an
elegant way of making the API generic and abstracting everything from
the callers. What I wonder is how we can know in advance the expected
row count that we need to pass in to heap_insert_begin(). IIUC, we
cannot estimate the upcoming rows in COPY, Insert Into Select, Refresh
Mat View, or some other insert queries. Of course, we can look at the
planner's estimated row count for the selects in COPY, Insert Into
Select, or Refresh Mat View after planning, but to me that's not
something we can depend on to pass in the row count to the insert APIs.

When we don't know the expected row count, why can't we (as callers of
the APIs) tell the APIs something like: "I'm intending to perform
multi inserts, so if possible and if you have a mechanism to buffer
the slots, do it; otherwise insert the tuples one by one, or else do
whatever you want to do with the tuples I give you". So, in the case
of COPY we can ask the API for multi inserts and call
heap_insert_begin() and heap_insert_v2().

I thought that when it is available (because of planning) it would be
nice to pass it in. If you don't know, you could pass in a 1 for doing
single inserts, and e.g. -1 or max-int for streaming. The reason I
proposed it is so that tableams have as much knowledge as possible to
do the right thing. is_multi does also work, of course, but is just
somewhat less informative.

What to me seemed somewhat counterintuitive is that with the proposed
API it is possible to say is_multi=true and then still call
heap_insert_v2 to do a single insert.

Given the above explanation, I still feel bool is_multi would suffice.

Thoughts?

On dynamically switching from single to multi inserts: this can be
done by heap_insert_v2 itself. The way I think it's possible is that,
say, we have some threshold row count of 1000 (can be a macro); after
inserting that many tuples, heap_insert_v2 can switch to buffering
mode.

For that I thought it'd be good to use the expected row count, but yeah
dynamically switching also works and might work better if the expected
row counts are usually off.

Thoughts?

Which would make the code something like:

void
heap_multi_insert_v2(TableInsertState *state, TupleTableSlot *slot)
{
    TupleTableSlot *batchslot;
    HeapMultiInsertState *mistate = (HeapMultiInsertState *) state->mistate;
    Size sz;

    Assert(mistate && mistate->slots);

    if (mistate->slots[mistate->cur_slots] == NULL)
        mistate->slots[mistate->cur_slots] =
            table_slot_create(state->rel, NULL);

    batchslot = mistate->slots[mistate->cur_slots];

    ExecClearTuple(batchslot);
    ExecCopySlot(batchslot, slot);

    /*
     * Calculate the tuple size after the original slot is copied, because
     * the copied slot type and the tuple size may change.
     */
    sz = GetTupleSize(batchslot, mistate->max_size);

    Assert(sz > 0);

    mistate->cur_slots++;
    mistate->cur_size += sz;

    if (mistate->cur_slots >= mistate->max_slots ||
        mistate->cur_size >= mistate->max_size)
        heap_multi_insert_flush(state);
}

I think clearing tuples before copying the slot, as you suggested, may
work without the need for a clear_slots flag.

ok, cool :)

Also, why do we want to do ExecClearTuple() anyway? Isn't
it good enough that the next call to ExecCopySlot will effectively clear
it out?

For virtual, heap, and minimal tuple slots, yes, ExecCopySlot clears
the slot before copying. But for buffer heap slots,
tts_buffer_heap_copyslot does not always clear the destination slot,
see below. If we fall into the else condition, we might get some
issues. And also note that, once the slot is cleared in
ExecClearTuple, it will not be cleared again in ExecCopySlot because
TTS_SHOULDFREE(slot) will be false. That is why, let's keep the
ExecClearTuple as is.

I had no idea the buffer heap slot doesn't unconditionally clear out
the slot :( So yes, let's call it unconditionally ourselves. See also
the suggestion above.

Yeah, we will clear the tuple slot before copy to be on the safer side.

ok

/*
 * If the source slot is of a different kind, or is a buffer slot that has
 * been materialized / is virtual, make a new copy of the tuple. Otherwise
 * make a new reference to the in-buffer tuple.
 */
if (dstslot->tts_ops != srcslot->tts_ops ||
    TTS_SHOULDFREE(srcslot) ||
    !bsrcslot->base.tuple)
{
    MemoryContext oldContext;

    ExecClearTuple(dstslot);
    ...
}
else
{
    Assert(BufferIsValid(bsrcslot->buffer));

    tts_buffer_heap_store_tuple(dstslot, bsrcslot->base.tuple,
                                bsrcslot->buffer, false);
    ...
}

- flushed -> why is this a stored boolean? isn't this indirectly encoded
by cur_slots/cur_size == 0?

Note that cur_slots is in HeapMultiInsertState, and outside of the new
APIs, i.e. in TableInsertState, mistate is a void pointer, so we can't
really access cur_slots. I mean, we can access it, but we would need to
dereference it based on the tableam kind. Instead of doing all of that,
to keep the API cleaner, I chose to have a boolean in the
TableInsertState which we can see and use outside of the new APIs. Hope
that's fine.

So you mean the flushed variable is actually there to tell the user of
the API that they are supposed to call flush before end? Why can't the
end call flush itself then? I guess I completely misunderstood the
purpose of table_multi_insert_flush being public. I had assumed it is
there so that the call site can indicate that now would be a good time
to flush, e.g. because a statement is ending or something. I had not
understood that it is always required to do
table_multi_insert_flush + table_insert_end.
IMHO I would hide this from the caller, given that you would only
really call flush yourself immediately before calling end, or are
there other cases where one would be required to explicitly call flush?

We need to know outside the multi_insert API whether the buffered
slots (in the multi-insert case) have been flushed. The reason is
that, if we have indexes or after-row triggers, we currently call
ExecInsertIndexTuples or ExecARInsertTriggers on the buffered slots in
a loop outside the API after the flush.

If we agree on removing the heap_multi_insert_v2 API and embedding
that logic inside heap_insert_v2, then we can do this - pass the
required information and the functions ExecInsertIndexTuples and
ExecARInsertTriggers as callbacks so that, whether or not
heap_insert_v2 chooses single or multi inserts, it can call back these
functions with the required information passed after the flush. We can
add the callback and required information into TableInsertState. But
I'm not quite sure how we would generalize ExecInsertIndexTuples and
ExecARInsertTriggers for that.

If we don't want to go with the callback approach, then at least we
need to know, in the TableInsertState, whether or not heap_insert_v2
has chosen multi inserts and, if so, the buffered slots array, the
number of currently buffered slots, and whether they have been
flushed. Then, eventually, we might need all of the
HeapMultiInsertState info in the TableInsertState.

To me the callback API seems cleaner, that on heap_insert_begin we can
pass in a callback that is called on every flushed slot, or only on
multi-insert flushes. Is there a reason it would only be done for
multi-insert flushes or can it be generic?

Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Hi,

Replied inline.

Kind regards,
Luc

#20Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#18)
Re: New Table Access Methods for Multi and Single Inserts

If we agree on removing the heap_multi_insert_v2 API and embedding
that logic inside heap_insert_v2, then we can do this - pass the
required information and the functions ExecInsertIndexTuples and
ExecARInsertTriggers as callbacks so that, whether or not
heap_insert_v2 chooses single or multi inserts, it can call back these
functions with the required information passed after the flush. We can
add the callback and required information into TableInsertState. But
I'm not quite sure how we would generalize ExecInsertIndexTuples and
ExecARInsertTriggers for that.

How should the API interact with INSERT INTO ... SELECT? Right now it
doesn't appear to be integrated at all, but that seems like a fairly
important path for bulk inserts.

Regards,
Jeff Davis

#21Luc Vlaming
luc@swarm64.com
In reply to: Jeff Davis (#20)
#22Jeff Davis
pgsql@j-davis.com
In reply to: Luc Vlaming (#21)
#23Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#18)
#24Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#23)
#25Zhihong Yu
zyu@yugabyte.com
In reply to: Bharath Rupireddy (#24)
#26Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Zhihong Yu (#25)
#27Dilip Kumar
dilipbalaut@gmail.com
In reply to: Bharath Rupireddy (#24)
#28Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Dilip Kumar (#27)
#29Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#28)
#30Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#29)
#31Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#30)
#32Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#31)
#33Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Matthias van de Meent (#32)
#34Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Bharath Rupireddy (#33)
#35Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#34)
#36Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Michael Paquier (#35)
#37Andres Freund
andres@anarazel.de
In reply to: Bharath Rupireddy (#31)
#38Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#37)
#39Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Andres Freund (#37)
#40Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Bharath Rupireddy (#39)
#41Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jacob Champion (#40)
#42Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#41)
#43Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#42)
#44Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#43)
#45Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#44)
#46Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bharath Rupireddy (#45)
#47Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Masahiko Sawada (#46)
#48Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#47)
#49Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#48)
#50Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#49)
#51Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#50)
#52Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#51)
#53Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#52)
#54Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#53)
#55Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#54)
#56Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#55)
#57Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#56)
#58Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#55)
#59Pavel Stehule
pavel.stehule@gmail.com
In reply to: Bharath Rupireddy (#58)
#60Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#58)
#61Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#60)
#62Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#61)
#63Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bharath Rupireddy (#62)
#64Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Alvaro Herrera (#63)
#65Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#63)
#66Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#63)
#67Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#62)
#68Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#67)
#69Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#67)
#70Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bharath Rupireddy (#69)
#71Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#70)
#72Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Jeff Davis (#71)
#73Jeff Davis
pgsql@j-davis.com
In reply to: Matthias van de Meent (#72)
#74Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Jeff Davis (#73)
#75Jeff Davis
pgsql@j-davis.com
In reply to: Matthias van de Meent (#74)
#76Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#70)
#77Jeff Davis
pgsql@j-davis.com
In reply to: Jeff Davis (#71)
#78Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#77)
#79Jeff Davis
pgsql@j-davis.com
In reply to: Bharath Rupireddy (#78)
#80Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jeff Davis (#79)
#81Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Bharath Rupireddy (#80)
#82Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Jingtang Zhang (#81)
#83Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Bharath Rupireddy (#80)
#84Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Jingtang Zhang (#83)
#85Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Bharath Rupireddy (#84)
#86Daniil Davydov
3danissimo@gmail.com
In reply to: Jingtang Zhang (#85)
#87Daniil Davydov
3danissimo@gmail.com
In reply to: Daniil Davydov (#86)
#88Jingtang Zhang
mrdrivingduck@gmail.com
In reply to: Daniil Davydov (#87)
#89Daniil Davydov
3danissimo@gmail.com
In reply to: Jingtang Zhang (#88)