FOR EACH ROW triggers on partitioned tables
This patch enables FOR EACH ROW triggers on partitioned tables.
As presented, this patch is sufficient to discuss the semantics that we
want for triggers on partitioned tables, which is the most pressing
question here ISTM.
However, this is incomplete: it doesn't create triggers when you do
ALTER TABLE ATTACH PARTITION or by CREATE TABLE PARTITION OF. I'm using
this as a basis on which to try foreign keys for partitioned tables.
Getting this to committable status requires adding those features.
This is essentially the same patch I posted as 0003 in
/messages/by-id/20171220194937.pldcecyx7yrwmgkg@alvherre.pgsql
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v1-0001-Allow-FOR-EACH-ROW-triggers-on-partitioned-tables.patchtext/plain; charset=us-asciiDownload+202-27
On 12/29/17 17:53, Alvaro Herrera wrote:
This patch enables FOR EACH ROW triggers on partitioned tables.
As presented, this patch is sufficient to discuss the semantics that we
want for triggers on partitioned tables, which is the most pressing
question here ISTM.
This seems pretty straightforward. What semantics questions do you have?
However, this is incomplete: it doesn't create triggers when you do
ALTER TABLE ATTACH PARTITION or by CREATE TABLE PARTITION OF. I'm using
this as a basis on which to try foreign keys for partitioned tables.
Getting this to committable status requires adding those features.
Yeah that, and also perhaps preventing the removal of triggers from
partitions if they are supposed to be on the whole partition hierarchy.
And then make pg_dump do the right things. That's all mostly legwork, I
think.
Also, does ALTER TABLE ... ENABLE/DISABLE TRIGGER do the right things on
partitioned tables?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3 January 2018 at 03:12, Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
On 12/29/17 17:53, Alvaro Herrera wrote:
This patch enables FOR EACH ROW triggers on partitioned tables.
As presented, this patch is sufficient to discuss the semantics that we
want for triggers on partitioned tables, which is the most pressing
question here ISTM.This seems pretty straightforward. What semantics questions do you have?
I see the patch imposes these restrictions
* AFTER TRIGGERS only
* No transition tables
* No WHEN clause
All of which might be removed/extended at some later date
So that's all good... there's not much here, so easy to commit soon.
However, this is incomplete: it doesn't create triggers when you do
ALTER TABLE ATTACH PARTITION or by CREATE TABLE PARTITION OF. I'm using
this as a basis on which to try foreign keys for partitioned tables.
Getting this to committable status requires adding those features.Yeah that, and also perhaps preventing the removal of triggers from
partitions if they are supposed to be on the whole partition hierarchy.
+1
And then make pg_dump do the right things. That's all mostly legwork, I
think.Also, does ALTER TABLE ... ENABLE/DISABLE TRIGGER do the right things on
partitioned tables?
Not sure I care about that, since it just breaks FKs and other things,
but we can add it later.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut wrote:
On 12/29/17 17:53, Alvaro Herrera wrote:
This patch enables FOR EACH ROW triggers on partitioned tables.
As presented, this patch is sufficient to discuss the semantics that we
want for triggers on partitioned tables, which is the most pressing
question here ISTM.This seems pretty straightforward. What semantics questions do you have?
The main question is this: when running the trigger function, it is
going to look as it is running in the context of the partition, not in
the context of the parent partitioned table (TG_RELNAME etc). That
seems mildly ugly: some users may be expecting that the partitioning
stuff is invisible to the rest of the system, so if you have triggers on
a regular table and later on decide to partition that table, the
behavior of triggers will change, which is maybe unexpected. Maybe this
is not really a problem, but I'm not sure and would like further
opinions.
Anyway, the attached v2 has the following changes
1. ALTER TABLE ATTACH PARTITION and CREATE TABLE PARTITION OF now clone
any triggers from the main table, as if the trigger had been created
with the partitions in place.
2. dependencies work correctly: dropping the trigger on a partition is
disallowed; dropping the table removes the trigger. This is pretty
much the same behavior we have for indexes in partitions; I've reused
the new dependency type.
While existing pg_dump tests pass, I have not verified that it does
anything remotely sensible.
Also, does ALTER TABLE ... ENABLE/DISABLE TRIGGER do the right things on
partitioned tables?
Haven't done this yet, either. I like Simon's suggestion of outright
disallowing this.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v2-0001-Allow-FOR-EACH-ROW-triggers-on-partitioned-tables.patchtext/plain; charset=us-asciiDownload+393-40
On Tue, Jan 23, 2018 at 5:10 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
The main question is this: when running the trigger function, it is
going to look as it is running in the context of the partition, not in
the context of the parent partitioned table (TG_RELNAME etc). That
seems mildly ugly: some users may be expecting that the partitioning
stuff is invisible to the rest of the system, so if you have triggers on
a regular table and later on decide to partition that table, the
behavior of triggers will change, which is maybe unexpected. Maybe this
is not really a problem, but I'm not sure and would like further
opinions.
It doesn't seem either great or horrible.
Also, what about logical replication? Amit just raised this issue for
the UPDATE row movement patch, and it seems like the issues are
similar here. If somebody's counting on the same kinds of per-row
triggers to fire during logical replication as we do during the
original operation, they will be disappointed.
Also, does ALTER TABLE ... ENABLE/DISABLE TRIGGER do the right things on
partitioned tables?Haven't done this yet, either. I like Simon's suggestion of outright
disallowing this.
Why not just make it work?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 1/23/18 17:10, Alvaro Herrera wrote:
The main question is this: when running the trigger function, it is
going to look as it is running in the context of the partition, not in
the context of the parent partitioned table (TG_RELNAME etc). That
seems mildly ugly: some users may be expecting that the partitioning
stuff is invisible to the rest of the system, so if you have triggers on
a regular table and later on decide to partition that table, the
behavior of triggers will change, which is maybe unexpected. Maybe this
is not really a problem, but I'm not sure and would like further
opinions.
One could go either way on this, but I think reporting the actual table
partition is acceptable and preferable. If you are writing a generic
trigger function, maybe to dump out all columns, you want to know the
physical table and its actual columns. It's easy[citation needed] to
get the partition root for a given table, if the trigger code needs
that. The other way around is not possible.
Some other comments are reading the patch:
It seems to generally follow the patterns of the partitioned indexes
patch, which is good.
I think WHEN clauses on partition triggers should be OK. I don't see a
reason to disallow them.
Similarly, transition tables should be OK. You only get the current
partition to look at, of course.
The function name CloneRowTriggersOnPartition() confused me. A more
accurate phrasing might be CloneRowTriggersToPartition(), or maybe
reword altogether.
New CommandCounterIncrement() call in AttachPartitionEnsureIndexes()
should be explained. Or maybe it belongs in ATExecAttachPartition()
between the calls to AttachPartitionEnsureIndexes() and
CloneRowTriggersOnPartition()?
Prohibition against constraint triggers is unclear. The subsequent
foreign-key patches mess with that further. It's not clear to me why
constraint triggers shouldn't be allowed like normal triggers.
Obvious missing things: documentation, pg_dump, psql updates
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/01/30 5:30, Peter Eisentraut wrote:
On 1/23/18 17:10, Alvaro Herrera wrote:
The main question is this: when running the trigger function, it is
going to look as it is running in the context of the partition, not in
the context of the parent partitioned table (TG_RELNAME etc). That
seems mildly ugly: some users may be expecting that the partitioning
stuff is invisible to the rest of the system, so if you have triggers on
a regular table and later on decide to partition that table, the
behavior of triggers will change, which is maybe unexpected. Maybe this
is not really a problem, but I'm not sure and would like further
opinions.One could go either way on this, but I think reporting the actual table
partition is acceptable and preferable.
+1
If you are writing a generic
trigger function, maybe to dump out all columns, you want to know the
physical table and its actual columns. It's easy[citation needed] to
get the partition root for a given table, if the trigger code needs
that. The other way around is not possible.
I guess you mean the root where a trigger originated, that is, ancestor
table on which an inherited trigger was originally defined. It is
possible for a trigger to be defined on an intermediate parent and not the
topmost root in a partition tree.
I see that the only parent-child relationship for triggers created
recursively is recorded in the form of a dependency. I wonder why not a
flag in, say, pg_trigger to indicate that a trigger may have been created
recursively. With the committed for inherited indexes, I can see that
inheritance is explicitly recorded in pg_inherits because indexes are
relations, so it's possible in the indexes' case to get the parent in
which a given inherited index originated.
Similarly, transition tables should be OK. You only get the current
partition to look at, of course.
+1
The function name CloneRowTriggersOnPartition() confused me. A more
accurate phrasing might be CloneRowTriggersToPartition(), or maybe
reword altogether.
CloneRowTriggers*For*Partition()?
Thanks,
Amit
On 1/30/18 04:49, Amit Langote wrote:
If you are writing a generic
trigger function, maybe to dump out all columns, you want to know the
physical table and its actual columns. It's easy[citation needed] to
get the partition root for a given table, if the trigger code needs
that. The other way around is not possible.I guess you mean the root where a trigger originated, that is, ancestor
table on which an inherited trigger was originally defined. It is
possible for a trigger to be defined on an intermediate parent and not the
topmost root in a partition tree.
OK, so maybe not so "easy".
But this muddies the situation even further. You could be updating
table A, which causes an update in intermediate partition B, which
causes an update in leaf partition C, which fires a trigger that was
logically defined on B and has a local child on C. Under this proposal,
the trigger will see TG_RELNAME = C. You could make arguments that the
trigger should also somehow know about B (where the trigger was defined)
and A (what the user actually targeted in their statement). I'm not
sure how useful these would be. But if you want to cover everything,
you'll need three values.
I think the patch can go ahead as proposed, and the other things could
be future separate additions.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/01/31 9:44, Peter Eisentraut wrote:
On 1/30/18 04:49, Amit Langote wrote:
If you are writing a generic
trigger function, maybe to dump out all columns, you want to know the
physical table and its actual columns. It's easy[citation needed] to
get the partition root for a given table, if the trigger code needs
that. The other way around is not possible.I guess you mean the root where a trigger originated, that is, ancestor
table on which an inherited trigger was originally defined. It is
possible for a trigger to be defined on an intermediate parent and not the
topmost root in a partition tree.OK, so maybe not so "easy".
But this muddies the situation even further. You could be updating
table A, which causes an update in intermediate partition B, which
causes an update in leaf partition C, which fires a trigger that was
logically defined on B and has a local child on C. Under this proposal,
the trigger will see TG_RELNAME = C. You could make arguments that the
trigger should also somehow know about B (where the trigger was defined)
and A (what the user actually targeted in their statement). I'm not
sure how useful these would be. But if you want to cover everything,
you'll need three values.I think the patch can go ahead as proposed, and the other things could
be future separate additions.
Yeah, I see no problem with going ahead with the patch as it for now.
Thanks,
Amit
Moved to next commit fest.
There is some work to be done, but there appears to be a straight path
to success.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Robert Haas wrote:
On Tue, Jan 23, 2018 at 5:10 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Also, does ALTER TABLE ... ENABLE/DISABLE TRIGGER do the right things on
partitioned tables?Haven't done this yet, either. I like Simon's suggestion of outright
disallowing this.Why not just make it work?
I haven't had as much time to work on this as I wished, so progress has
been a bit slow. That is to say, this version is almost identical to
the one I last posted. I added a test for enable/disable trigger, which
currently fails because the code to support it is not implemented. I
added a report of the trigger name to the relevant test, for improved
visibility of what is happening. (I intend to push that one, since it's
a trivial improvement.)
Now one way to fix that would be to do as Amit suggests elsewhere, ie.,
to add a link to parent trigger from child trigger, so we can search for
children whenever the parent is disabled. We'd also need a new index on
that column so that the searches are fast, and perhaps a boolean flag
("trghaschildren") to indicate that searches must be done.
(We could add an array of children OID instead, but designwise that
seems much worse.)
Another option is to rethink this feature from the ground up: instead of
cloning catalog rows for each children, maybe we should have the trigger
lookup code, when running DML on the child relation (the partition),
obtain trigger entries not only for the child relation itself but also
for its parents recursively -- so triggers defined in the parent are
fired for the partitions, too. I'm not sure what implications this has
for constraint triggers.
The behavior should be the same, except that you cannot modify the
trigger (firing conditions, etc) on the partition individually -- it
works at the level of the whole partitioned table instead.
For foreign key triggers to work properly, I think I'd propose that this
occurs only for non-internal triggers. For internal triggers,
particularly FK triggers, we continue with the current approach in that
patch which is to create trigger clones.
This seems more promising to me.
Thoughts?
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/02/15 6:26, Alvaro Herrera wrote:
Another option is to rethink this feature from the ground up: instead of
cloning catalog rows for each children, maybe we should have the trigger
lookup code, when running DML on the child relation (the partition),
obtain trigger entries not only for the child relation itself but also
for its parents recursively -- so triggers defined in the parent are
fired for the partitions, too. I'm not sure what implications this has
for constraint triggers.The behavior should be the same, except that you cannot modify the
trigger (firing conditions, etc) on the partition individually -- it
works at the level of the whole partitioned table instead.
Do you mean to fire these triggers only if the parent table (not a child
table/partition) is addressed in the DML, right? If the table directly
addressed in the DML is a partition whose parent has a row-level trigger,
then that trigger should not get fired I suppose.
Thanks,
Amit
Amit Langote wrote:
On 2018/02/15 6:26, Alvaro Herrera wrote:
Another option is to rethink this feature from the ground up: instead of
cloning catalog rows for each children, maybe we should have the trigger
lookup code, when running DML on the child relation (the partition),
obtain trigger entries not only for the child relation itself but also
for its parents recursively -- so triggers defined in the parent are
fired for the partitions, too. I'm not sure what implications this has
for constraint triggers.The behavior should be the same, except that you cannot modify the
trigger (firing conditions, etc) on the partition individually -- it
works at the level of the whole partitioned table instead.Do you mean to fire these triggers only if the parent table (not a child
table/partition) is addressed in the DML, right? If the table directly
addressed in the DML is a partition whose parent has a row-level trigger,
then that trigger should not get fired I suppose.
No, I think that would be strange and cause data inconsistencies.
Inserting directly into the partition is seen as a performance
optimization (compared to inserted into the partitioned table), so we
don't get to skip firing the triggers defined on the parent because the
behavior would become different. In other words, the performance
optimization breaks the database.
Example: suppose the trigger is used to maintain an audit record trail.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018/02/16 6:55, Alvaro Herrera wrote:
Amit Langote wrote:
On 2018/02/15 6:26, Alvaro Herrera wrote:
Another option is to rethink this feature from the ground up: instead of
cloning catalog rows for each children, maybe we should have the trigger
lookup code, when running DML on the child relation (the partition),
obtain trigger entries not only for the child relation itself but also
for its parents recursively -- so triggers defined in the parent are
fired for the partitions, too. I'm not sure what implications this has
for constraint triggers.The behavior should be the same, except that you cannot modify the
trigger (firing conditions, etc) on the partition individually -- it
works at the level of the whole partitioned table instead.Do you mean to fire these triggers only if the parent table (not a child
table/partition) is addressed in the DML, right? If the table directly
addressed in the DML is a partition whose parent has a row-level trigger,
then that trigger should not get fired I suppose.No, I think that would be strange and cause data inconsistencies.
Inserting directly into the partition is seen as a performance
optimization (compared to inserted into the partitioned table), so we
don't get to skip firing the triggers defined on the parent because the
behavior would become different. In other words, the performance
optimization breaks the database.
OK, that makes sense.
Thanks,
Amit
On 2/15/18 16:55, Alvaro Herrera wrote:
Amit Langote wrote:
Do you mean to fire these triggers only if the parent table (not a child
table/partition) is addressed in the DML, right? If the table directly
addressed in the DML is a partition whose parent has a row-level trigger,
then that trigger should not get fired I suppose.No, I think that would be strange and cause data inconsistencies.
Inserting directly into the partition is seen as a performance
optimization (compared to inserted into the partitioned table), so we
don't get to skip firing the triggers defined on the parent because the
behavior would become different. In other words, the performance
optimization breaks the database.Example: suppose the trigger is used to maintain an audit record trail.
Although this situation could probably be addressed by not giving
permission to write directly into the partitions, I can't think of an
example where one would want a trigger that is only fired when writing
into the partition root rather than into the partition directly.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera wrote:
Another option is to rethink this feature from the ground up: instead of
cloning catalog rows for each children, maybe we should have the trigger
lookup code, when running DML on the child relation (the partition),
obtain trigger entries not only for the child relation itself but also
for its parents recursively -- so triggers defined in the parent are
fired for the partitions, too.
I have written this, and it seems to work fine; it's attached.
Generally speaking, I like this better than my previously proposed
patch: having duplicate pg_trigger rows seems lame, in hindsight.
I haven't measured the performance loss, but we now scan pg_inherits
each time we build a relcache entry and relhastriggers=on. Can't be
good. In general, the pg_inherits stuff looks generally unnatural --
manually doing scans upwards (find parents) and downwards (find
partitions). It's messy and there are no nice abstractions.
Partitioning looks too much bolted-on still.
We could mitigate the performance loss to some extent by adding more to
RelationData. For example, a "is_partition" boolean would help: skip
searching pg_inherits for a relation that is not a partition. The
indexing patch already added some "has_superclass()" calls and they look
somewhat out of place. Also, we could add a syscache to pg_inherits.
Regarding making partitioning feel more natural, we could add some API
"list all ancestors", "list all descendents". Maybe I should have used
find_inheritance_children.
Some cutesy: scanning multiple parents looking for potential triggers
means the order of indexscan results no longer guarantees the correct
ordering. I had to add a qsort() there.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v4-0001-Allow-FOR-EACH-ROW-triggers-on-partitioned-tables.patchtext/plain; charset=us-asciiDownload+711-174
On 2018/02/23 8:52, Alvaro Herrera wrote:
We could mitigate the performance loss to some extent by adding more to
RelationData. For example, a "is_partition" boolean would help: skip
searching pg_inherits for a relation that is not a partition.
Unless I'm missing something, doesn't rd_rel->relispartition help?
Thanks,
Amit
Amit Langote wrote:
On 2018/02/23 8:52, Alvaro Herrera wrote:
We could mitigate the performance loss to some extent by adding more to
RelationData. For example, a "is_partition" boolean would help: skip
searching pg_inherits for a relation that is not a partition.Unless I'm missing something, doesn't rd_rel->relispartition help?
Uh, wow, how have I missed that all this time! Yes, it probably does.
I'll rework this tomorrow ... and the already committed index patch too,
I think.
Thanks
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Feb 23, 2018 at 11:32 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
Amit Langote wrote:
On 2018/02/23 8:52, Alvaro Herrera wrote:
We could mitigate the performance loss to some extent by adding more to
RelationData. For example, a "is_partition" boolean would help: skip
searching pg_inherits for a relation that is not a partition.Unless I'm missing something, doesn't rd_rel->relispartition help?
Uh, wow, how have I missed that all this time! Yes, it probably does.
I'll rework this tomorrow ... and the already committed index patch too,
I think.
BTW, not sure if you'd noticed but I had emailed about setting
relispartition on index partitions after you committed the first
indexes patch.
/messages/by-id/12085bc4-0bc6-0f3a-4c43-57fe0681772b@lab.ntt.co.jp
Thanks,
Amit
Amit Langote wrote:
On Fri, Feb 23, 2018 at 11:32 AM, Alvaro Herrera
<alvherre@alvh.no-ip.org> wrote:
Uh, wow, how have I missed that all this time! Yes, it probably does.
I'll rework this tomorrow ... and the already committed index patch too,
I think.BTW, not sure if you'd noticed but I had emailed about setting
relispartition on index partitions after you committed the first
indexes patch.
I hadn't noticed. These days sadly I'm not able to keep up with all
pgsql-hackers traffic, and I'm quite likely to miss things unless I'm
CCed. This seems true for many others, too.
Thanks for pointing it out.
--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services