PATCH: Add REINDEX tag to event triggers
Added my v1 patch to add REINDEX to event triggers.
I originally built this against pg15 but rebased to master for the patch
to hopefully make it easier for maintainers to merge. The rebase was
automatic so it should be easy to include into any recent version. I'd love
for this to land in pg16 if possible.
I added regression tests and they are passing. I also formatted the code
using the project tools. Docs are included too.
A few notes:
1. I tried to not touch the function parameters too much and opted to
create a convenience function that makes it easy to attach index or table
OIDs to the current event trigger list.
2. I made sure both concurrent and regular reindex of index, table,
database work as expected.
3. The ddl end command will make available all indexes that were modified
by the reindex command. This is a large list when you run "reindex
database" but works really well. I debated returning records of the table
or DB that were reindexed but that doesn't really make sense. Returning
each index fits my use case of building an audit record around the index
lifecycle (hence the motivation for the patch).
Thanks for your time,
Garrett Thornburg
Attachments:
v1-0001-Add-REINDEX-tag-to-event-triggers.patchapplication/octet-stream; name=v1-0001-Add-REINDEX-tag-to-event-triggers.patchDownload+170-4
On Thu, Jul 20, 2023 at 10:47:00PM -0600, Garrett Thornburg wrote:
Added my v1 patch to add REINDEX to event triggers.
I originally built this against pg15 but rebased to master for the patch
to hopefully make it easier for maintainers to merge. The rebase was
automatic so it should be easy to include into any recent version. I'd love
for this to land in pg16 if possible.
The development of Postgres 16 has finished last April and this
version is in a stabilization period. We are currently running the
first commit fest of PostgreSQL 17, though. I would recommend to
register your patch here so as it is not lost:
https://commitfest.postgresql.org/44/
I added regression tests and they are passing. I also formatted the code
using the project tools. Docs are included too.
Hmm. What are the use cases you have in mind where you need to know
the list of indexes listed by an event trigger like this? Just
monitoring to know which indexes have been rebuilt? We have a
table_rewrite which is equivalent to check if a relfilenode has
changed, so why not, I guess..
Assuming that ddl_command_start is not supported is incorrect. For
example, with your patch applied, if I do the following setup:
create event trigger regress_event_trigger on ddl_command_start
execute procedure test_event_trigger();
create function test_event_trigger() returns event_trigger as $$
BEGIN
RAISE NOTICE 'test_event_trigger: % %', tg_event, tg_tag;
END
$$ language plpgsql;
Then a REINDEX gives me that:
reindex table pg_class;
NOTICE: 00000: test_event_trigger: ddl_command_start REINDEX
The first paragraphs of event-trigger.sgml describe the following:
The <literal>ddl_command_start</literal> event occurs just before the
execution of a <literal>CREATE</literal>, <literal>ALTER</literal>, <literal>DROP</literal>,
<literal>SECURITY LABEL</literal>,
<literal>COMMENT</literal>, <literal>GRANT</literal> or <literal>REVOKE</literal>
[...]
The <literal>ddl_command_end</literal> event occurs just after the execution of
this same set of commands. To obtain more details on the <acronym>DDL</acronym>
So it would need a refresh to list REINDEX.
case T_ReindexStmt:
- lev = LOGSTMT_ALL; /* should this be DDL? */
+ lev = LOGSTMT_DDL;
This, unfortunately, may not be as right as it is simple because
REINDEX is not a DDL as far as I know (it is not in the SQL
standard).
--
Michael
On Fri, Jul 21, 2023 at 12:47 PM Garrett Thornburg <film42@gmail.com> wrote:
Added my v1 patch to add REINDEX to event triggers.
I originally built this against pg15 but rebased to master for the patch to hopefully make it easier for maintainers to merge. The rebase was automatic so it should be easy to include into any recent version. I'd love for this to land in pg16 if possible.
I added regression tests and they are passing. I also formatted the code using the project tools. Docs are included too.
A few notes:
1. I tried to not touch the function parameters too much and opted to create a convenience function that makes it easy to attach index or table OIDs to the current event trigger list.
2. I made sure both concurrent and regular reindex of index, table, database work as expected.
3. The ddl end command will make available all indexes that were modified by the reindex command. This is a large list when you run "reindex database" but works really well. I debated returning records of the table or DB that were reindexed but that doesn't really make sense. Returning each index fits my use case of building an audit record around the index lifecycle (hence the motivation for the patch).Thanks for your time,
Garrett Thornburg
main/postgres/src/backend/tcop/utility.c
535: /*
536: * standard_ProcessUtility itself deals only with utility commands for
537: * which we do not provide event trigger support. Commands that do have
538: * such support are passed down to ProcessUtilitySlow, which contains the
539: * necessary infrastructure for such triggers.
540: *
541: * This division is not just for performance: it's critical that the
542: * event trigger code not be invoked when doing START TRANSACTION for
543: * example, because we might need to refresh the event trigger cache,
544: * which requires being in a valid transaction.
545: */
so T_ReindexStmt should only be in ProcessUtilitySlow, if you want
to create an event trigger on reindex?
regression tests work fine. I even play with partitions.
On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
so T_ReindexStmt should only be in ProcessUtilitySlow, if you want
to create an event trigger on reindex?regression tests work fine. I even play with partitions.
It would be an idea to have some regression tests for partitions,
actually, so as some patterns around ReindexMultipleInternal() are
checked. We could have a REINDEX DATABASE in a TAP test with an event
trigger, as well, but I don't feel strongly about the need to do that
much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
partitions cover the multi-table case.
--
Michael
On Wed, Jul 26, 2023 at 7:51 AM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
so T_ReindexStmt should only be in ProcessUtilitySlow, if you want
to create an event trigger on reindex?regression tests work fine. I even play with partitions.
It would be an idea to have some regression tests for partitions,
actually, so as some patterns around ReindexMultipleInternal() are
checked. We could have a REINDEX DATABASE in a TAP test with an event
trigger, as well, but I don't feel strongly about the need to do that
much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
partitions cover the multi-table case.
--
Michael
quite verbose, copied from partition-info.sql. meet the expectation:
partitioned index will do nothing, partition index will trigger event
trigger.
------------------------------------------------
DROP EVENT TRIGGER IF EXISTS end_reindex_command CASCADE;
DROP EVENT TRIGGER IF EXISTS start_reindex_command CASCADE;
BEGIN;
CREATE OR REPLACE FUNCTION reindex_end_command()
RETURNS event_trigger AS $$
DECLARE
obj record;
BEGIN
raise notice 'begin of reindex_end_command';
FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
RAISE NOTICE
'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
,obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
RAISE NOTICE 'ddl_end_command -- REINDEX: %', pg_get_indexdef(obj.objid);
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION start_reindex_command()
RETURNS event_trigger AS $$
DECLARE
obj record;
BEGIN
FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
RAISE NOTICE
'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
, obj.command_tag, obj.object_type,obj.schema_name,obj.object_identity;
RAISE NOTICE 'ddl_start_command -- REINDEX: %',
pg_get_indexdef(obj.objid);
END LOOP;
raise notice 'end of start_reindex_command';
END;
$$ LANGUAGE plpgsql;
BEGIN;
CREATE EVENT TRIGGER end_reindex_command ON ddl_command_end
WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE reindex_end_command();
CREATE EVENT TRIGGER start_reindex_command ON ddl_command_start
WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE start_reindex_command();
COMMIT;
-- test Reindex Event Trigger
BEGIN;
drop table if EXISTS ptif_test CASCADE;
CREATE TABLE ptif_test (a int, b int) PARTITION BY range (a);
CREATE TABLE ptif_test0 PARTITION OF ptif_test
FOR VALUES FROM (minvalue) TO (0) PARTITION BY list (b);
CREATE TABLE ptif_test01 PARTITION OF ptif_test0 FOR VALUES IN (1);
CREATE TABLE ptif_test1 PARTITION OF ptif_test
FOR VALUES FROM (0) TO (100) PARTITION BY list (b);
CREATE TABLE ptif_test11 PARTITION OF ptif_test1 FOR VALUES IN (1);
CREATE TABLE ptif_test2 PARTITION OF ptif_test
FOR VALUES FROM (100) TO (200);
-- This partitioned table should remain with no partitions.
CREATE TABLE ptif_test3 PARTITION OF ptif_test
FOR VALUES FROM (200) TO (maxvalue) PARTITION BY list (b);
-- Test index partition tree
CREATE INDEX ptif_test_index ON ONLY ptif_test (a);
CREATE INDEX ptif_test0_index ON ONLY ptif_test0 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test0_index;
CREATE INDEX ptif_test01_index ON ptif_test01 (a);
ALTER INDEX ptif_test0_index ATTACH PARTITION ptif_test01_index;
CREATE INDEX ptif_test1_index ON ONLY ptif_test1 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test1_index;
CREATE INDEX ptif_test11_index ON ptif_test11 (a);
ALTER INDEX ptif_test1_index ATTACH PARTITION ptif_test11_index;
CREATE INDEX ptif_test2_index ON ptif_test2 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test2_index;
CREATE INDEX ptif_test3_index ON ptif_test3 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test3_index;
COMMIT;
--top level partitioned index. will recurse to each partition index.
REINDEX INDEX CONCURRENTLY public.ptif_test_index;
--ptif_test0 is partitioned table. it will index partition: ptif_test01_index
-- event trigger will log ptif_test01_index
REINDEX INDEX CONCURRENTLY public.ptif_test0_index;
--ptif_test1_index is partitioned index. it will index partition:
ptif_test11_index
-- event trigger will effect on partion index:ptif_test11_index
REINDEX INDEX CONCURRENTLY public.ptif_test1_index;
--ptif_test2 is a partition. event trigger will log ptif_test2_index
REINDEX INDEX CONCURRENTLY public.ptif_test2_index;
--no partitions. event trigger won't do anything.
REINDEX INDEX CONCURRENTLY public.ptif_test3_index;
reindex table ptif_test; --top level. will recurse to each partition index.
reindex table ptif_test0; -- will direct to ptif_test01
reindex table ptif_test01; -- will index it's associtaed index
reindex table ptif_test11; -- will index it's associtaed index
reindex table ptif_test2; -- will index it's associtaed index
reindex table ptif_test3; -- no partion, index won't do anything.
DROP EVENT TRIGGER IF EXISTS end_reindex_command CASCADE;
DROP EVENT TRIGGER IF EXISTS start_reindex_command CASCADE;
DROP FUNCTION IF EXISTS reindex_start_command;
DROP FUNCTION IF EXISTS reindex_end_command;
DROP TABLE if EXISTS ptif_test CASCADE;
-----------------------
Thanks! I added the patch to the commitfest. TIL.
Hmm. What are the use cases you have in mind where you need to know the
list of indexes listed by an event trigger like this?
Direct use case is general index health and rebuild. We had a recent
upgrade go poorly (our fault) that left a bunch of indexes in a corrupted
state. We ran a mass rebuild a few times. Would have been great to be able
to observe changes happening in scripts. That left us wishing there was a
way to monitor all changes around indexes. Figured I'd add support since my
first thread seemed somewhat supportive of the idea.
We have a table_rewrite which is equivalent to check if a relfilenode has
changed, so why not, I guess..
Yeah. I was actually debating if we need to add a new reindex event instead
of building on top of the DDL start/end. You are right that the
documentation does not include nonsql commands like reindex. Maybe I should
consider going that direction again?
Assuming that ddl_command_start is not supported is incorrect.
Also great callout. Using `SELECT * FROM pg_event_trigger_ddl_commands()`
in the event does not yield anything. Bad assumption there. I'll add a test
with my next patch version.
Re: LOGSTMT_ALL
Agree here. I think this is more for reference than anything else so I'm
fine to call it LOGSTMT_ALL, but that goes back to the previous question
around ddl events vs a new event similar to table_rewrite.
P.S. Sorry for the double post. Sent from the wrong email so the mailing
list didn't get the message.
On Tue, Jul 25, 2023 at 12:55 AM Michael Paquier <michael@paquier.xyz>
wrote:
Show quoted text
On Thu, Jul 20, 2023 at 10:47:00PM -0600, Garrett Thornburg wrote:
Added my v1 patch to add REINDEX to event triggers.
I originally built this against pg15 but rebased to master for the patch
to hopefully make it easier for maintainers to merge. The rebase was
automatic so it should be easy to include into any recent version. I'dlove
for this to land in pg16 if possible.
The development of Postgres 16 has finished last April and this
version is in a stabilization period. We are currently running the
first commit fest of PostgreSQL 17, though. I would recommend to
register your patch here so as it is not lost:
https://commitfest.postgresql.org/44/I added regression tests and they are passing. I also formatted the code
using the project tools. Docs are included too.Hmm. What are the use cases you have in mind where you need to know
the list of indexes listed by an event trigger like this? Just
monitoring to know which indexes have been rebuilt? We have a
table_rewrite which is equivalent to check if a relfilenode has
changed, so why not, I guess..Assuming that ddl_command_start is not supported is incorrect. For
example, with your patch applied, if I do the following setup:
create event trigger regress_event_trigger on ddl_command_start
execute procedure test_event_trigger();
create function test_event_trigger() returns event_trigger as $$
BEGIN
RAISE NOTICE 'test_event_trigger: % %', tg_event, tg_tag;
END
$$ language plpgsql;Then a REINDEX gives me that:
reindex table pg_class;
NOTICE: 00000: test_event_trigger: ddl_command_start REINDEXThe first paragraphs of event-trigger.sgml describe the following:
The <literal>ddl_command_start</literal> event occurs just before the
execution of a <literal>CREATE</literal>, <literal>ALTER</literal>,
<literal>DROP</literal>,
<literal>SECURITY LABEL</literal>,
<literal>COMMENT</literal>, <literal>GRANT</literal> or
<literal>REVOKE</literal>
[...]
The <literal>ddl_command_end</literal> event occurs just after the
execution of
this same set of commands. To obtain more details on the
<acronym>DDL</acronym>So it would need a refresh to list REINDEX.
case T_ReindexStmt: - lev = LOGSTMT_ALL; /* should this be DDL? */ + lev = LOGSTMT_DDL;This, unfortunately, may not be as right as it is simple because
REINDEX is not a DDL as far as I know (it is not in the SQL
standard).
--
Michael
Thank you for this, jian he! I will include it in the next patch version.
P.S. Sorry for the double post. Sent from the wrong email address so I'm
resending so the mailing list gets the email. My apologies!
On Wed, Jul 26, 2023 at 4:30 AM jian he <jian.universality@gmail.com> wrote:
Show quoted text
On Wed, Jul 26, 2023 at 7:51 AM Michael Paquier <michael@paquier.xyz>
wrote:On Tue, Jul 25, 2023 at 04:34:47PM +0800, jian he wrote:
so T_ReindexStmt should only be in ProcessUtilitySlow, if you want
to create an event trigger on reindex?regression tests work fine. I even play with partitions.
It would be an idea to have some regression tests for partitions,
actually, so as some patterns around ReindexMultipleInternal() are
checked. We could have a REINDEX DATABASE in a TAP test with an event
trigger, as well, but I don't feel strongly about the need to do that
much extra work in 090_reindexdb.pl or 091_reindexdb_all.pl if
partitions cover the multi-table case.
--
Michaelquite verbose, copied from partition-info.sql. meet the expectation:
partitioned index will do nothing, partition index will trigger event
trigger.
------------------------------------------------
DROP EVENT TRIGGER IF EXISTS end_reindex_command CASCADE;
DROP EVENT TRIGGER IF EXISTS start_reindex_command CASCADE;BEGIN;
CREATE OR REPLACE FUNCTION reindex_end_command()
RETURNS event_trigger AS $$
DECLARE
obj record;
BEGIN
raise notice 'begin of reindex_end_command';FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
RAISE NOTICE
'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
,obj.command_tag,
obj.object_type,obj.schema_name,obj.object_identity;
RAISE NOTICE 'ddl_end_command -- REINDEX: %',
pg_get_indexdef(obj.objid);
END LOOP;
END;
$$ LANGUAGE plpgsql;CREATE OR REPLACE FUNCTION start_reindex_command()
RETURNS event_trigger AS $$
DECLARE
obj record;
BEGIN
FOR obj IN SELECT * FROM pg_event_trigger_ddl_commands()
LOOP
RAISE NOTICE
'obj.command_tag:% obj.object_type:% obj.schema_name:%
obj.object_identity:%'
, obj.command_tag,
obj.object_type,obj.schema_name,obj.object_identity;
RAISE NOTICE 'ddl_start_command -- REINDEX: %',
pg_get_indexdef(obj.objid);
END LOOP;
raise notice 'end of start_reindex_command';
END;
$$ LANGUAGE plpgsql;BEGIN;
CREATE EVENT TRIGGER end_reindex_command ON ddl_command_end
WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE reindex_end_command();
CREATE EVENT TRIGGER start_reindex_command ON ddl_command_start
WHEN TAG IN ('REINDEX') EXECUTE PROCEDURE start_reindex_command();
COMMIT;-- test Reindex Event Trigger
BEGIN;
drop table if EXISTS ptif_test CASCADE;
CREATE TABLE ptif_test (a int, b int) PARTITION BY range (a);
CREATE TABLE ptif_test0 PARTITION OF ptif_test
FOR VALUES FROM (minvalue) TO (0) PARTITION BY list (b);
CREATE TABLE ptif_test01 PARTITION OF ptif_test0 FOR VALUES IN (1);
CREATE TABLE ptif_test1 PARTITION OF ptif_test
FOR VALUES FROM (0) TO (100) PARTITION BY list (b);
CREATE TABLE ptif_test11 PARTITION OF ptif_test1 FOR VALUES IN (1);CREATE TABLE ptif_test2 PARTITION OF ptif_test
FOR VALUES FROM (100) TO (200);
-- This partitioned table should remain with no partitions.
CREATE TABLE ptif_test3 PARTITION OF ptif_test
FOR VALUES FROM (200) TO (maxvalue) PARTITION BY list (b);-- Test index partition tree
CREATE INDEX ptif_test_index ON ONLY ptif_test (a);CREATE INDEX ptif_test0_index ON ONLY ptif_test0 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test0_index;CREATE INDEX ptif_test01_index ON ptif_test01 (a);
ALTER INDEX ptif_test0_index ATTACH PARTITION ptif_test01_index;CREATE INDEX ptif_test1_index ON ONLY ptif_test1 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test1_index;CREATE INDEX ptif_test11_index ON ptif_test11 (a);
ALTER INDEX ptif_test1_index ATTACH PARTITION ptif_test11_index;CREATE INDEX ptif_test2_index ON ptif_test2 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test2_index;CREATE INDEX ptif_test3_index ON ptif_test3 (a);
ALTER INDEX ptif_test_index ATTACH PARTITION ptif_test3_index;
COMMIT;--top level partitioned index. will recurse to each partition index.
REINDEX INDEX CONCURRENTLY public.ptif_test_index;--ptif_test0 is partitioned table. it will index partition:
ptif_test01_index
-- event trigger will log ptif_test01_index
REINDEX INDEX CONCURRENTLY public.ptif_test0_index;--ptif_test1_index is partitioned index. it will index partition:
ptif_test11_index
-- event trigger will effect on partion index:ptif_test11_index
REINDEX INDEX CONCURRENTLY public.ptif_test1_index;--ptif_test2 is a partition. event trigger will log ptif_test2_index
REINDEX INDEX CONCURRENTLY public.ptif_test2_index;--no partitions. event trigger won't do anything.
REINDEX INDEX CONCURRENTLY public.ptif_test3_index;reindex table ptif_test; --top level. will recurse to each partition
index.
reindex table ptif_test0; -- will direct to ptif_test01
reindex table ptif_test01; -- will index it's associtaed index
reindex table ptif_test11; -- will index it's associtaed index
reindex table ptif_test2; -- will index it's associtaed index
reindex table ptif_test3; -- no partion, index won't do anything.DROP EVENT TRIGGER IF EXISTS end_reindex_command CASCADE;
DROP EVENT TRIGGER IF EXISTS start_reindex_command CASCADE;
DROP FUNCTION IF EXISTS reindex_start_command;
DROP FUNCTION IF EXISTS reindex_end_command;
DROP TABLE if EXISTS ptif_test CASCADE;
-----------------------
I added a v2 patch for adding REINDEX to event triggers. The following has
changed:
1. I fixed the docs to note that ddl_command_start is supported for
REINDEX. Thanks, Michael!
2. I added Jian He's excellent partition table test and updated the
expectations to include that regression test.
What is still up for debate:
Michael pointed out that REINDEX probably isn't DDL because the docs only
specify commands in the standard like CREATE, ALTER, etc. It might be worth
creating a new event to trigger on (similar to what was done for
table_rewrite) to make a REINDEX trigger fit in a little nicer. Let me
know what you think here. Until then, I left the code as `lev =
LOGSTMT_DDL` for `T_ReindexStmt`.
Thanks,
Garrett
On Thu, Jul 20, 2023 at 10:47 PM Garrett Thornburg <film42@gmail.com> wrote:
Show quoted text
Added my v1 patch to add REINDEX to event triggers.
I originally built this against pg15 but rebased to master for the patch
to hopefully make it easier for maintainers to merge. The rebase was
automatic so it should be easy to include into any recent version. I'd love
for this to land in pg16 if possible.I added regression tests and they are passing. I also formatted the code
using the project tools. Docs are included too.A few notes:
1. I tried to not touch the function parameters too much and opted to
create a convenience function that makes it easy to attach index or table
OIDs to the current event trigger list.
2. I made sure both concurrent and regular reindex of index, table,
database work as expected.
3. The ddl end command will make available all indexes that were modified
by the reindex command. This is a large list when you run "reindex
database" but works really well. I debated returning records of the table
or DB that were reindexed but that doesn't really make sense. Returning
each index fits my use case of building an audit record around the index
lifecycle (hence the motivation for the patch).Thanks for your time,
Garrett Thornburg
Attachments:
v2-0001-Add-REINDEX-tag-to-event-triggers.patchapplication/octet-stream; name=v2-0001-Add-REINDEX-tag-to-event-triggers.patchDownload+327-4
Greetings
On 27.07.23 06:43, Garrett Thornburg wrote:
I added a v2 patch for adding REINDEX to event triggers. The following
has changed:1. I fixed the docs to note that ddl_command_start is supported for
REINDEX. Thanks, Michael!
2. I added Jian He's excellent partition table test and updated the
expectations to include that regression test.What is still up for debate:
Michael pointed out that REINDEX probably isn't DDL because the docs
only specify commands in the standard like CREATE, ALTER, etc. It
might be worth creating a new event to trigger on (similar to what was
done for table_rewrite) to make a REINDEX trigger fit in a little
nicer. Let me know what you think here. Until then, I left the code as
`lev = LOGSTMT_DDL` for `T_ReindexStmt`.Thanks,
Garrett
Thanks for the patch, Garrett!
I was unable to apply it in 7ef5f5f
<https://github.com/postgres/postgres/commit/7ef5f5fb324096c7822c922ad59fd7fdd76f57b1>
$ git apply -v v2-0001-Add-REINDEX-tag-to-event-triggers.patch
Checking patch doc/src/sgml/event-trigger.sgml...
Checking patch src/backend/commands/indexcmds.c...
error: while searching for:
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
bool isTopLevel);
static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
Oid relId, Oid oldRelId, void *arg);
static Oid ReindexTable(RangeVar *relation, ReindexParams *params,
error: patch failed: src/backend/commands/indexcmds.c:94
error: src/backend/commands/indexcmds.c: patch does not apply
Checking patch src/backend/tcop/utility.c...
Checking patch src/include/tcop/cmdtaglist.h...
Checking patch src/test/regress/expected/event_trigger.out...
Hunk #1 succeeded at 556 (offset 2 lines).
Checking patch src/test/regress/sql/event_trigger.sql...
Am I missing something?
Jim
On Wed, Aug 30, 2023 at 8:38 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
Greetings
I was unable to apply it in 7ef5f5f
$ git apply -v v2-0001-Add-REINDEX-tag-to-event-triggers.patch
Checking patch doc/src/sgml/event-trigger.sgml...
Checking patch src/backend/commands/indexcmds.c...
error: while searching for:
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
bool isTopLevel);
static void RangeVarCallbackForReindexIndex(const RangeVar *relation,
Oid relId, Oid oldRelId, void *arg);
static Oid ReindexTable(RangeVar *relation, ReindexParams *params,error: patch failed: src/backend/commands/indexcmds.c:94
error: src/backend/commands/indexcmds.c: patch does not apply
Checking patch src/backend/tcop/utility.c...
Checking patch src/include/tcop/cmdtaglist.h...
Checking patch src/test/regress/expected/event_trigger.out...
Hunk #1 succeeded at 556 (offset 2 lines).
Checking patch src/test/regress/sql/event_trigger.sql...Am I missing something?
Jim
because the change made in here:
https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
I manually slight edited the patch content:
from
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
bool isTopLevel);
to
static List *ChooseIndexColumnNames(const List *indexElems);
static void ReindexIndex(const RangeVar *indexRelation, const
ReindexParams *params,
bool isTopLevel);
can you try the attached one.
Attachments:
v2-0002-Add-REINDEX-tag-to-event-triggers.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Add-REINDEX-tag-to-event-triggers.patchDownload+327-3
On 01.09.23 11:23, jian he wrote:
because the change made in here:
https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7I manually slight edited the patch content:
from
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
bool isTopLevel);
to
static List *ChooseIndexColumnNames(const List *indexElems);
static void ReindexIndex(const RangeVar *indexRelation, const
ReindexParams *params,
bool isTopLevel);can you try the attached one.
The patch applies cleanly. However, now the compiler complains about a
qualifier mismatch
indexcmds.c:2707:33: warning: assignment discards ‘const’ qualifier from
pointer target type [-Wdiscarded-qualifiers]
2707 | currentReindexStatement = stmt;
Perhaps 'const ReindexStmt *currentReindexStatement'?
Tests also work just fine. I also tested it against unique and partial
indexes, and it worked as expected.
Jim
On Fri, Sep 1, 2023 at 6:40 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
On 01.09.23 11:23, jian he wrote:
because the change made in here:
https://git.postgresql.org/cgit/postgresql.git/diff/src/backend/commands/indexcmds.c?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7I manually slight edited the patch content:
from
static List *ChooseIndexColumnNames(List *indexElems);
static void ReindexIndex(RangeVar *indexRelation, ReindexParams *params,
bool isTopLevel);
to
static List *ChooseIndexColumnNames(const List *indexElems);
static void ReindexIndex(const RangeVar *indexRelation, const
ReindexParams *params,
bool isTopLevel);can you try the attached one.
The patch applies cleanly. However, now the compiler complains about a
qualifier mismatchindexcmds.c:2707:33: warning: assignment discards ‘const’ qualifier from
pointer target type [-Wdiscarded-qualifiers]
2707 | cPerhaps 'const ReindexStmt *currentReindexStatement'?
Tests also work just fine. I also tested it against unique and partial
indexes, and it worked as expected.Jim
Thanks for pointing this out!
also due to commit
https://git.postgresql.org/cgit/postgresql.git/commit/?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
ExecReindex function input arguments also changed. so we need to
rebase this patch.
change to
currentReindexStatement = unconstify(ReindexStmt*, stmt);
seems to work for me. I tested it, no warning. But I am not 100% sure.
anyway I refactored the patch, making it git applyable
also change from "currentReindexStatement = stmt;" to
"currentReindexStatement = unconstify(ReindexStmt*, stmt);" due to
ExecReindex function changes.
Attachments:
v3-0001-Add-REINDEX-tag-to-event-triggers.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Add-REINDEX-tag-to-event-triggers.patchDownload+327-4
On 01.09.23 18:10, jian he wrote:
Thanks for pointing this out!
Thanks for the fix!
also due to commit
https://git.postgresql.org/cgit/postgresql.git/commit/?id=11af63fb48d278b86aa948a5b57f136ef03c2bb7
ExecReindex function input arguments also changed. so we need to
rebase this patch.change to
currentReindexStatement = unconstify(ReindexStmt*, stmt);
seems to work for me. I tested it, no warning. But I am not 100% sure.anyway I refactored the patch, making it git applyable
also change from "currentReindexStatement = stmt;" to
"currentReindexStatement = unconstify(ReindexStmt*, stmt);" due to
ExecReindex function changes.
LGTM. It applies and builds cleanly, all tests pass and documentation
also builds ok. The CFbot seems also much happier now :)
I tried this "small stress test" to see if the code was leaking .. but
it all looks ok to me:
DO $$
BEGIN
FOR i IN 1..10000 LOOP
REINDEX INDEX reindex_test.reindex_test1_idx1;
REINDEX TABLE reindex_test.reindex_tester1;
END LOOP;
END;
$$;
Jim
On Mon, Sep 04, 2023 at 08:00:52PM +0200, Jim Jones wrote:
LGTM. It applies and builds cleanly, all tests pass and documentation also
builds ok. The CFbot seems also much happier now :)
+ /*
+ * Open and lock the relation. ShareLock is sufficient since we only need
+ * to prevent schema and data changes in it. The lock level used here
+ * should match catalog's reindex_relation().
+ */
+ rel = try_table_open(relid, ShareLock);
I was eyeing at 0003, and this strikes me as incorrect. Sure, this
matches what reindex_relation() does, but you've missed that
CONCURRENTLY takes a lighter ShareUpdateExclusiveLock, and ShareLock
conflicts with it. See:
https://www.postgresql.org/docs/devel/explicit-locking.html
So, doesn't this disrupt a concurrent REINDEX?
--
Michael
On Fri, Oct 27, 2023 at 3:15 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Sep 04, 2023 at 08:00:52PM +0200, Jim Jones wrote:
LGTM. It applies and builds cleanly, all tests pass and documentation also
builds ok. The CFbot seems also much happier now :)+ /* + * Open and lock the relation. ShareLock is sufficient since we only need + * to prevent schema and data changes in it. The lock level used here + * should match catalog's reindex_relation(). + */ + rel = try_table_open(relid, ShareLock);I was eyeing at 0003, and this strikes me as incorrect. Sure, this
matches what reindex_relation() does, but you've missed that
CONCURRENTLY takes a lighter ShareUpdateExclusiveLock, and ShareLock
conflicts with it. See:
https://www.postgresql.org/docs/devel/explicit-locking.htmlSo, doesn't this disrupt a concurrent REINDEX?
--
Michael
ReindexPartitions called ReindexMultipleInternal
ReindexRelationConcurrently add reindex_event_trigger_collect to cover
it. (line 3869)
ReindexIndex has the function reindex_event_trigger_collect. (line 2853)
reindex_event_trigger_collect_relation called in
ReindexMultipleInternal, ReindexTable (line 2979).
Both are "under concurrent is false" branches.
So it should be fine.
On Sat, Oct 28, 2023 at 12:15:22PM +0800, jian he wrote:
reindex_event_trigger_collect_relation called in
ReindexMultipleInternal, ReindexTable (line 2979).
Both are "under concurrent is false" branches.So it should be fine.
Sorry for the delay in replying.
Indeed, you're right. reindex_event_trigger_collect_relation() gets
only called in two paths for the non-concurrent cases just after
reindex_relation(), which is not a safe location to call it because
this may be used in CLUSTER or TRUNCATE, and the intention of the
patch is only to count for the objects in REINDEX.
Anyway, I think that the current implementation is incorrect. The
object is collected after the reindex is done, which is right.
However, an object may be reindexed but not be reported if it was
dropped between the reindex and its endwhen collecting all the objects
of a single relation. Perhaps it does not matter because the object
is gone, but that still looks incorrect to me because we finish by not
reporting everything we should. I think that we should put the
collection deeper into the stack, where we know that we are holding
the locks on the objects we are collecting. Another side effect is
that the patch seems to be missing toast table indexes, which are also
reindexed for a REINDEX TABLE, for instance. That's not right.
Actually, could there be an argument for pushing the collection down
to reindex_relation() instead to count for all the commands that?
Even better, reindex_index() would be a better candidate because it is
itself called within reindex_relation() for each individual indexes?
If a code path should not be covered for event triggers, we could use
a new REINDEXOPT_ to control that, optionally. In short, it looks
like we should try to reduce the number of paths calling
reindex_event_trigger_collect(), while collect_relation() ought to be
removed.
Should REINDEX TABLE CONCURRENTLY's coverage be extended so as two new
indexes are reported instead of just one?
REINDEX SCHEMA is a command that perhaps should be tested to collect
all the indexes rebuilt through it?
A side-effect of this patch is that it impacts ddl_command_start and
ddl_command_end with the addition of REINDEX. Mixing that with the
addition of a new event is strange. I think that the addition of
REINDEX to the existing events should be done first, as a separate
patch. Then a second patch should deal with the collection and the
support of the new event.
I have also dug into the archives to note that control commands have
been proposed in the past to be added as part of event triggers, and
it happens that I've mentioned in a comment that that this was a
concept perhaps contrary to what event triggers should do, as these
are intended for DDLs:
/messages/by-id/CAB7nPqTqZ2-YcNzOQ5KVBUJYHQ4kDSd4Q55Mc-fBzM8GH0bV2Q@mail.gmail.com
Philosophically, I'm open on all that and having more commands
depending on the cases that are satisfied, FWIW, but let's organize
the changes.
--
Michael
On Mon, Nov 20, 2023 at 3:34 PM Michael Paquier <michael@paquier.xyz> wrote:
On Sat, Oct 28, 2023 at 12:15:22PM +0800, jian he wrote:
reindex_event_trigger_collect_relation called in
ReindexMultipleInternal, ReindexTable (line 2979).
Both are "under concurrent is false" branches.So it should be fine.
Sorry for the delay in replying.
Indeed, you're right. reindex_event_trigger_collect_relation() gets
only called in two paths for the non-concurrent cases just after
reindex_relation(), which is not a safe location to call it because
this may be used in CLUSTER or TRUNCATE, and the intention of the
patch is only to count for the objects in REINDEX.Anyway, I think that the current implementation is incorrect. The
object is collected after the reindex is done, which is right.
However, an object may be reindexed but not be reported if it was
dropped between the reindex and its endwhen collecting all the objects
of a single relation. Perhaps it does not matter because the object
is gone, but that still looks incorrect to me because we finish by not
reporting everything we should. I think that we should put the
collection deeper into the stack, where we know that we are holding
the locks on the objects we are collecting. Another side effect is
that the patch seems to be missing toast table indexes, which are also
reindexed for a REINDEX TABLE, for instance. That's not right.Actually, could there be an argument for pushing the collection down
to reindex_relation() instead to count for all the commands that?
Even better, reindex_index() would be a better candidate because it is
itself called within reindex_relation() for each individual indexes?
If a code path should not be covered for event triggers, we could use
a new REINDEXOPT_ to control that, optionally. In short, it looks
like we should try to reduce the number of paths calling
reindex_event_trigger_collect(), while collect_relation() ought to be
removed.Should REINDEX TABLE CONCURRENTLY's coverage be extended so as two new
indexes are reported instead of just one?REINDEX SCHEMA is a command that perhaps should be tested to collect
all the indexes rebuilt through it?A side-effect of this patch is that it impacts ddl_command_start and
ddl_command_end with the addition of REINDEX. Mixing that with the
addition of a new event is strange. I think that the addition of
REINDEX to the existing events should be done first, as a separate
patch. Then a second patch should deal with the collection and the
support of the new event.I have also dug into the archives to note that control commands have
been proposed in the past to be added as part of event triggers, and
it happens that I've mentioned in a comment that that this was a
concept perhaps contrary to what event triggers should do, as these
are intended for DDLs:
/messages/by-id/CAB7nPqTqZ2-YcNzOQ5KVBUJYHQ4kDSd4Q55Mc-fBzM8GH0bV2Q@mail.gmail.comPhilosophically, I'm open on all that and having more commands
depending on the cases that are satisfied, FWIW, but let's organize
the changes.
--
Michael
Hi.
Top level func is ExecReIndex. it will call {ReindexIndex,
ReindexTable, ReindexMultipleTables}
then trace deep down, it will call {ReindexRelationConcurrently, reindex_index}.
Imitate the CREATE INDEX logic, we need pass `const ReindexStmt *stmt`
to all the following functions:
ReindexIndex
ReindexTable
ReindexMultipleTables
ReindexPartitions
ReindexMultipleInternal
ReindexRelationConcurrently
reindex_relation
reindex_index
because the EventTriggerCollectSimpleCommand needs the `(ReindexStmt
*) parsetree`.
and we call EventTriggerCollectSimpleCommand at reindex_index or
ReindexRelationConcurrently.
The new test will cover the following case.
reindex (concurrently) schema;
reindex (concurrently) partitioned index;
reindex (concurrently) partitioned table;
not sure this is the right approach. But now the reindex event
collection is after we call index_open.
same static function (reindex_event_trigger_collect) in
src/backend/commands/indexcmds.c and src/backend/catalog/index.c for
now.
given that ProcessUtilitySlow supports the event triggers facility.
so probably no need for another REINDEXOPT_ option?
I also added a test for disabled event triggers.
Attachments:
v4-0001-Add-REINDEX-tag-to-event-triggers.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Add-REINDEX-tag-to-event-triggers.patchDownload+242-39
On Fri, Nov 24, 2023 at 08:00:00AM +0800, jian he wrote:
not sure this is the right approach. But now the reindex event
collection is after we call index_open.
same static function (reindex_event_trigger_collect) in
src/backend/commands/indexcmds.c and src/backend/catalog/index.c for
now.
Knowing that there are two calls of reindex_relation() and four
callers of reindex_index(), passing the stmt looks acceptable to me to
be able to get more context from the reindex statement where a given
index has been rebuilt.
Anyway, as a whole, I am still confused by the shape of the patch..
This relies on pg_event_trigger_ddl_commands(), that reports a list of
DDL commands, but the use case you are proposing is to report a list
of the *indexes* rebuilt. So, in its current shape, we'd report the
equivalent of N DDL commands matching the N indexes rebuilt in a
single command. Shouldn't we use a new event type and a dedicated SQL
function to report the list of indexes newly rebuilt to get a full
list of the work that has happened?
given that ProcessUtilitySlow supports the event triggers facility.
so probably no need for another REINDEXOPT_ option?
Right, perhaps we don't need that if we just supply the stmt. If
there is no requirement for it, let's avoid that, then.
- ReindexMultipleTables(stmt->name, stmt->kind, ¶ms);
+ ReindexMultipleTables(stmt, stmt->name, stmt->kind, ¶ms);
If we begin including the stmt, there is no need for these extra
arguments in the extra routines of indexcmds.c.
As far as I can see, this patch is doing too much as presented. Could
you split the patch into more pieces, please? Based on v4 you have
sent, there are refactoring and basic piece parts like:
- Patch to make event triggers ddl_command_start and ddl_command_stop
react on ReindexStmt, so as a command is reported in
pg_event_trigger_ddl_commands().
- Patch for GetCommandLogLevel() to switch ReindexStmt from
LOGSTMT_ALL to LOGSTMT_DDL. True that this could be switched.
- Patch to refactor the routines of indexcmds.c and index.c to use the
ReindexStmt as argument, as a preparation of the next patch...
- Patch to add a new event type with a new SQL function to return a
list of the indexes rebuilt, with the ReindexStmt involved when the
index OID was trapped by the collection.
1) and 2) are a minimal feature in themselves, which may be enough to
satisfy your original case, and where you'd know when a REINDEX has
been run in event triggers. 3) and 4) are what you are trying to
achieve, to get the list of the indexes rebuilt knowing the context of
a given command.
--
Michael
On Fri, Nov 24, 2023 at 10:44 AM Michael Paquier <michael@paquier.xyz> wrote:
As far as I can see, this patch is doing too much as presented. Could
you split the patch into more pieces, please? Based on v4 you have
sent, there are refactoring and basic piece parts like:
- Patch to make event triggers ddl_command_start and ddl_command_stop
react on ReindexStmt, so as a command is reported in
pg_event_trigger_ddl_commands().
- Patch for GetCommandLogLevel() to switch ReindexStmt from
LOGSTMT_ALL to LOGSTMT_DDL. True that this could be switched.
- Patch to refactor the routines of indexcmds.c and index.c to use the
ReindexStmt as argument, as a preparation of the next patch...
- Patch to add a new event type with a new SQL function to return a
list of the indexes rebuilt, with the ReindexStmt involved when the
index OID was trapped by the collection.1) and 2) are a minimal feature in themselves, which may be enough to
satisfy your original case, and where you'd know when a REINDEX has
been run in event triggers. 3) and 4) are what you are trying to
achieve, to get the list of the indexes rebuilt knowing the context of
a given command.
--
Michael
hi.
v5-0001. changed the REINDEX command tag from event_trigger_ok: false
to event_trigger_ok: true.
Move ReindexStmt moves from standard_ProcessUtility to ProcessUtilitySlow.
By default ProcessUtilitySlow will call trigger related functions.
So the event trigger facility can support reindex statements.
v5-0002. In GetCommandLogLevel, change T_ReindexStmt from lev =
LOGSTMT_ALL to lev = LOGSTMT_DDL. so log_statement (enum) >= 'ddl'
will make the reindex statement be logged.
v5-0003. Refactor the following functions: {ReindexIndex,
ReindexTable, ReindexMultipleTables,
ReindexPartitions,ReindexMultipleInternal
,ReindexRelationConcurrently, reindex_relation,reindex_index} by
adding `const ReindexStmt *stmt` as their first argument.
This is for event trigger support reindex. We need to pass both the
newly generated indexId and the ReindexStmt to
EventTriggerCollectSimpleCommand right after the newly index gets
their lock. To do that, we have to refactor related functions.
v5-0004. Event trigger support reindex command implementation,
documentation, regress test, helper function pass reindex info to
function EventTriggerCollectSimpleCommand.
Attachments:
v5-0002-tag-ReindexStmt-as-ddl-in-GetCommandLogLevel.patchtext/x-patch; charset=US-ASCII; name=v5-0002-tag-ReindexStmt-as-ddl-in-GetCommandLogLevel.patchDownload+1-2
v5-0004-Event-trigger-support-reindex-command.patchtext/x-patch; charset=US-ASCII; name=v5-0004-Event-trigger-support-reindex-command.patchDownload+225-2
v5-0003-refactor-multiple-functions-to-support-tracking-r.patchtext/x-patch; charset=US-ASCII; name=v5-0003-refactor-multiple-functions-to-support-tracking-r.patchDownload+54-37
v5-0001-make-event-triggers-facility-react-to-ReindexStmt.patchtext/x-patch; charset=US-ASCII; name=v5-0001-make-event-triggers-facility-react-to-ReindexStmt.patchDownload+10-3
On Mon, Nov 27, 2023 at 11:00 AM jian he <jian.universality@gmail.com> wrote:
On Fri, Nov 24, 2023 at 10:44 AM Michael Paquier <michael@paquier.xyz> wrote:
hi.
v5-0001. changed the REINDEX command tag from event_trigger_ok: false
to event_trigger_ok: true.
Move ReindexStmt moves from standard_ProcessUtility to ProcessUtilitySlow.
By default ProcessUtilitySlow will call trigger related functions.
So the event trigger facility can support reindex statements.v5-0002. In GetCommandLogLevel, change T_ReindexStmt from lev =
LOGSTMT_ALL to lev = LOGSTMT_DDL. so log_statement (enum) >= 'ddl'
will make the reindex statement be logged.v5-0003. Refactor the following functions: {ReindexIndex,
ReindexTable, ReindexMultipleTables,
ReindexPartitions,ReindexMultipleInternal
,ReindexRelationConcurrently, reindex_relation,reindex_index} by
adding `const ReindexStmt *stmt` as their first argument.
This is for event trigger support reindex. We need to pass both the
newly generated indexId and the ReindexStmt to
EventTriggerCollectSimpleCommand right after the newly index gets
their lock. To do that, we have to refactor related functions.v5-0004. Event trigger support reindex command implementation,
documentation, regress test, helper function pass reindex info to
function EventTriggerCollectSimpleCommand.
I just started reviewing the patch. Some minor comments:
In patch 0001:
In standard_ProcessUtility(), since you are unconditionally calling
ProcessUtilitySlow() in case of T_ReindexStmt, you really don't need
the case statement for T_ReindexStmt just like all the other commands
which have event trigger support. It will call ProcessUtilitySlow() as
default.
In patch 0004:
No need to duplicate reindex_event_trigger_collect in indexcmds.c
since it is already present in index.c. Add it to index.h and make the
function extern so that it can be accessed in both index.c and
indexcmds.c
regards,
Ajin Cherian
Fujitsu Australia