Foreign key performance

Started by Kevin Brownover 22 years ago16 messages
#1Kevin Brown
kevin@sysexperts.com

I'm using 7.3.2 on Linux, with a decent amount of muscle behind it
(1.5 GHz PPro CPU, 1G mem, 20M/sec disks, xlog on different disk than
data).

I've got a database that has several foreign keys, and I'm copying a
bunch of data from an MS-SQL server into it via Perl DBI. I noticed
that inserts into this database are very slow, on the order of 100 per
second on this hardware. All the inserts are happening in a single
transaction. The postmaster I'm connected to appears to be CPU
limited, as it's pegging the CPU at a constant 85 percent or more.

I have no problem with that under normal circumstances (i.e., the
foreign key constraints are actively being enforced): it may well be
the nature of foreign keys, but the problem is this: all the keys are
DEFERRABLE INITIALLY DEFERRED and, on top of that, the Perl program
will SET CONSTRAINTS ALL DEFERRED at the beginning of the transaction.

If I remove all the foreign key constraints, my performance goes up to
700 inserts per second!

Why isn't the insert performance with all the constraints deferred
approximating that of the performance I get without the foreign keys??
If anything, I should get a big delay at transaction commit time while
all the foreign key constraints are checked (and, indeed, I get that
too), but the performance during the transaction prior to the commit
should be the same as it is without the foreign key constraints.

It's almost as if the foreign key constraints are being invoked and
the results ignored during the inserts...

In essence, this smells like a bug to me, but I don't know enough
about the internals to really call it that.

Any ideas on what can be done about this?

--
Kevin Brown kevin@sysexperts.com

#2Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Kevin Brown (#1)
Re: [PERFORM] Foreign key performance

On Thu, 17 Apr 2003, Kevin Brown wrote:

I have no problem with that under normal circumstances (i.e., the
foreign key constraints are actively being enforced): it may well be
the nature of foreign keys, but the problem is this: all the keys are
DEFERRABLE INITIALLY DEFERRED and, on top of that, the Perl program
will SET CONSTRAINTS ALL DEFERRED at the beginning of the transaction.

If I remove all the foreign key constraints, my performance goes up to
700 inserts per second!

Why isn't the insert performance with all the constraints deferred
approximating that of the performance I get without the foreign keys??

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

On batch inserts to three tables each with a foreign key to a table
containing one row (and inserts of lots of that value), I saw a ratio of
approximately 1:1.7:7 for normal inserts:non-deferred fk:deferred fk on my
7.4 dev server.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#2)
Re: [PERFORM] Foreign key performance

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

At one time the deferred-trigger queue had an O(N^2) behavioral problem
for large N = number of pending trigger events. But I thought we'd
fixed that. What's the test case exactly? Can you get a profile with
gprof?

regards, tom lane

#4Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#3)
Re: [PERFORM] Foreign key performance

On Fri, 18 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

At one time the deferred-trigger queue had an O(N^2) behavioral problem
for large N = number of pending trigger events. But I thought we'd
fixed that. What's the test case exactly? Can you get a profile with
gprof?

I'm going to tomorrow hopefully - but it looks to me that we fixed one, but
possibly not another place where we read through the list unnecessarily
AFAICS. I think deferredTriggerInvokeEvents (when called with
immediate_only = true) is going to go through the entire list looking for
immediate triggers to fire after each statement. However, excepting set
constraints, any immediate triggers for any event added prior to this
statement will by necessity have already been run unless I'm missing
something, which means that we're often looking through entries that
aren't going to have any triggers to run now in any case.

Keeping a pointer to the end of the list as of last statement and going
through the list from there cut the time for the deferred case in half in
my simple test (about 3.3x the no fk and just under 2x the immediate).

#5Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Stephan Szabo (#4)
Re: [PERFORM] Foreign key performance

[Not sure this really is relevant for -performance at this point]

On Thu, 17 Apr 2003, Stephan Szabo wrote:

On Fri, 18 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

At one time the deferred-trigger queue had an O(N^2) behavioral problem
for large N = number of pending trigger events. But I thought we'd
fixed that. What's the test case exactly? Can you get a profile with
gprof?

I'm going to tomorrow hopefully - but it looks to me that we fixed one, but

Argh. I'm getting that state where gprof returns all 0s for times. I'm
pretty sure this has come up before along with how to get it to work, but
I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

In any case, the call list seemed reasonable. It's currently doing O(n^2)
calls to MemoryContextReset and deferredTriggerCheckState in InvokeEvents
I don't see anything else that's at that kind of number of calls (50
million calls for a backend that's only done 10000 inserts stands out a
bit). Going only from last statement seems to make it linear (I think my
attempt is checking 1 too many trigger values, need to change that
probably).

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#5)
Re: [PERFORM] Foreign key performance

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Argh. I'm getting that state where gprof returns all 0s for times. I'm
pretty sure this has come up before along with how to get it to work, but
I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

You're on Linux? You need to compile postmaster.c with -DLINUX_PROFILE.
But the call counts do sound pretty damning.

regards, tom lane

#7Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#6)
Re: [PERFORM] Foreign key performance

On Fri, 18 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Argh. I'm getting that state where gprof returns all 0s for times. I'm
pretty sure this has come up before along with how to get it to work, but
I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

You're on Linux? You need to compile postmaster.c with -DLINUX_PROFILE.

Yep, thanks. :)

But the call counts do sound pretty damning.

Yeah, but even with my hack last night it was still appreciably slower
than immediate constraints. Comparing the call counts in that function
for the immediate versus deferred(hacked) weren't giving me a good idea of
where that time was going.

#8Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Stephan Szabo (#7)
1 attachment(s)
Re: [PERFORM] Foreign key performance

On Fri, 18 Apr 2003, Stephan Szabo wrote:

On Fri, 18 Apr 2003, Tom Lane wrote:

But the call counts do sound pretty damning.

Yeah, but even with my hack last night it was still appreciably slower
than immediate constraints. Comparing the call counts in that function
for the immediate versus deferred(hacked) weren't giving me a good idea of
where that time was going.

This last was due to assert checking I think. AllocSetCheck was the big
time waster on the hacked deferred case. Turning off assert checking I
get:

Median over 3 100000 inserts in one transaction (excepting the original
code which took a really long time so I ran it once) from time psql ...

No Fk 24.14s
Immediate FK 42.80s
Original Deferred FK 1862.06s
Hacked Deferred FK 35.30s

The hack was just the keeping around the list pointer from the last run
through (see attached - passed simple fk tests and regression, but there
might be problems I don't see). Looking at the code, I also wonder if we
would get some gain by not allocating the per_tuple_context at the
beginning but only when a non-deferred constraint is found since otherwise
we're creating and destroying the context and possibly never using it.
The cost would presumably be some boolean tests inside the inner loop.

Attachments:

trigger.patchtext/plain; charset=US-ASCII; name=trigger.patchDownload
*** src/backend/commands/trigger.c	2003-04-19 12:41:24.000000000 -0700
--- src/backend/commands/trigger.hack1.c	2003-04-19 13:41:17.000000000 -0700
***************
*** 1626,1637 ****
--- 1626,1643 ----
   * Because this can grow pretty large, we don't use separate List nodes,
   * but instead thread the list through the dte_next fields of the member
   * nodes.  Saves just a few bytes per entry, but that adds up.
+  * 
+  * deftrig_events_imm holds the tail pointer as of the last 
+  * deferredTriggerInvokeEvents call, since generally immediate triggers
+  * can't exist before this point (since they should already have been
+  * called) excepting SET CONSTRAINTS
   *
   * XXX Need to be able to shove this data out to a file if it grows too
   *	   large...
   * ----------
   */
  static DeferredTriggerEvent deftrig_events;
+ static DeferredTriggerEvent deftrig_events_imm;
  static DeferredTriggerEvent deftrig_event_tail;
  
  
***************
*** 1844,1850 ****
  static void
  deferredTriggerInvokeEvents(bool immediate_only)
  {
! 	DeferredTriggerEvent event,
  				prev_event = NULL;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
--- 1850,1856 ----
  static void
  deferredTriggerInvokeEvents(bool immediate_only)
  {
! 	DeferredTriggerEvent event = NULL,
  				prev_event = NULL;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
***************
*** 1869,1880 ****
  	/* Make a per-tuple memory context for trigger function calls */
  	per_tuple_context =
  		AllocSetContextCreate(CurrentMemoryContext,
! 							  "DeferredTriggerTupleContext",
! 							  ALLOCSET_DEFAULT_MINSIZE,
! 							  ALLOCSET_DEFAULT_INITSIZE,
! 							  ALLOCSET_DEFAULT_MAXSIZE);
  
- 	event = deftrig_events;
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
--- 1875,1897 ----
  	/* Make a per-tuple memory context for trigger function calls */
  	per_tuple_context =
  		AllocSetContextCreate(CurrentMemoryContext,
! 			  "DeferredTriggerTupleContext",
! 			  ALLOCSET_DEFAULT_MINSIZE,
! 			  ALLOCSET_DEFAULT_INITSIZE,
! 			  ALLOCSET_DEFAULT_MAXSIZE);
! 
! 	/*
! 	 * If immediate_only is true, then the only events that could have fired
! 	 * are those since deftrig_events_imm.  
! 	 * The one exception to only since deftrig_events_imm would be 
! 	 * SET CONSTRAINTS ... IMMEDIATE so we set it to NULL in there
! 	 * and treat NULL as being equivalent to the entire list.
! 	 */
! 	if (immediate_only) 
! 		event = deftrig_events_imm;
! 	if (event == NULL)
! 		event = deftrig_events;
  
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
***************
*** 1993,2004 ****
--- 2010,2025 ----
  	/* Update list tail pointer in case we just deleted tail event */
  	deftrig_event_tail = prev_event;
  
+ 	/* Set the immediate event pointer to the tail */
+ 	deftrig_events_imm = prev_event;
+ 
  	/* Release working resources */
  	if (rel)
  		heap_close(rel, NoLock);
  	FreeTriggerDesc(trigdesc);
  	if (finfo)
  		pfree(finfo);
+ 
  	MemoryContextDelete(per_tuple_context);
  }
  
***************
*** 2274,2279 ****
--- 2295,2308 ----
  	}
  
  	/*
+ 	 * Reset the pointer for the immediate constraints to NULL (ie
+ 	 * use beginning of list).  This is really only necessary for
+ 	 * dealing when setting to IMMEDIATE, and could really point
+ 	 * the first constraint we changed, but for now this will do.
+ 	 */
+ 	deftrig_events_imm = NULL;
+ 
+ 	/*
  	 * SQL99 requires that when a constraint is set to IMMEDIATE, any
  	 * deferred checks against that constraint must be made when the SET
  	 * CONSTRAINTS command is executed -- i.e. the effects of the SET
#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#8)
Re: [PERFORM] Foreign key performance

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

The hack was just the keeping around the list pointer from the last run
through (see attached - passed simple fk tests and regression, but there
might be problems I don't see).

Shouldn't this patch update the comment in deferredTriggerInvokeEvents
(c. line 1860 in cvs tip)?

Looking at the code, I also wonder if we
would get some gain by not allocating the per_tuple_context at the
beginning but only when a non-deferred constraint is found since otherwise
we're creating and destroying the context and possibly never using it.

I doubt it's worth worrying over. Creation/destruction of a never-used
memory context is pretty cheap, I think.

regards, tom lane

#10Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Tom Lane (#9)
Re: [PERFORM] Foreign key performance

On Sat, 19 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

The hack was just the keeping around the list pointer from the last run
through (see attached - passed simple fk tests and regression, but there
might be problems I don't see).

Shouldn't this patch update the comment in deferredTriggerInvokeEvents
(c. line 1860 in cvs tip)?

Probably, since the second part of that is basically what this is. I'll
update and send updated patch tomorrow.

Looking at the code, I also wonder if we
would get some gain by not allocating the per_tuple_context at the
beginning but only when a non-deferred constraint is found since otherwise
we're creating and destroying the context and possibly never using it.

I doubt it's worth worrying over. Creation/destruction of a never-used
memory context is pretty cheap, I think.

Okay, sounds good enough for me. :)

#11Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Stephan Szabo (#10)
1 attachment(s)
Re: [PERFORM] Foreign key performance

On Sat, 19 Apr 2003, Stephan Szabo wrote:

On Sat, 19 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

The hack was just the keeping around the list pointer from the last run
through (see attached - passed simple fk tests and regression, but there
might be problems I don't see).

Shouldn't this patch update the comment in deferredTriggerInvokeEvents
(c. line 1860 in cvs tip)?

Probably, since the second part of that is basically what this is. I'll
update and send updated patch tomorrow.

Okay, this changes the second paragraph of that comment. I left in the
comment that's really similar next to where I actually do the selection of
which start point to use.

Attachments:

trigger.patchtext/plain; charset=US-ASCII; name=trigger.patchDownload
Index: src/backend/commands/trigger.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/backend/commands/trigger.c,v
retrieving revision 1.147
diff -c -r1.147 trigger.c
*** src/backend/commands/trigger.c	2003/03/31 20:47:51	1.147
--- src/backend/commands/trigger.c	2003/04/20 16:01:55
***************
*** 1626,1637 ****
--- 1626,1643 ----
   * Because this can grow pretty large, we don't use separate List nodes,
   * but instead thread the list through the dte_next fields of the member
   * nodes.  Saves just a few bytes per entry, but that adds up.
+  * 
+  * deftrig_events_imm holds the tail pointer as of the last 
+  * deferredTriggerInvokeEvents call, since generally immediate triggers
+  * can't exist before this point (since they should already have been
+  * called) excepting SET CONSTRAINTS
   *
   * XXX Need to be able to shove this data out to a file if it grows too
   *	   large...
   * ----------
   */
  static DeferredTriggerEvent deftrig_events;
+ static DeferredTriggerEvent deftrig_events_imm;
  static DeferredTriggerEvent deftrig_event_tail;
  
  
***************
*** 1844,1850 ****
  static void
  deferredTriggerInvokeEvents(bool immediate_only)
  {
! 	DeferredTriggerEvent event,
  				prev_event = NULL;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
--- 1850,1856 ----
  static void
  deferredTriggerInvokeEvents(bool immediate_only)
  {
! 	DeferredTriggerEvent event = NULL,
  				prev_event = NULL;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
***************
*** 1857,1880 ****
  	 * are going to discard the whole event queue on return anyway, so no
  	 * need to bother with "retail" pfree's.
  	 *
! 	 * In a scenario with many commands in a transaction and many
! 	 * deferred-to-end-of-transaction triggers, it could get annoying to
! 	 * rescan all the deferred triggers at each command end. To speed this
! 	 * up, we could remember the actual end of the queue at EndQuery and
! 	 * examine only events that are newer. On state changes we simply
! 	 * reset the saved position to the beginning of the queue and process
! 	 * all events once with the new states.
  	 */
  
  	/* Make a per-tuple memory context for trigger function calls */
  	per_tuple_context =
  		AllocSetContextCreate(CurrentMemoryContext,
! 							  "DeferredTriggerTupleContext",
! 							  ALLOCSET_DEFAULT_MINSIZE,
! 							  ALLOCSET_DEFAULT_INITSIZE,
! 							  ALLOCSET_DEFAULT_MAXSIZE);
  
- 	event = deftrig_events;
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
--- 1863,1894 ----
  	 * are going to discard the whole event queue on return anyway, so no
  	 * need to bother with "retail" pfree's.
  	 *
! 	 * In immediate_only is true, we only check from the end of the queue
! 	 * at the previous InvokeEvents and examine newer events. On state 
! 	 * changes we simply reset the saved position to the beginning of 
! 	 * the queue and process all events once with the new states.
  	 */
  
  	/* Make a per-tuple memory context for trigger function calls */
  	per_tuple_context =
  		AllocSetContextCreate(CurrentMemoryContext,
! 			  "DeferredTriggerTupleContext",
! 			  ALLOCSET_DEFAULT_MINSIZE,
! 			  ALLOCSET_DEFAULT_INITSIZE,
! 			  ALLOCSET_DEFAULT_MAXSIZE);
! 
! 	/*
! 	 * If immediate_only is true, then the only events that could have fired
! 	 * are those since deftrig_events_imm.  
! 	 * The one exception to only since deftrig_events_imm would be 
! 	 * SET CONSTRAINTS ... IMMEDIATE so we set it to NULL in there
! 	 * and treat NULL as being equivalent to the entire list.
! 	 */
! 	if (immediate_only) 
! 		event = deftrig_events_imm;
! 	if (event == NULL)
! 		event = deftrig_events;
  
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
***************
*** 1993,2004 ****
--- 2007,2022 ----
  	/* Update list tail pointer in case we just deleted tail event */
  	deftrig_event_tail = prev_event;
  
+ 	/* Set the immediate event pointer to the tail */
+ 	deftrig_events_imm = prev_event;
+ 
  	/* Release working resources */
  	if (rel)
  		heap_close(rel, NoLock);
  	FreeTriggerDesc(trigdesc);
  	if (finfo)
  		pfree(finfo);
+ 
  	MemoryContextDelete(per_tuple_context);
  }
  
***************
*** 2272,2277 ****
--- 2290,2303 ----
  
  		MemoryContextSwitchTo(oldcxt);
  	}
+ 
+ 	/*
+ 	 * Reset the pointer for the immediate constraints to NULL (ie
+ 	 * use beginning of list).  This is really only necessary for
+ 	 * dealing when setting to IMMEDIATE, and could really point
+ 	 * the first constraint we changed, but for now this will do.
+ 	 */
+ 	deftrig_events_imm = NULL;
  
  	/*
  	 * SQL99 requires that when a constraint is set to IMMEDIATE, any
#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#11)
Re: [PERFORM] Foreign key performance

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Okay, this changes the second paragraph of that comment. I left in the
comment that's really similar next to where I actually do the selection of
which start point to use.

This had a bit of a problem yet: the loop in deferredTriggerInvokeEvents
expects 'prev_event' to point to the list entry just before 'event'.
A nice byproduct of fixing that is we don't uselessly rescan the last list
entry. I also tried to improve the comments a little. You can see what
I actually applied at
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/trigger.c

regards, tom lane

#13Kevin Brown
kevin@sysexperts.com
In reply to: Tom Lane (#12)
Re: [PERFORM] Foreign key performance

Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

Okay, this changes the second paragraph of that comment. I left in the
comment that's really similar next to where I actually do the selection of
which start point to use.

This had a bit of a problem yet: the loop in deferredTriggerInvokeEvents
expects 'prev_event' to point to the list entry just before 'event'.
A nice byproduct of fixing that is we don't uselessly rescan the last list
entry. I also tried to improve the comments a little. You can see what
I actually applied at

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/commands/trigger.c

Any chance of backporting these changes to 7_3_STABLE (when you're
satisfied they don't break anything)? Just looking at the CVS log for
trigger.c, it appears there have been enough changes since then that
it might not be easy to do (and since it's not necessarily a "bug fix"
as such, it might not qualify for backporting to a stable version).

Even if it's not something that can be put into another release of
7.3, it would certainly be useful to me. It might be useful to enough
people to justify releasing it as a patch on -patches, if nothing
else.

I'd do it myself but I don't understand the trigger code at all (and
if there's any documentation you can point me to that describes the
various functions and supporting data structures in trigger.c, that
would help a lot), and I'd rather not touch something like that until
I understand it thoroughly.

--
Kevin Brown kevin@sysexperts.com

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Brown (#13)
Re: [PERFORM] Foreign key performance

Kevin Brown <kevin@sysexperts.com> writes:

Any chance of backporting these changes to 7_3_STABLE (when you're
satisfied they don't break anything)? Just looking at the CVS log for
trigger.c, it appears there have been enough changes since then that
it might not be easy to do (and since it's not necessarily a "bug fix"
as such, it might not qualify for backporting to a stable version).

I'd be pretty hesitant to make such a change in the stable branch ---
at least not without a lot of testing. If you and others want to
provide such testing, go to it. The patch appears to apply cleanly
enough to 7.3, but here's an adjusted patch if fuzz makes you nervous...

regards, tom lane

*** trigger.c~	Sun Apr 20 20:28:55 2003
--- trigger.c	Sun Apr 20 20:29:13 2003
***************
*** 1461,1472 ****
--- 1461,1478 ----
   * Because this can grow pretty large, we don't use separate List nodes,
   * but instead thread the list through the dte_next fields of the member
   * nodes.  Saves just a few bytes per entry, but that adds up.
+  * 
+  * deftrig_events_imm holds the tail pointer as of the last 
+  * deferredTriggerInvokeEvents call; we can use this to avoid rescanning
+  * entries unnecessarily.  It is NULL if deferredTriggerInvokeEvents
+  * hasn't run since the last state change.
   *
   * XXX Need to be able to shove this data out to a file if it grows too
   *	   large...
   * ----------
   */
  static DeferredTriggerEvent deftrig_events;
+ static DeferredTriggerEvent deftrig_events_imm;
  static DeferredTriggerEvent deftrig_event_tail;
***************
*** 1680,1686 ****
  deferredTriggerInvokeEvents(bool immediate_only)
  {
  	DeferredTriggerEvent event,
! 				prev_event = NULL;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
  	TriggerDesc *trigdesc = NULL;
--- 1686,1692 ----
  deferredTriggerInvokeEvents(bool immediate_only)
  {
  	DeferredTriggerEvent event,
! 				prev_event;
  	MemoryContext per_tuple_context;
  	Relation	rel = NULL;
  	TriggerDesc *trigdesc = NULL;
***************
*** 1692,1704 ****
  	 * are going to discard the whole event queue on return anyway, so no
  	 * need to bother with "retail" pfree's.
  	 *
! 	 * In a scenario with many commands in a transaction and many
! 	 * deferred-to-end-of-transaction triggers, it could get annoying to
! 	 * rescan all the deferred triggers at each command end. To speed this
! 	 * up, we could remember the actual end of the queue at EndQuery and
! 	 * examine only events that are newer. On state changes we simply
! 	 * reset the saved position to the beginning of the queue and process
! 	 * all events once with the new states.
  	 */
  	/* Make a per-tuple memory context for trigger function calls */
--- 1698,1709 ----
  	 * are going to discard the whole event queue on return anyway, so no
  	 * need to bother with "retail" pfree's.
  	 *
! 	 * If immediate_only is true, we need only scan from where the end of
! 	 * the queue was at the previous deferredTriggerInvokeEvents call;
! 	 * any non-deferred events before that point are already fired.
! 	 * (But if the deferral state changes, we must reset the saved position
! 	 * to the beginning of the queue, so as to process all events once with
! 	 * the new states.  See DeferredTriggerSetState.)
  	 */

/* Make a per-tuple memory context for trigger function calls */
***************
*** 1709,1715 ****
ALLOCSET_DEFAULT_INITSIZE,
ALLOCSET_DEFAULT_MAXSIZE);

! 	event = deftrig_events;
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
--- 1714,1735 ----
  							  ALLOCSET_DEFAULT_INITSIZE,
  							  ALLOCSET_DEFAULT_MAXSIZE);
! 	/*
! 	 * If immediate_only is true, then the only events that could need firing
! 	 * are those since deftrig_events_imm.  (But if deftrig_events_imm is
! 	 * NULL, we must scan the entire list.)
! 	 */
! 	if (immediate_only && deftrig_events_imm != NULL)
! 	{
! 		prev_event = deftrig_events_imm;
! 		event = prev_event->dte_next;
! 	}
! 	else
! 	{
! 		prev_event = NULL;
! 		event = deftrig_events;
! 	}
! 
  	while (event != NULL)
  	{
  		bool		still_deferred_ones = false;
***************
*** 1830,1835 ****
--- 1850,1858 ----
  	/* Update list tail pointer in case we just deleted tail event */
  	deftrig_event_tail = prev_event;
+ 	/* Set the immediate event pointer for next time */
+ 	deftrig_events_imm = prev_event;
+ 
  	/* Release working resources */
  	if (rel)
  		heap_close(rel, NoLock);
***************
*** 1917,1922 ****
--- 1940,1946 ----
  	MemoryContextSwitchTo(oldcxt);

deftrig_events = NULL;
+ deftrig_events_imm = NULL;
deftrig_event_tail = NULL;
}

***************
*** 2146,2153 ****
* CONSTRAINTS command applies retroactively. This happens "for free"
* since we have already made the necessary modifications to the
* constraints, and deferredTriggerEndQuery() is called by
! * finish_xact_command().
*/
}

--- 2170,2180 ----
  	 * CONSTRAINTS command applies retroactively. This happens "for free"
  	 * since we have already made the necessary modifications to the
  	 * constraints, and deferredTriggerEndQuery() is called by
! 	 * finish_xact_command().  But we must reset deferredTriggerInvokeEvents'
! 	 * tail pointer to make it rescan the entire list, in case some deferred
! 	 * events are now immediately invokable.
  	 */
+ 	deftrig_events_imm = NULL;
  }
#15Kevin Brown
kevin@sysexperts.com
In reply to: Tom Lane (#14)
Re: [PERFORM] Foreign key performance

Tom Lane wrote:

Kevin Brown <kevin@sysexperts.com> writes:

Any chance of backporting these changes to 7_3_STABLE (when you're
satisfied they don't break anything)? Just looking at the CVS log for
trigger.c, it appears there have been enough changes since then that
it might not be easy to do (and since it's not necessarily a "bug fix"
as such, it might not qualify for backporting to a stable version).

I'd be pretty hesitant to make such a change in the stable branch ---
at least not without a lot of testing. If you and others want to
provide such testing, go to it. The patch appears to apply cleanly
enough to 7.3, but here's an adjusted patch if fuzz makes you
nervous...

Thanks, Tom. I've applied the patch to my server and it has so far
passed the few tests I've thrown at it so far (it has detected foreign
key violations in both immediate and deferred trigger mode). And just
so you know, it performs FAR better than the pre-patched version does
-- in the overall transaction I'm doing, I see very little difference
now between deferred triggers and no triggers!

--
Kevin Brown kevin@sysexperts.com

#16Jan Wieck
JanWieck@Yahoo.com
In reply to: Stephan Szabo (#5)
Re: [PERFORM] Foreign key performance

Stephan Szabo wrote:

[Not sure this really is relevant for -performance at this point]

On Thu, 17 Apr 2003, Stephan Szabo wrote:

On Fri, 18 Apr 2003, Tom Lane wrote:

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:

It appears (from some not terribly scientific experiments - see below)
that it's likely to be related to managing the deferred trigger queue
given that in my case at least running the constraints non-deferred was
negligible in comparison.

At one time the deferred-trigger queue had an O(N^2) behavioral problem
for large N = number of pending trigger events. But I thought we'd
fixed that. What's the test case exactly? Can you get a profile with
gprof?

I'm going to tomorrow hopefully - but it looks to me that we fixed one, but

Argh. I'm getting that state where gprof returns all 0s for times. I'm
pretty sure this has come up before along with how to get it to work, but
I couldn't find it in the archives. Someday I'll learn how to use gprof. :(

You have to save and restore the timers around the fork() under Linux.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck@Yahoo.com #