Compressing the AFTER TRIGGER queue

Started by Dean Rasheed over 14 years ago · 21 messages · pgsql-hackers
#1 Dean Rasheed
dean.a.rasheed@gmail.com

I've been thinking some more about the long-standing problem of the
AFTER TRIGGER queue using too much memory, and I think that the
situation can be improved by using some basic compression.

Currently each event added to the AFTER TRIGGER queue uses 10 bytes
per trigger per row for INSERTs and DELETEs, and 16 for UPDATEs. The
attached patch reduces this down to just 1 byte in a number of common
cases.

It works by optimising for the case where an event's CTID is the same
as, or close to, the previous event's CTID, and similarly for the
event's pointer to the shared trigger data (AfterTriggerSharedData).

If there is only 1 trigger on the table, it will use 1 byte for most
INSERTed rows. For UPDATEs and DELETEs, the space needed will vary
according to the table's layout and traversal order. Given a not too
fragmented table, and a sequential or bitmap heap scan, it should use
1-2 bytes per row. This will increase for more random traversal orders
- the theoretical maximum is 7 bytes per row, but that would take a
pretty pathological case.
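
To illustrate the idea (this is only a toy sketch, not the patch's
actual code; `SketchCtid` and `ctid_delta_encode` are made-up names),
each event's CTID can be stored as a zigzag varint delta from the
previous event's CTID, so a run of adjacent tuples costs one byte per
event:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for ItemPointerData: (block number, line pointer offset) */
typedef struct { uint32_t block; uint16_t offset; } SketchCtid;

/*
 * Encode cur as a variable-length delta from prev into buf;
 * returns the number of bytes written (1..10).
 */
static int
ctid_delta_encode(const SketchCtid *prev, const SketchCtid *cur, uint8_t *buf)
{
    /* Flatten (block, offset) so nearby CTIDs give small deltas */
    int64_t  delta = ((int64_t) cur->block - (int64_t) prev->block) * 65536 +
                     ((int64_t) cur->offset - (int64_t) prev->offset);
    /* Zigzag-map the signed delta so small magnitudes get small codes */
    uint64_t zz = ((uint64_t) delta << 1) ^ (uint64_t) (delta >> 63);
    int      n = 0;

    do
    {
        uint8_t b = zz & 0x7F;          /* 7 payload bits per byte */

        zz >>= 7;
        buf[n++] = (uint8_t) (b | (zz ? 0x80 : 0)); /* high bit = "more" */
    } while (zz);
    return n;
}
```

With a sequential heap traversal the delta is usually just the next
line pointer or a small block step, so the common cases land in a
single byte, which is where the 1-2 bytes per row figure above comes
from.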

If there are multiple triggers firing for each row, then each
additional trigger event will take 1 byte, provided the triggers fire
unconditionally (no WHERE clauses). If the triggers fire more
randomly, due to WHERE clauses, then some of the trigger events may
take 2 bytes each (for up to 256 triggers on the table) or 3 bytes
each if there are more triggers than that.

The compression code is pretty simple, so it should only cost a few
additional cycles. In my testing so far, it seems to make no
detectable difference to performance (actually it seems to be 2-3%
faster for a 50M row INSERT, presumably because the reduced memory
usage leads to fewer cache misses).

For comparison, saving a 1MB AfterTriggerEventChunk produced by the
current code to a file and compressing it with gzip gives a
compression ratio of around 4:1, and bzip2 gives around 8:1. I've not
tried the pg_lzcompress code, but at first glance it is hard to
imagine that it would be as good as gzip, and it is not so suited to
on-the-fly compression/decompression.

It still needs more testing, but the initial results are encouraging.

Thoughts?

Regards,
Dean

Attachments:

after-triggers.patch (application/octet-stream, +495/-211)
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#1)
Re: Compressing the AFTER TRIGGER queue

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

I've been thinking some more about the long-standing problem of the
AFTER TRIGGER queue using too much memory, and I think that the
situation can be improved by using some basic compression.

Currently each event added to the AFTER TRIGGER queue uses 10 bytes
per trigger per row for INSERTs and DELETEs, and 16 for UPDATEs. The
attached patch reduces this down to just 1 byte in a number of common
cases.

Ummm ... I only read the data structure comments, not the code, but I
don't see where you store the second CTID for an update event?

regards, tom lane

#3 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#2)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 17:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

I've been thinking some more about the long-standing problem of the
AFTER TRIGGER queue using too much memory, and I think that the
situation can be improved by using some basic compression.

Currently each event added to the AFTER TRIGGER queue uses 10 bytes
per trigger per row for INSERTs and DELETEs, and 16 for UPDATEs. The
attached patch reduces this down to just 1 byte in a number of common
cases.

Ummm ... I only read the data structure comments, not the code, but I
don't see where you store the second CTID for an update event?

Ah yes, I forgot to mention that bit. I'm using
&(tuple1.t_data->t_ctid) to get the second CTID from the old tuple. Is
that safe? As far as I could see, this would match the new tuple after
a heap update.

Regards,
Dean

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#3)
Re: Compressing the AFTER TRIGGER queue

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On 1 August 2011 17:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ummm ... I only read the data structure comments, not the code, but I
don't see where you store the second CTID for an update event?

Ah yes, I forgot to mention that bit. I'm using
&(tuple1.t_data->t_ctid) to get the second CTID from the old tuple. Is
that safe?

Hmmmm ... not sure. It seems a bit scary, but on the other hand we
should be able to assume that the updating subtransaction hasn't been
rolled back (else surely we shouldn't be firing the trigger). So in
principle it seems like the t_ctid link can't have been replaced.
This will foreclose any ideas about collapsing t_ctid link chains,
if anyone had it in mind to do that.

regards, tom lane

#5 Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: Compressing the AFTER TRIGGER queue

On Mon, Aug 1, 2011 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On 1 August 2011 17:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ummm ... I only read the data structure comments, not the code, but I
don't see where you store the second CTID for an update event?

Ah yes, I forgot to mention that bit. I'm using
&(tuple1.t_data->t_ctid) to get the second CTID from the old tuple. Is
that safe?

Hmmmm ... not sure.  It seems a bit scary, but on the other hand we
should be able to assume that the updating subtransaction hasn't been
rolled back (else surely we shouldn't be firing the trigger).  So in
principle it seems like the t_ctid link can't have been replaced.
This will foreclose any ideas about collapsing t_ctid link chains,
if anyone had it in mind to do that.

Don't we already do that when pruning HOT chains?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Robert Haas (#5)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 18:36, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Aug 1, 2011 at 1:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On 1 August 2011 17:49, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ummm ... I only read the data structure comments, not the code, but I
don't see where you store the second CTID for an update event?

Ah yes, I forgot to mention that bit. I'm using
&(tuple1.t_data->t_ctid) to get the second CTID from the old tuple. Is
that safe?

Hmmmm ... not sure.  It seems a bit scary, but on the other hand we
should be able to assume that the updating subtransaction hasn't been
rolled back (else surely we shouldn't be firing the trigger).  So in
principle it seems like the t_ctid link can't have been replaced.
This will foreclose any ideas about collapsing t_ctid link chains,
if anyone had it in mind to do that.

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Regards,
Dean

#7 Robert Haas
robertmhaas@gmail.com
In reply to: Dean Rasheed (#6)
Re: Compressing the AFTER TRIGGER queue

On Mon, Aug 1, 2011 at 1:42 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Hmmmm ... not sure.  It seems a bit scary, but on the other hand we
should be able to assume that the updating subtransaction hasn't been
rolled back (else surely we shouldn't be firing the trigger).  So in
principle it seems like the t_ctid link can't have been replaced.
This will foreclose any ideas about collapsing t_ctid link chains,
if anyone had it in mind to do that.

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Hmm, true.

I worry a bit that this might foreclose possible future optimization
of the "self update" case, which is a known pain point. Am I wrong to
worry?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#7)
Re: Compressing the AFTER TRIGGER queue

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 1, 2011 at 1:42 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Hmm, true.

I worry a bit that this might foreclose possible future optimization
of the "self update" case, which is a known pain point. Am I wrong to
worry?

I think it might be OK if you explicitly verify that xmin/cmin of the
linked-to tuple matches the (sub)transaction/command that queued the
trigger event. I don't recall whether the trigger code does that
already; I think there is some related test but it might not be that
strict.

There's also a definitional issue involved: if a transaction updates the
same tuple twice, in the presence of a deferred update trigger for the
table, is it supposed to (eventually) fire the trigger for both update
actions or only the last one? I have a feeling we might already be
locked into the second choice, but if not, this would probably force it.

regards, tom lane

#9 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#8)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 18:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 1, 2011 at 1:42 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Hmm, true.

I worry a bit that this might foreclose possible future optimization
of the "self update" case, which is a known pain point.  Am I wrong to
worry?

I think it might be OK if you explicitly verify that xmin/cmin of the
linked-to tuple matches the (sub)transaction/command that queued the
trigger event.  I don't recall whether the trigger code does that
already; I think there is some related test but it might not be that
strict.

There's also a definitional issue involved: if a transaction updates the
same tuple twice, in the presence of a deferred update trigger for the
table, is it supposed to (eventually) fire the trigger for both update
actions or only the last one?  I have a feeling we might already be
locked into the second choice, but if not, this would probably force it.

Do you mean this sort of case:

DROP TABLE IF EXISTS foo;
CREATE TABLE foo(a int);
INSERT INTO foo VALUES(1);

CREATE OR REPLACE FUNCTION foo_trig_fn() RETURNS trigger AS
$$
BEGIN
    RAISE NOTICE 'In foo_trig_fn(): old.a=%, new.a=%', old.a, new.a;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER foo_trig AFTER UPDATE ON foo
    DEFERRABLE INITIALLY DEFERRED
    FOR EACH ROW EXECUTE PROCEDURE foo_trig_fn();

BEGIN;
UPDATE foo SET a=a+1;
UPDATE foo SET a=a+1;
COMMIT;

In this case we currently fire the trigger twice (once for each
update) when the transaction commits, and the new code behaves the
same. So at commit time you get:

NOTICE: In foo_trig_fn(): old.a=1, new.a=2
NOTICE: In foo_trig_fn(): old.a=2, new.a=3

Thinking back to the deferred PK checking trigger, I thought that in
this sort of case it is the trigger's responsibility to check that the
tuple(s) it is given are not dead.

Regards,
Dean

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#9)
Re: Compressing the AFTER TRIGGER queue

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On 1 August 2011 18:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 1, 2011 at 1:42 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Hmm, true.

On reflection I think we're probably worrying over nothing. The only
way that the t_ctid link might have gotten changed (given that the
original updating subtransaction hasn't been aborted) is if we were
willing to prune an inserted-and-then-deleted-in-same-xact tuple out of
the t_ctid link chain while its parent xact is still running. However,
it is unsafe to do that unless you know for certain that the xact in
question has no snapshots that could see the tuple as live. VACUUM
certainly doesn't know that, and neither does pruneheap.c, and I don't
really foresee us introducing enough bookkeeping so that they would know
it. (If we did try to do that, we could handle the problem by
considering that queueing of trigger events introduces a quasi-snapshot
that requires the relevant tuples to remain available.)

So I think this is probably OK as long as it's adequately documented in
the code. Just for luck, though, we should add an xmin/cmin match check
to the code if there's not one already.

However, this means that Dean is not simply adding some self-contained
compression logic; he's eliminating information from the data structure
on the grounds that he can get it from elsewhere. I think that that
ought to be treated as a separate patch, and that the costs/benefits
of the "compressed" data structure ought to be evaluated relative to the
situation with the second CTID removed but the data structure not
otherwise changed.

regards, tom lane

#11 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#10)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 19:56, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

On 1 August 2011 18:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Aug 1, 2011 at 1:42 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

Don't we already do that when pruning HOT chains?

I thought that only happens after the transaction is committed, and
old enough, whereas the trigger code only needs to follow the chain in
the updating transaction.

Hmm, true.

On reflection I think we're probably worrying over nothing.  The only
way that the t_ctid link might have gotten changed (given that the
original updating subtransaction hasn't been aborted) is if we were
willing to prune an inserted-and-then-deleted-in-same-xact tuple out of
the t_ctid link chain while its parent xact is still running.  However,
it is unsafe to do that unless you know for certain that the xact in
question has no snapshots that could see the tuple as live.  VACUUM
certainly doesn't know that, and neither does pruneheap.c, and I don't
really foresee us introducing enough bookkeeping so that they would know
it.  (If we did try to do that, we could handle the problem by
considering that queueing of trigger events introduces a quasi-snapshot
that requires the relevant tuples to remain available.)

So I think this is probably OK as long as it's adequately documented in
the code.  Just for luck, though, we should add an xmin/cmin match check
to the code if there's not one already.

I don't see any such existing check.
Is this the same as checking that xmax/cmax on the old tuple matches
xmin/cmin on the linked-to tuple? If so, that's pretty
straightforward. Otherwise I'm not sure where the xmin/cmin to check
against would come from, without adding more information to the queues
somewhere.

However, this means that Dean is not simply adding some self-contained
compression logic; he's eliminating information from the data structure
on the grounds that he can get it from elsewhere.  I think that that
ought to be treated as a separate patch, and that the costs/benefits
of the "compressed" data structure ought to be evaluated relative to the
situation with the second CTID removed but the data structure not
otherwise changed.

OK, so I should split this into 2 patches?
Even without the compression, it's probably worth the 16 -> 10 byte
reduction that would result from removing the 2nd CTID in the UPDATE
case, and that part would be a pretty small patch.

Regards,
Dean

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dean Rasheed (#11)
Re: Compressing the AFTER TRIGGER queue

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

OK, so I should split this into 2 patches?
Even without the compression, it's probably worth the 16 -> 10 byte
reduction that would result from removing the 2nd CTID in the UPDATE
case, and that part would be a pretty small patch.

Yeah, my point exactly. The rest of it might or might not be worth the
extra complication.

regards, tom lane

#13 Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#10)
Re: Compressing the AFTER TRIGGER queue

On Mon, Aug 1, 2011 at 7:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, this means that Dean is not simply adding some self-contained
compression logic; he's eliminating information from the data structure
on the grounds that he can get it from elsewhere.  I think that that
ought to be treated as a separate patch, and that the costs/benefits
of the "compressed" data structure ought to be evaluated relative to the
situation with the second CTID removed but the data structure not
otherwise changed.

I would prefer an approach where we store the events in a buffer,
which gets added to the main event queue when it fills/block number
changes/etc. That way we can apply intelligence to the actual
compression format used, yet retain all required info in the queued
event. We might also be able to ensure the event queue is ordered more
naturally by block number, to ensure we have reasonable readahead
while re-traversing the event queue - it's not just the size of the
event queue that is the problem here.

I'm also nervous of compression formats that rely too heavily on
earlier data, so any corner case bug in one of the formats will screw
the whole data structure. I'd prefer a more modular approach that is
easier to test in regression tests.

The cases we have problems with are the bulk cases, so we should be
specifically optimising for that. For example, if we are running a
COPY/INSERT SELECT or a seqscan DELETE then we will queue a trigger
for every tuple in a block. The best compression and flexibility in
that case is to store a bitmap, since with variable-length bitmaps
that will average out at about 1 bit per row - roughly 8 times better
compression than originally suggested, without any loss of metadata.
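
A minimal sketch of that bitmap idea (a hypothetical layout, not code
from any posted patch): record the block number once, plus one bit per
line pointer in that block:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_OFFSETS 291          /* roughly MaxHeapTuplesPerPage for 8K pages */

/* One queue entry covering every affected tuple in a single heap block */
typedef struct
{
    uint32_t block;                        /* heap block number */
    uint8_t  bits[(MAX_OFFSETS + 7) / 8];  /* bit N-1 set => offset N queued */
} BlockEventBitmap;

static void
bitmap_add(BlockEventBitmap *m, uint16_t offnum)    /* offnum is 1-based */
{
    m->bits[(offnum - 1) / 8] |= (uint8_t) (1 << ((offnum - 1) % 8));
}

static int
bitmap_member(const BlockEventBitmap *m, uint16_t offnum)
{
    return (m->bits[(offnum - 1) / 8] >> ((offnum - 1) % 8)) & 1;
}
```

A fully populated block then costs about 37 bytes of bitmap for up to
291 tuples, i.e. about 1 bit per row, though it no longer records the
order in which the rows were touched.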

We should also optimise the HOT UPDATE case.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#14 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#12)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 20:53, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dean Rasheed <dean.a.rasheed@gmail.com> writes:

OK, so I should split this into 2 patches?
Even without the compression, it's probably worth the 16 -> 10 byte
reduction that would result from removing the 2nd CTID in the UPDATE
case, and that part would be a pretty small patch.

Yeah, my point exactly.  The rest of it might or might not be worth the
extra complication.

OK, here's a patch for the first bit - just removing the second CTID
in the UPDATE case, and including a sanity check of the new tuple's
xmin and cmin.

It passes all the regression tests. I also tested it by doing a 10M
row UPDATE x=x+1 on a deferrable PK, and it gave about the expected
reduction in memory usage, with no difference in run time.

I'll test out the additional compression separately.

Regards,
Dean

Attachments:

after-triggers-1.patch (application/octet-stream, +87/-62)
#15 Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Simon Riggs (#13)
Re: Compressing the AFTER TRIGGER queue

On 1 August 2011 21:02, Simon Riggs <simon@2ndquadrant.com> wrote:

I would prefer an approach where we store the events in a buffer,
which gets added to the main event queue when it fills/block number
changes/etc. That way we can apply intelligence to the actual
compression format used, yet retain all required info in the queued
event. We might also be able to ensure the event queue is ordered more
naturally by block number, to ensure we have reasonable readahead
while re-traversing the event queue - its not just the size of the
event queue that is the problem here.

I tried a similar approach before
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php)
but one of the problems was that, for user-defined triggers at least,
it was thought that the order of firing for the triggers needed to
match the row update order, which is difficult to achieve with a
bitmap.

I also worry about the runtime cost of anything too complicated. What
I was aiming for with this patch was a more modest compression that
wouldn't add too many additional cycles (or complexity) to the
existing code (I need to do more testing to confirm that that is
indeed the case).

The biggest recent complaint I found in this area was this thread:
http://archives.postgresql.org/pgsql-bugs/2010-12/msg00002.php, which
was a one billion row insert. Personally, if I ever insert that much
data I always drop and re-create all the constraints and triggers, but
it would be nice if we could achieve that sort of size on reasonably
modern hardware.

I'm also nervous of compression formats that rely too heavily on
earlier data, so any corner case bug in one of the formats will screw
the whole data structure. I'd prefer a more modular approach that is
easier to test in regression tests.

Nearly all compression formats rely on earlier data. In this case each
event only depends on the previous event, so it just needs testing
with various classes of difference between successive rows, perhaps
with some C test code that makes up CTIDs exercising the corner cases.

The cases we have problems with are the bulk cases, so we should be
specifically optimising for that. For example, if we are running a
COPY/INSERT SELECT or a seqscan DELETE then we will queue a trigger
for every tuple in a block.

Agreed. That's what I tried to target with my code too - so it
achieves its best compression with sequential data.

The best compression and flexibility in that case is to store a bitmap
since that will average out at about 1 bit per row, with variable
length bitmaps. Which is about 8 times better compression ratio than
originally suggested, without any loss of metadata.

Yeah that's probably possible in specific cases, but I'm still not
sure how to make it meet the full requirements of the after trigger
queue.

Regards,
Dean

We should also optimise the HOT UPDATE case.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#16 Simon Riggs
simon@2ndQuadrant.com
In reply to: Dean Rasheed (#15)
Re: Compressing the AFTER TRIGGER queue

On Tue, Aug 2, 2011 at 12:28 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On 1 August 2011 21:02, Simon Riggs <simon@2ndquadrant.com> wrote:

I would prefer an approach where we store the events in a buffer,
which gets added to the main event queue when it fills/block number
changes/etc. That way we can apply intelligence to the actual
compression format used, yet retain all required info in the queued
event. We might also be able to ensure the event queue is ordered more
naturally by block number, to ensure we have reasonable readahead
while re-traversing the event queue - its not just the size of the
event queue that is the problem here.

I tried a similar approach before
(http://archives.postgresql.org/pgsql-hackers/2009-10/msg00464.php)
but one of the problems was that, for user-defined triggers at least,
it was thought that the order of firing for the triggers needed to
match the row update order, which is difficult to achieve with a
bitmap.

Luckily, I think what I said 2 years ago was correct.
http://archives.postgresql.org/pgsql-hackers/2009-10/msg01528.php

We just need a way to say that trigger execution order is not
important, which we know to be true for FK checks - or any STABLE
function.

I think that label is "STABLE" so we don't even need to invent new
function labels.

The cases we have problems with are the bulk cases, so we should be
specifically optimising for that. For example, if we are running a
COPY/INSERT SELECT or a seqscan DELETE then we will queue a trigger
for every tuple in a block.

Agreed. That's what I tried to target with my code too - so it
achieves its best compression with sequential data.

Maybe, but you're still storing at least 1 byte per row, often more,
whereas 1 bit would be much better compression.

The best compression and flexibility in that case is to store a bitmap
since that will average out at about 1 bit per row, with variable
length bitmaps. Which is about 8 times better compression ratio than
originally suggested, without any loss of metadata.

Yeah that's probably possible in specific cases, but I'm still not
sure how to make it meet the full requirements of the after trigger
queue.

I think you'd better explain what use case you are trying to optimise
for. It seems unlikely that you will come up with a compression scheme
that will fit all cases.

The only cases that seem a problem to me are
* bulk RI checks
* large writes on tables using trigger based replication
maybe you have others?

If you are trying to optimise trigger based replication, I would be
inclined to look at immediate execution triggers. If we insert into a
log table then that will only take effect if the transaction commits.
That way we would just bypass the after trigger queue entirely. I see
nothing in the SQL Standard that prevents that.

Bitmaps work better than the current suggestion for optimising bulk RI
checks. If you used immediate execution triggers you could optimise
bulk RI checks against small tables even further by just storing a
local hash table of already acquired row locks, again avoiding the
need to build up a huge after trigger event queue in the first place.
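
That last idea could be sketched roughly like this (a hypothetical
illustration only, not code from any patch; all names are invented): a
small per-statement hash set of referenced CTIDs already checked, so a
duplicate RI check never gets queued at all:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LOCKSET_SIZE 1024        /* must be a power of two; toy fixed size */

typedef struct { uint32_t block; uint16_t offset; uint8_t used; } LockSlot;
typedef struct { LockSlot slots[LOCKSET_SIZE]; } LockSet;

/*
 * Open-addressing test-and-set: returns 1 if the CTID was already
 * present (check already done, skip queueing), 0 if newly inserted.
 * A real version would need to grow, or spill when the table fills.
 */
static int
lockset_test_and_set(LockSet *s, uint32_t block, uint16_t offset)
{
    uint32_t h = (block * 2654435761u ^ offset) & (LOCKSET_SIZE - 1);

    while (s->slots[h].used)
    {
        if (s->slots[h].block == block && s->slots[h].offset == offset)
            return 1;            /* seen before: no new event needed */
        h = (h + 1) & (LOCKSET_SIZE - 1);   /* linear probe */
    }
    s->slots[h].used = 1;
    s->slots[h].block = block;
    s->slots[h].offset = offset;
    return 0;
}
```

For a bulk load referencing a small parent table this would collapse
millions of queued events down to one check per distinct referenced
row.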

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#17 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#16)
Re: Compressing the AFTER TRIGGER queue

On Aug 2, 2011, at 7:09 AM, Simon Riggs wrote:

The best compression and flexibility in that case is to store a bitmap
since that will average out at about 1 bit per row, with variable
length bitmaps. Which is about 8 times better compression ratio than
originally suggested, without any loss of metadata.

Yeah that's probably possible in specific cases, but I'm still not
sure how to make it meet the full requirements of the after trigger
queue.

I think you'd better explain what use case you are trying to optimise
for. It seems unlikely that you will come up with a compression scheme
that will fit all cases.

The only cases that seem a problem to me are
* bulk RI checks
* large writes on tables using trigger based replication
maybe you have others?

Not sure how much this relates to this discussion, but I have often wished we had AFTER FOR EACH STATEMENT triggers that provided OLD and NEW recordsets you could make use of. Sometimes it's very valuable to be able to look at *all* the rows that changed in a transaction in one shot.
--
Jim C. Nasby, Database Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net

#18 Robert Haas
robertmhaas@gmail.com
In reply to: Jim Nasby (#17)
Re: Compressing the AFTER TRIGGER queue

On Wed, Aug 3, 2011 at 6:05 PM, Jim Nasby <jim@nasby.net> wrote:

Not sure how much this relates to this discussion, but I have often wished we had AFTER FOR EACH STATEMENT triggers that provided OLD and NEW recordsets you could make use of. Sometimes it's very valuably to be able to look at *all* the rows that changed in a transaction in one shot.

Yeah, that would be awesome. I think some of our competitors provide
exactly that feature...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jim Nasby (#17)
Re: Compressing the AFTER TRIGGER queue

Excerpts from Jim Nasby's message of Wed Aug 03 18:05:21 -0400 2011:

Not sure how much this relates to this discussion, but I have often wished we had AFTER FOR EACH STATEMENT triggers that provided OLD and NEW recordsets you could make use of. Sometimes it's very valuably to be able to look at *all* the rows that changed in a transaction in one shot.

I have a mind to implement this sometime.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#20 Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#18)
Re: Compressing the AFTER TRIGGER queue

Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Aug 3, 2011 at 6:05 PM, Jim Nasby <jim@nasby.net> wrote:

Not sure how much this relates to this discussion, but I have
often wished we had AFTER FOR EACH STATEMENT triggers that
provided OLD and NEW recordsets you could make use of. Sometimes
it's very valuably to be able to look at *all* the rows that
changed in a transaction in one shot.

Yeah, that would be awesome. I think some of our competitors
provide exactly that feature...

If I remember correctly, MS SQL Server and Sybase ASE provide
INSERTED and DELETED relations in triggers instead of NEW and OLD
records. In a FOR EACH ROW trigger the relation contains only one
row.

This is related to the thread on BEFORE triggers, in that these
products require that you UPDATE the row in the base table to modify
it (normally by joining to the INSERTED relation), making the latest
values available to other trigger code, and providing a clear
distinction between the values coming in to the trigger and the
latest values in the database.

-Kevin

#21 Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Kevin Grittner (#20)