Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Started by Artus de benqueover 8 years ago13 messages
#1Artus de benque
artusdebenque@gmail.com

Summary:
*The trigger suppress_redundant_updates_trigger does not suppress updates
of strings larger than 16000 bit (or 2000 octet) with equal values*

Platform information

- PostgreSQL version: 9.6.3, server 9.5.6
- C library version: glibc 2.19 on docker container running
PostgreSQL server (glibc 2.25 on hosting linux)
- uname -a => Linux adb 4.11.3-1-ARCH #1 SMP PREEMPT Sun May 28
10:40:17 CEST 2017 x86_64 GNU/Linux

Although the behavior was observed on various systems.

Full steps to reproduce executed
$ psql -h localhost -U postgres
psql (9.6.3, server 9.5.6)
Type "help" for help.
postgres=# CREATE DATABASE test_db;
CREATE DATABASE
postgres=# CREATE TABLE test_table (id int, field text);
CREATE TABLE
postgres=# INSERT INTO test_table VALUES (1, 'hi');
INSERT 0 1
postgres=# UPDATE test_table SET field = 'hi' WHERE id = 1;
UPDATE 1
postgres=# CREATE TRIGGER suppress_redundant_updates BEFORE UPDATE ON
test_table FOR EACH ROW EXECUTE PROCEDURE
suppress_redundant_updates_trigger();
CREATE TRIGGER
postgres=# UPDATE test_table SET field = 'hi' WHERE id = 1;
UPDATE 0
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1 *<--- BUG: expected 0, as we ran the same update twice*

// other tests:
test_db=# UPDATE test_table SET field = rpad('', 2000, 'a') WHERE id = 1;
UPDATE 1
test_db=# UPDATE test_table SET field = rpad('', 2000, 'a') WHERE id = 1;
UPDATE 0 *<--- exactly 2000 octets, so OK*

test_db=# UPDATE test_table SET field = rpad('', 1000, 'ó') || 'a' WHERE id
= 1;
UPDATE 1
test_db=# UPDATE test_table SET field = rpad('', 1000, 'ó') || 'a' WHERE id
= 1;
UPDATE 1 *<--- BUG: because 'ó' is 2 octet long*

#2Dilip Kumar
dilipbalaut@gmail.com
In reply to: Artus de benque (#1)
Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

On Mon, Jun 19, 2017 at 5:20 PM, Artus de benque
<artusdebenque@gmail.com> wrote:

postgres=# UPDATE test_table SET field = 'hi' WHERE id = 1;
UPDATE 0
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1 <--- BUG: expected 0, as we ran the same update twice

Seems like in "suppress_redundant_updates_trigger" we are comparing
toasted tuple with the new tuple and that is the cause of the bug.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dilip Kumar (#2)
Re: [BUGS] Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Dilip Kumar <dilipbalaut@gmail.com> writes:

On Mon, Jun 19, 2017 at 5:20 PM, Artus de benque
<artusdebenque@gmail.com> wrote:

postgres=# UPDATE test_table SET field = 'hi' WHERE id = 1;
UPDATE 0
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1
test_db=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;
UPDATE 1 <--- BUG: expected 0, as we ran the same update twice

Seems like in "suppress_redundant_updates_trigger" we are comparing
toasted tuple with the new tuple and that is the cause of the bug.

I don't think it's a bug, I think it's an intentional design tradeoff.
To suppress an update in this case, the trigger would have to grovel
through the individual fields and detoast them before comparing.
That would add a lot of cycles, and only seldom add successes.

Possibly we should adjust the documentation so that it doesn't imply
that this trigger guarantees to suppress every no-op update.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#3)
Re: [HACKERS] Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

On Mon, Jun 19, 2017 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Seems like in "suppress_redundant_updates_trigger" we are comparing
toasted tuple with the new tuple and that is the cause of the bug.

I don't think it's a bug, I think it's an intentional design tradeoff.
To suppress an update in this case, the trigger would have to grovel
through the individual fields and detoast them before comparing.
That would add a lot of cycles, and only seldom add successes.

Possibly we should adjust the documentation so that it doesn't imply
that this trigger guarantees to suppress every no-op update.

That doesn't sound like a very plausible argument to me. I don't
think that a proposal to add a function named
sometimes_suppress_redundant_updates_trigger() would've attracted many
votes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#4)
Re: [HACKERS] Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 19, 2017 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think it's a bug, I think it's an intentional design tradeoff.
To suppress an update in this case, the trigger would have to grovel
through the individual fields and detoast them before comparing.
That would add a lot of cycles, and only seldom add successes.

Possibly we should adjust the documentation so that it doesn't imply
that this trigger guarantees to suppress every no-op update.

That doesn't sound like a very plausible argument to me. I don't
think that a proposal to add a function named
sometimes_suppress_redundant_updates_trigger() would've attracted many
votes.

You'd be wrong. The entire point of this trigger is to save cycles,
so having it eat a lot of cycles only to fail is not an improvement.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#6Artus de benque
artusdebenque@gmail.com
In reply to: Tom Lane (#5)
Re: [HACKERS] Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Hi,

It looks like you know what is happening, but I found that I have made an
error in my original assumption: (while the steps to reproduce are still
valid)

The size of the string at which the trigger does not work as expected
varies, depending on the size of the other fields in the row.

The 'limit size' is lower if I set bigger values in another text field in
the same row (and it seems that it is reached when going above 2000 octet
for the texts cells added up).

Sorry if this is noise, and thank you for looking into the bug (or
documentation error).

Regards,

Artus de Benque

Le lun. 19 juin 2017 à 18:20, Tom Lane <tgl@sss.pgh.pa.us> a écrit :

Show quoted text

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 19, 2017 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think it's a bug, I think it's an intentional design tradeoff.
To suppress an update in this case, the trigger would have to grovel
through the individual fields and detoast them before comparing.
That would add a lot of cycles, and only seldom add successes.

Possibly we should adjust the documentation so that it doesn't imply
that this trigger guarantees to suppress every no-op update.

That doesn't sound like a very plausible argument to me. I don't
think that a proposal to add a function named
sometimes_suppress_redundant_updates_trigger() would've attracted many
votes.

You'd be wrong. The entire point of this trigger is to save cycles,
so having it eat a lot of cycles only to fail is not an improvement.

regards, tom lane

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#5)
Re: [HACKERS] Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 19, 2017 at 11:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I don't think it's a bug, I think it's an intentional design tradeoff.
To suppress an update in this case, the trigger would have to grovel
through the individual fields and detoast them before comparing.
That would add a lot of cycles, and only seldom add successes.

Possibly we should adjust the documentation so that it doesn't imply
that this trigger guarantees to suppress every no-op update.

That doesn't sound like a very plausible argument to me. I don't
think that a proposal to add a function named
sometimes_suppress_redundant_updates_trigger() would've attracted many
votes.

You'd be wrong. The entire point of this trigger is to save cycles,
so having it eat a lot of cycles only to fail is not an improvement.

I suppose that either behavior may be desirable depending on
circumstances. Maybe it is possible to have each installed trigger be
configurable so that it can select either behavior. (Maybe use the
trigger argument as a column list, and for each column in the list, do a
full detoast and compare instead of relying on toast pointer equality).

The current behavior seems more convenient in more cases, and so should
remain the default.

But this sounds like an additional feature, not a bug.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#8J Chapman Flack
jflack@math.purdue.edu
In reply to: Dilip Kumar (#2)
Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

On 06/19/2017 11:40 AM, Dilip Kumar wrote:

... Artus de benque ... wrote:

...=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;

Seems like in "suppress_redundant_updates_trigger" we are comparing
toasted tuple with the new tuple and that is the cause of the bug.

Something still puzzles me about this, though, maybe only because
I don't know enough about TOAST.

The size of 'field' ends up 2001, or just over the threshold where
TOASTing will be attempted at all. The report doesn't mention changing
the strategy from the default EXTENDED, so won't the first thing
attempted be compression? Won't that succeed spectacularly, since the
test string is a single character 2001 times, probably producing
a compressed string a handful of bytes long, well under the threshold,
obviating any need to go further with TOAST pointers?

Is the compression algorithm nondeterministic? Is there some way
that compressing the same 2001*'a' on two occasions would produce
compressed strings that don't match?

What exactly is s_r_u_t() comparing, in the case where the TOASTed
value has been compressed, but not out-of-lined?

-Chap

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: J Chapman Flack (#8)
Re: [HACKERS] Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

J Chapman Flack <jflack@math.purdue.edu> writes:

On 06/19/2017 11:40 AM, Dilip Kumar wrote:

... Artus de benque ... wrote:

...=# UPDATE test_table SET field = rpad('', 2001, 'a') WHERE id = 1;

Seems like in "suppress_redundant_updates_trigger" we are comparing
toasted tuple with the new tuple and that is the cause of the bug.

Something still puzzles me about this, though, maybe only because
I don't know enough about TOAST.

The size of 'field' ends up 2001, or just over the threshold where
TOASTing will be attempted at all. The report doesn't mention changing
the strategy from the default EXTENDED, so won't the first thing
attempted be compression? Won't that succeed spectacularly, since the
test string is a single character 2001 times, probably producing
a compressed string a handful of bytes long, well under the threshold,
obviating any need to go further with TOAST pointers?

Right, given the facts at hand, the stored old tuple has probably
got a compressed-in-line version of "field". However, the *new*
tuple is a transient tuple containing the uncompressed result of
rpad(). We don't bother to try to compress fields or shove them
out-of-line until after all the BEFORE ROW triggers are done ---
if we did, the effort might just be wasted, if the triggers change
those fields or cancel the update altogether. So the trigger is
seeing a compressed vs. an uncompressed version of the field value,
and since it's just doing a dumb bitwise compare, they don't look
equal.

As I mentioned upthread, it'd certainly be possible for the trigger
to iterate through the fields in a datatype-aware fashion and undo
compression or out-of-lineing before the comparison. But that would
eat a lot more cycles than the current implementation, and it seems
dubious that it's worth it. If the trigger is succeeding (ie,
detecting a no-op update) often enough that it would be worth that,
you've really got an application-stupidity problem to fix.

Is the compression algorithm nondeterministic?

I don't think so.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: [HACKERS] Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Tom Lane wrote:

As I mentioned upthread, it'd certainly be possible for the trigger
to iterate through the fields in a datatype-aware fashion and undo
compression or out-of-lineing before the comparison. But that would
eat a lot more cycles than the current implementation, and it seems
dubious that it's worth it. If the trigger is succeeding (ie,
detecting a no-op update) often enough that it would be worth that,
you've really got an application-stupidity problem to fix.

ISTM the whole point of suppress_redundant_updates_trigger is to cope
with application stupidity.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#10)
Re: [HACKERS] Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

... If the trigger is succeeding (ie,
detecting a no-op update) often enough that it would be worth that,
you've really got an application-stupidity problem to fix.

ISTM the whole point of suppress_redundant_updates_trigger is to cope
with application stupidity.

I think it's a suitable band-aid for limited amounts of stupidity.
But it eliminates only a small fraction of the total overhead involved
in a useless update command. So I remain of the opinion that if that's
happening a lot, you're better off fixing the problem somewhere upstream.

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#12David G. Johnston
david.g.johnston@gmail.com
In reply to: Tom Lane (#11)
Re: [HACKERS] Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

On Mon, Jun 19, 2017 at 2:10 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Tom Lane wrote:

... If the trigger is succeeding (ie,
detecting a no-op update) often enough that it would be worth that,
you've really got an application-stupidity problem to fix.

ISTM the whole point of suppress_redundant_updates_trigger is to cope
with application stupidity.

I think it's a suitable band-aid for limited amounts of stupidity.
But it eliminates only a small fraction of the total overhead involved
in a useless update command. So I remain of the opinion that if that's
happening a lot, you're better off fixing the problem somewhere upstream.

At first glance I think I'd rather have it do the correct thing all of
the time, even if it takes longer, so that my only trade-off decision
is whether to improve performance by fixing the application.

Ideally if the input tuple wouldn't require compression we wouldn't
bother to decompress the stored tuple.

David J.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#13Chapman Flack
chap@anastigmatix.net
In reply to: David G. Johnston (#12)
Re: [BUGS] Re: Postgresql bug report - unexpected behavior of suppress_redundant_updates_trigger

On 06/19/2017 05:19 PM, David G. Johnston wrote:

At first glance I think I'd rather have it do the correct thing all of
the time, even if it takes longer, so that my only trade-off decision
is whether to improve performance by fixing the application.

Ideally if the input tuple wouldn't require compression we wouldn't
bother to decompress the stored tuple.

That looks like one reasonable elimination check.

I wonder how much closer it could get with some changes that wouldn't
necessarily use many more cycles.

One might be a less_easy queue; marching through the tuple
comparing fields, if one is found to be TOASTed, throw it
on the queue and march on. Only if all the easy ones matched
is there any point in looking at the queue.

At that point, there could be a tunable for how much effort
to expend. Perhaps I'm willing to decompress an inline value,
but not retrieve an out-of-line one? For the TOAST compression
algorithm I'm not sure of the balance between compression
and decompression effort; I know gzip decompression is pretty cheap.

-Chap

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers