fast defaults in heap_getattr vs heap_deform_tuple

Started by Andres Freundabout 7 years ago14 messageshackers
Jump to latest
#1Andres Freund
andres@anarazel.de

Hi,

While working on the patch to slotify trigger.c I got somewhat confused
by the need to expand tuples in trigger.c:

static HeapTuple
GetTupleForTrigger(EState *estate,
EPQState *epqstate,
ResultRelInfo *relinfo,
ItemPointer tid,
LockTupleMode lockmode,
TupleTableSlot **newSlot)
{
...
if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
result = heap_expand_tuple(&tuple, relation->rd_att);
else
result = heap_copytuple(&tuple);
ReleaseBuffer(buffer);

There's no explanation why GetTupleForTrigger() needs expanding tuples,
but most other parts of postgres don't. Normally deforming ought to take
care of expanding missing attrs.

As far as I can tell, the reason that it's needed is that heap_gettar()
wasn't updated along the rest of the functions. heap_deform_tuple(),
heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.

That, to me, makes no sense. The reason that we see a difference in
regression test output is that composite_to_json() uses heap_getattr(),
if it used heap_deform_tuple (which'd be considerably faster), we'd get
the default value.

Am I missing something?

Greetings,

Andres Freund

#2Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:

While working on the patch to slotify trigger.c I got somewhat confused
by the need to expand tuples in trigger.c:

static HeapTuple
GetTupleForTrigger(EState *estate,
EPQState *epqstate,
ResultRelInfo *relinfo,
ItemPointer tid,
LockTupleMode lockmode,
TupleTableSlot **newSlot)
{
...
if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
result = heap_expand_tuple(&tuple, relation->rd_att);
else
result = heap_copytuple(&tuple);
ReleaseBuffer(buffer);

There's no explanation why GetTupleForTrigger() needs expanding tuples,
but most other parts of postgres don't. Normally deforming ought to take
care of expanding missing attrs.

As far as I can tell, the reason that it's needed is that heap_gettar()
wasn't updated along the rest of the functions. heap_deform_tuple(),
heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.

That, to me, makes no sense. The reason that we see a difference in
regression test output is that composite_to_json() uses heap_getattr(),
if it used heap_deform_tuple (which'd be considerably faster), we'd get
the default value.

Am I missing something?

Indeed, a hacky patch fixing heap_getattr makes the heap_expand_tuple()
in trigger.c unnecessary. Attached. I think we need to do something
along those lines.

Andrew, I think it'd be good to do a ground up review of the fast
defaults patch. There's been a fair number of issues in it, and this is
a another pretty fundamental issue.

Greetings,

Andres Freund

Attachments:

poc-fixme-heap_getattr.difftext/x-diff; charset=us-asciiDownload+11-5
#3Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#2)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Him

On 2019-02-01 14:49:05 -0800, Andres Freund wrote:

+#ifdef IAM_THE_WRONG_FIX
if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
result = heap_expand_tuple(&tuple, relation->rd_att);
else
result = heap_copytuple(&tuple);
+#else
+	result = heap_copytuple(&tuple);
+#endif

This was added in

commit 7636e5c60fea83a9f3cd2ad278c0819b98941c74
Author: Andrew Dunstan <andrew@dunslane.net>
Date: 2018-09-24 16:11:24 -0400

Fast default trigger and expand_tuple fixes

Ensure that triggers get properly filled in tuples for the OLD value.
Also fix the logic of detecting missing null values. The previous logic
failed to detect a missing null column before the first missing column
with a default. Fixing this has simplified the logic a bit.

Regression tests are added to test changes. This should ensure better
coverage of expand_tuple().

Original bug reports, and some code and test scripts from Tomas Vondra

Backpatch to release 11.

Unfortunately the commit doesn't reference the discussion. That appears
to have been at
/messages/by-id/224e4807-395d-9fc5-2934-d5f85130f1f0@2ndQuadrant.com

Greetings,

Andres Freund

#4Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#1)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:

While working on the patch to slotify trigger.c I got somewhat confused
by the need to expand tuples in trigger.c:

static HeapTuple
GetTupleForTrigger(EState *estate,
EPQState *epqstate,
ResultRelInfo *relinfo,
ItemPointer tid,
LockTupleMode lockmode,
TupleTableSlot **newSlot)
{
...
if (HeapTupleHeaderGetNatts(tuple.t_data) < relation->rd_att->natts)
result = heap_expand_tuple(&tuple, relation->rd_att);
else
result = heap_copytuple(&tuple);
ReleaseBuffer(buffer);

There's no explanation why GetTupleForTrigger() needs expanding tuples,
but most other parts of postgres don't. Normally deforming ought to take
care of expanding missing attrs.

As far as I can tell, the reason that it's needed is that heap_gettar()
wasn't updated along the rest of the functions. heap_deform_tuple(),
heap_attisnull() etc look for missing attrs, but heap_getattr() doesn't.

That, to me, makes no sense. The reason that we see a difference in
regression test output is that composite_to_json() uses heap_getattr(),
if it used heap_deform_tuple (which'd be considerably faster), we'd get
the default value.

Am I missing something?

This breaks HOT (and probably also foreign keys), when fast default
columns are set to NULL, because HeapDetermineModifiedColumns() gets the
values with heap_getattr(), which returns a spurious NULL for the old
value (instead of the fast default value). That then would compare equal
to the new column value set to NULL.

Greetings,

Andres Freund

#5Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#4)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-02 05:35:21 -0800, Andres Freund wrote:

This breaks HOT (and probably also foreign keys), when fast default
columns are set to NULL, because HeapDetermineModifiedColumns() gets the
values with heap_getattr(), which returns a spurious NULL for the old
value (instead of the fast default value). That then would compare equal
to the new column value set to NULL.

Repro:

BEGIN;
CREATE TABLE t();
INSERT INTO t DEFAULT VALUES;
ALTER TABLE t ADD COLUMN a int default 1;
CREATE INDEX ON t(a);
UPDATE t SET a = NULL;

SET LOCAL enable_seqscan = true;
SELECT * FROM t WHERE a IS NULL;
SET LOCAL enable_seqscan = false;
SELECT * FROM t WHERE a IS NULL;
ROLLBACK;

output:
...
UPDATE 1
SET
┌────────┐
│ a │
├────────┤
│ (null) │
└────────┘
(1 row)

SET
┌───┐
│ a │
├───┤
└───┘
(0 rows)

Greetings,

Andres Freund

#6Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#2)
Re: fast defaults in heap_getattr vs heap_deform_tuple

On 2/1/19 5:49 PM, Andres Freund wrote:

Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:

Andrew, I think it'd be good to do a ground up review of the fast
defaults patch. There's been a fair number of issues in it, and this is
a another pretty fundamental issue.

OK. Will try to organise it.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#6)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote:

On 2/1/19 5:49 PM, Andres Freund wrote:

Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:

Andrew, I think it'd be good to do a ground up review of the fast
defaults patch. There's been a fair number of issues in it, and this is
a another pretty fundamental issue.

OK. Will try to organise it.

Cool. Here's the patch that I, after some commit message polishing, plan
to commit to 11 and master to fix the issue at hand.

Greetings,

Andres Freund

Attachments:

0001-Fix-heap_getattr-handling-of-fast-defaults.patchtext/x-diff; charset=us-asciiDownload+52-12
#8Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#7)
Re: fast defaults in heap_getattr vs heap_deform_tuple

On 2019-Feb-05, Andres Freund wrote:

@@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull);
/*
* Return the missing value of an attribute, or NULL if there isn't one.
*/
-static Datum
+Datum
getmissingattr(TupleDesc tupleDesc,
int attnum, bool *isnull)

This is a terrible name for an exported function -- let's change it
before it gets exported. Heck, even heap_getmissingattr() would be
better.

I notice that with this patch, heap_getattr() obtains a new Assert()
that the attr being fetched is no further than tupledesc->natts.
It previously just returned null for that case. Maybe we should change
it so that it returns null if an attr beyond end-of-array is fetched?
(I think in non-assert builds, it would dereference past the AttrMissing
array.)

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#9Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#7)
Re: fast defaults in heap_getattr vs heap_deform_tuple

On 2019-02-05 08:57:05 -0800, Andres Freund wrote:

Hi,

On 2019-02-05 10:14:48 -0500, Andrew Dunstan wrote:

On 2/1/19 5:49 PM, Andres Freund wrote:

Hi,

On 2019-02-01 08:24:04 -0800, Andres Freund wrote:

Andrew, I think it'd be good to do a ground up review of the fast
defaults patch. There's been a fair number of issues in it, and this is
a another pretty fundamental issue.

OK. Will try to organise it.

Cool. Here's the patch that I, after some commit message polishing, plan
to commit to 11 and master to fix the issue at hand.

And pushed.

- Andres

#10Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andres Freund (#9)
Re: fast defaults in heap_getattr vs heap_deform_tuple

"Andres" == Andres Freund <andres@anarazel.de> writes:

Cool. Here's the patch that I, after some commit message polishing,
plan to commit to 11 and master to fix the issue at hand.

Andres> And pushed.

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?

--
Andrew (irc:RhodiumToad)

#11Andres Freund
andres@anarazel.de
In reply to: Andrew Gierth (#10)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote:

"Andres" == Andres Freund <andres@anarazel.de> writes:

Cool. Here's the patch that I, after some commit message polishing,
plan to commit to 11 and master to fix the issue at hand.

Andres> And pushed.

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?

There was some IRC conversation about this:

RhodiumToad | andres: ping?
RhodiumToad | n/m, sent mail
andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go
| through GetTupleForTrigger(). So extensions need to be recompiled either way.
andres | (I'll write that in a bit on list, in meeting now)
RhodiumToad | right, but those other calls were already broken.
RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix
Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
| pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
RhodiumToad | it's not an encouraged interface

I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
the only case, and the extensions named above don't seem like they'd
specifically affected by the trigger path, but I'm not sure.

Greetings,

Andres Freund

#12Christoph Berg
myon@debian.org
In reply to: Andres Freund (#11)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?

Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
| pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum

I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
the only case, and the extensions named above don't seem like they'd
specifically affected by the trigger path, but I'm not sure.

I'm not following closely enough to say anything about which fix is
the best, but if this isn't a "recompile the world" case, we can get
the above list of packages rebuilt. It would be rather unpleasant to
have to schedule that for 50 packages, though.

Are rebuilt extension binaries compatible with older servers that do
not have the fix yet?

Christoph

#13Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#8)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-05 22:44:38 -0300, Alvaro Herrera wrote:

On 2019-Feb-05, Andres Freund wrote:

@@ -82,7 +80,7 @@ static Datum getmissingattr(TupleDesc tupleDesc, int attnum, bool *isnull);
/*
* Return the missing value of an attribute, or NULL if there isn't one.
*/
-static Datum
+Datum
getmissingattr(TupleDesc tupleDesc,
int attnum, bool *isnull)

This is a terrible name for an exported function -- let's change it
before it gets exported. Heck, even heap_getmissingattr() would be
better.

I don't really aggree. Note that the relevant datastructure is named
AttrMissing, and that the function isn't relevant for heap tuple. So I
don't really see what we'd otherwise name it? I think if we wanted to
improve this we should start with AttrMissing and not this function.

I notice that with this patch, heap_getattr() obtains a new Assert()
that the attr being fetched is no further than tupledesc->natts.
It previously just returned null for that case. Maybe we should change
it so that it returns null if an attr beyond end-of-array is fetched?
(I think in non-assert builds, it would dereference past the AttrMissing
array.)

Hm, it seems like there's plenty issues with accessing datums after the
end of the desc before this change, as well as after this change. Note
that slot_getattr() (in 11, it looks a bit different in master) already
calls getmissingattr(). And that's much more commonly used. Therefore
I'm disinclined to see this as a problem?

Greetings,

Andres Freund

#14Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#11)
Re: fast defaults in heap_getattr vs heap_deform_tuple

Hi,

On 2019-02-06 03:39:19 -0800, Andres Freund wrote:

On 2019-02-06 10:25:56 +0000, Andrew Gierth wrote:

"Andres" == Andres Freund <andres@anarazel.de> writes:

Cool. Here's the patch that I, after some commit message polishing,
plan to commit to 11 and master to fix the issue at hand.

Andres> And pushed.

Sorry I didn't spot this earlier, but... in the backpatch to v11, is the
expansion of the trigger tuple really unnecessary now? Since
heap_getattr is a macro, C-language triggers that aren't recompiled
won't see the default? So perhaps the expansion should be left in?

There was some IRC conversation about this:

RhodiumToad | andres: ping?
RhodiumToad | n/m, sent mail
andres | RhodiumToad: IDK, there's a lot of other heap_getattr calls, and most of them don't go
| through GetTupleForTrigger(). So extensions need to be recompiled either way.
andres | (I'll write that in a bit on list, in meeting now)
RhodiumToad | right, but those other calls were already broken.
RhodiumToad | whereas in a trigger, they'd have worked previously, but then break with your fix
Myon | the list of packages in Debian where heap_getattr appears is postgresql-11, pglogical,
| pgextwlist, citus, postgresql-plsh, wal2json, pg-cron, postgresql-rum
RhodiumToad | it's not an encouraged interface

I'm mildly disinclined to re-add the heap_expand_tuple, because it's not
the only case, and the extensions named above don't seem like they'd
specifically affected by the trigger path, but I'm not sure.

I put the expansion back for 11. Seems harmless enough. I'm planning to
send a reply to the minor release info to -packagers informing people
that extensions compiled after the minor release might not be compatible
with older servers (because they depend on getmissingattr() which
previously wasn't an exposed function)

Greetings,

Andres Freund