Refactoring speculative insertion with unique indexes a little

Started by Peter Geogheganalmost 11 years ago49 messageshackers

pg@bowt.ie

almost 11 years ago

Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE
executor/storage infrastructure) uses checkUnique ==
UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant
originally only used by deferred unique constraints. It occurred to me
that this has a number of disadvantages:

* It confuses separation of concerns. Pushing down this information to
the nbtree AM makes it clear why it's slightly special from a
speculative insertion point of view. For example, the nbtree AM does
not actually require "livelock insurance" (for unique indexes,
although in principle not for nbtree-based exclusion constraints,
which are possible).

* UNIQUE_CHECK_PARTIAL is not only not the same thing as
UNIQUE_CHECK_SPECULATIVE (a new constant for the enum). It's also
naturally mutually exclusive with it (since we do not and cannot
support deferred unique constraints as arbiters). Let's represent this
directly.

* It makes a conflict not detected by the pre-check always insert an
index tuple, even though that occurs after a point where it's already
been established that the pointed-to TID is doomed -- it must go on to
be super deleted. Why bother bloating the index?

I'm actually not really motivated by wanting to reduce bloat here
(that was always something that I thought was a non-issue with *any*
implemented speculative insertion prototype [1]https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29 -- Peter Geoghegan). Rather, by actually
physically inserting an index tuple unnecessarily, the implication is
that it makes sense to do so (perhaps for roughly the same reason it
makes sense with deferred unique constraints, or some other
complicated and subtle reason.). AFAICT that implication is incorrect,
though; I see no reason why inserting that index tuple serves any
purpose, and it clearly *can* be avoided with little effort.

Attached patch updates speculative insertion along these lines.

In passing, I have make ExecInsertIndexTuples() give up when a
speculative insertion conflict is detected. Again, this is not about
bloat prevention; it's about making the code easier to understand by
not doing something that is unnecessary, and in avoiding that also
avoiding the implication that it is necessary. There are already
enough complicated interactions that *are* necessary (e.g. "livelock
avoidance" for exclusion constraints). Let us make our intent clearer.

The patch also updates the AM interface documentation (the part
covering unique indexes). It seems quite natural to me to document the
theory of operation for speculative insertion there.

Thoughts?

[1]: https://wiki.postgresql.org/wiki/Value_locking#.232._.22Promise.22_heap_tuples_.28Heikki_Linnakangas.29 -- Peter Geoghegan
--
Peter Geoghegan

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

almost 11 years ago

In reply to: Peter Geoghegan (#1)

Re: Refactoring speculative insertion with unique indexes a little

On 06/11/2015 02:19 AM, Peter Geoghegan wrote:

Currently, speculative insertion (the INSERT ... ON CONFLICT DO UPDATE
executor/storage infrastructure) uses checkUnique ==
UNIQUE_CHECK_PARTIAL for unique indexes, which is a constant
originally only used by deferred unique constraints. It occurred to me
that this has a number of disadvantages:

* It confuses separation of concerns. Pushing down this information to
the nbtree AM makes it clear why it's slightly special from a
speculative insertion point of view. For example, the nbtree AM does
not actually require "livelock insurance" (for unique indexes,
although in principle not for nbtree-based exclusion constraints,
which are possible).

* UNIQUE_CHECK_PARTIAL is not only not the same thing as
UNIQUE_CHECK_SPECULATIVE (a new constant for the enum). It's also
naturally mutually exclusive with it (since we do not and cannot
support deferred unique constraints as arbiters). Let's represent this
directly.

* It makes a conflict not detected by the pre-check always insert an
index tuple, even though that occurs after a point where it's already
been established that the pointed-to TID is doomed -- it must go on to
be super deleted. Why bother bloating the index?

I'm actually not really motivated by wanting to reduce bloat here
(that was always something that I thought was a non-issue with *any*
implemented speculative insertion prototype [1]). Rather, by actually
physically inserting an index tuple unnecessarily, the implication is
that it makes sense to do so (perhaps for roughly the same reason it
makes sense with deferred unique constraints, or some other
complicated and subtle reason.). AFAICT that implication is incorrect,
though; I see no reason why inserting that index tuple serves any
purpose, and it clearly *can* be avoided with little effort.

I agree it would be cleaner to have a separate CHECK_UNIQUE_XXX code for
speculative insertions. You've defined CHECK_UNIQUE_SPECULATIVE as "like
CHECK_UNIQUE_PARTIAL, but you don't have to insert the index tuple if
there's a conflict". I think it'd be better to define it as "like
CHECK_UNIQUE_YES, but return FALSE instead of throwing an error on
conflict". The difference is that the aminsert would not be allowed to
return FALSE when there is no conflict.

Actually, even without this patch, a dummy implementation of aminsert
that always returns FALSE in CHECK_UNIQUE_PARTIAL mode would be legal
per the docs, but would loop forever. So that would be nice to fix or
document away, even though it's not a problem for B-tree currently.

Attached patch updates speculative insertion along these lines.

In passing, I have make ExecInsertIndexTuples() give up when a
speculative insertion conflict is detected. Again, this is not about
bloat prevention; it's about making the code easier to understand by
not doing something that is unnecessary, and in avoiding that also
avoiding the implication that it is necessary. There are already
enough complicated interactions that *are* necessary (e.g. "livelock
avoidance" for exclusion constraints). Let us make our intent clearer.

Hmm, not sure I like those changes. ExecInsertIndexTuples's behaviour
now depends on whether specConflict is passed, but that seems weird as
specConflict is an output parameter.

The patch also updates the AM interface documentation (the part
covering unique indexes). It seems quite natural to me to document the
theory of operation for speculative insertion there.

Yeah, although I find the explanation pretty long-winded and difficult
to understand ;-).

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Refactoring speculative insertion with unique indexes a little

Attachments: