INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0
Attached is a cumulative patch set - version 2.0 of INSERT ... ON
CONFLICT {UPDATE | IGNORE}.
This revision does not offer a variant implementing approach #1 to
value locking [1] (only approach #2), since maintaining both
approaches in parallel has about outlived its usefulness.
I'm calling this version 2.0 because it has RLS support. This is
significant because AFAICT it's the last feature that needs to have
interactions with UPSERT considered. I've worked through a rather long
list of existing interrelated features, implementing support in each
case. I've had feedback from others on what behavior is appropriate
when that wasn't obvious, and have made sure those areas had
appropriate support. This now includes RLS, but past revisions added
support for inheritance, updatable views, statement-level triggers,
postgres_fdw, column-level privileges, partial indexes, exclusion
constraints, and more. Basically, I think we're done with discussing
those aspects, and the semantics/syntax in general, or are pretty
close to done. Certainly, support for these other interrelated
features is quite comprehensive at this point. Now the core mechanism
of the patch should be discussed in detail. The general structure and
design is also interesting. After months and months of discussion, it
now seems very likely that the semantics offered are the right ones.
Since even before V1.0 was posted back in August, that's all that
we've discussed, really (apart from the recent back and forth with
Heikki on value locking bugs, of course).
I've approached RLS along the lines Stephen seemed to think would work
best following extensive discussion [2], or at least I believe that
I've produced RLS support that is what we informally agreed on. All
security barrier quals are treated as WITH CHECK OPTIONs in the
context of ON CONFLICT UPDATE. INSERTs don't have to deal with
UPDATE-related policies/WITH CHECK OPTIONs, but when the update path
is taken, the INSERT and UPDATE related policies must both pass.
They must pass for the tuple that necessitated taking the UPDATE path
(the locked tuple to be updated), and also for the finished tuple added
back to the relation by ExecUpdate(). There are 3 possible calls to
ExecWithCheckOptions() in the context of INSERT ... ON CONFLICT
UPDATE: the 2 that I just mentioned, which involve UPDATE *and*
INSERT WITH CHECK options, and also the ExecInsert()
ExecWithCheckOptions() call.
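To make the three checks concrete, here is a rough Python model of the
rule described above. It is only a sketch under stated assumptions, not
the executor code: the names upsert(), check(), assign and
PolicyViolation are made up for illustration, policies of one command
type are OR-combined, and the insert-path check is assumed to run only
when no conflicting tuple is found (the real ordering inside
ExecInsert() may differ).

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Policy = Callable[[Row], bool]


class PolicyViolation(Exception):
    """Hypothetical error: a tuple failed a WITH CHECK style test."""


def check(policies: List[Policy], row: Row, label: str) -> None:
    # Policies for one command type are combined with OR.
    if not any(policy(row) for policy in policies):
        raise PolicyViolation(label)


def upsert(table: Dict[object, Row], proposed: Row,
           assign: Callable[[Row, Row], Row],
           insert_policies: List[Policy],
           update_policies: List[Policy]) -> str:
    existing = table.get(proposed["key"])
    if existing is None:
        # Insert path: only INSERT policies apply (assumed ordering).
        check(insert_policies, proposed, "INSERT policy, proposed tuple")
        table[proposed["key"]] = dict(proposed)
        return "insert"
    # Update path, check 1: the existing (locked) tuple must pass
    # both the INSERT and the UPDATE policy sets.
    check(insert_policies, existing, "INSERT policy, existing tuple")
    check(update_policies, existing, "UPDATE policy, existing tuple")
    # Update path, check 2: the finished tuple must pass both sets too.
    final = assign(existing, proposed)
    check(insert_policies, final, "INSERT policy, final tuple")
    check(update_policies, final, "UPDATE policy, final tuple")
    table[proposed["key"]] = final
    return "update"


def set_val(target: Row, excluded: Row) -> Row:
    # Stands in for "UPDATE SET val = EXCLUDED.val".
    return {**target, "val": excluded["val"]}


owner_is_alice: List[Policy] = [lambda row: row["owner"] == "alice"]
val_not_negative: List[Policy] = [lambda row: row["val"] >= 0]

demo: Dict[object, Row] = {}
print(upsert(demo, {"key": 1, "owner": "alice", "val": 5},
             set_val, owner_is_alice, val_not_negative))  # insert
print(upsert(demo, {"key": 1, "owner": "alice", "val": 7},
             set_val, owner_is_alice, val_not_negative))  # update
```

In this model a failure of either policy set on either tuple aborts the
update before anything is written back, which is the "must both pass"
behavior in miniature.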
RLS support is provided in a separate cumulative commit in the hope
that this makes it easier to review by a subject matter expert.
Documentation [3] and tests covering RLS are provided, of course.
I also include various bugfixes to approach #2 to value locking (these
were all previously separately posted, but are now integrated into the
main ON CONFLICT commit). Specifically, these are fixes for the bugs
that emerged thanks to Jeff Janes' great work on stress testing [4].
With these fixes, I have been unable to reproduce any problem with
this patch with the test suite, even after many days of running the
script on a quad-core server, with constant concurrent VACUUM runs,
etc. I think that we still need to think about the issues that
transpired with exclusion constraints, but since I couldn't find
another problem with an adapted version of Jeff's tool that tested
exclusion constraints, I'm inclined to think that it should be
possible to support exclusion constraints for the IGNORE variant.
It would be great to have more input on stress testing from Jeff.
Thoughts?
[1]: https://wiki.postgresql.org/wiki/Value_locking#.231._Heavyweight_page_locking_.28Peter_Geoghegan.29
[2]: /messages/by-id/20150109214041.GK3062@tamriel.snowman.net
[3]: http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/on-conflict-docs/sql-createpolicy.html
[4]: https://github.com/petergeoghegan/jjanes_upsert
--
Peter Geoghegan
Attachments:
0008-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch (text/x-patch; charset=US-ASCII)
From 57cecef6abf7e9f7162696ec3b8847809a85e9b9 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 26 Sep 2014 20:59:04 -0700
Subject: [PATCH 8/8] User-visible documentation for INSERT ... ON CONFLICT
{UPDATE | IGNORE}
INSERT ... ON CONFLICT {UPDATE | IGNORE} is documented as a new clause
of the INSERT command. Some potentially surprising interactions with
triggers are noted -- BEFORE INSERT per-row triggers must fire without
the INSERT path necessarily being taken, for example.
All the existing features that INSERT ... ON CONFLICT {UPDATE | IGNORE}
interacts with have these interactions noted. This includes
postgres_fdw, updatable views, table inheritance, RLS and partial unique
indexes.
Finally, a user-level description of the new "MVCC violation" that the
ON CONFLICT UPDATE variant sometimes requires has been added to "Chapter
13 - Concurrency Control", beside existing commentary on READ COMMITTED
mode's special handling of concurrent updates. The new "MVCC violation"
introduced seems somewhat distinct from the existing one (i.e. READ
COMMITTED's handling of when an UPDATE affects a concurrently
updated/deleted tuple, which internally uses a mechanism called
EvalPlanQual()), because in READ COMMITTED mode it is no longer
necessary for any row version to be conventionally visible to the
command's MVCC snapshot for an UPDATE of the row to occur (or for the
row to be locked, should the UPDATE's WHERE clause not be satisfied).
---
doc/src/sgml/ddl.sgml | 23 +++
doc/src/sgml/fdwhandler.sgml | 8 +
doc/src/sgml/keywords.sgml | 7 +
doc/src/sgml/mvcc.sgml | 24 +++
doc/src/sgml/plpgsql.sgml | 14 +-
doc/src/sgml/postgres-fdw.sgml | 8 +
doc/src/sgml/protocol.sgml | 13 +-
doc/src/sgml/ref/alter_policy.sgml | 7 +-
doc/src/sgml/ref/create_policy.sgml | 50 +++--
doc/src/sgml/ref/create_rule.sgml | 6 +-
doc/src/sgml/ref/create_table.sgml | 5 +-
doc/src/sgml/ref/create_trigger.sgml | 5 +-
doc/src/sgml/ref/create_view.sgml | 33 ++-
doc/src/sgml/ref/insert.sgml | 373 ++++++++++++++++++++++++++++++++--
doc/src/sgml/ref/set_constraints.sgml | 6 +-
doc/src/sgml/trigger.sgml | 49 ++++-
16 files changed, 573 insertions(+), 58 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 570a003..7b43a10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2428,9 +2428,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
+ There is limited inheritance support for <command>INSERT</command>
+ commands with <literal>ON CONFLICT</> clauses. Tables with
+ children are not generally accepted as targets. One notable
+ exception is that such tables are accepted as targets for
+ <command>INSERT</command> commands with <literal>ON CONFLICT
+ IGNORE</> clauses, provided a unique index inference clause was
+ omitted (which implies that there is no concern about
+ <emphasis>which</> unique index any would-be conflict might arise
+ from). However, tables that happen to be inheritance children are
+ accepted as targets for all variants of <command>INSERT</command>
+ with <literal>ON CONFLICT</>.
+ </para>
+
+ <para>
All check constraints and not-null constraints on a parent table are
automatically inherited by its children. Other types of constraints
(unique, primary key, and foreign key constraints) are not inherited.
+ Therefore, <command>INSERT</command> with <literal>ON CONFLICT</>
+ unique index inference considers only unique constraints/indexes
+ directly associated with the child
+ table.
</para>
<para>
@@ -2515,6 +2533,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
+ <literal>INSERT</literal> with an <literal>ON CONFLICT
+ UPDATE</literal> clause does not support the
+ <literal>ONLY</literal> notation, and so in effect tables with
+ inheritance children are not supported for the <literal>ON
+ CONFLICT UPDATE</literal> variant.
Commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
typically only work on individual, physical tables and do not
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..0c3dcb5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,14 @@ GetForeignServerByName(const char *name, bool missing_ok);
source provides.
</para>
+ <para>
+ <command>INSERT</> with an <literal>ON CONFLICT</> clause is not supported
+ with a unique index inference specification (this implies that <literal>ON
+ CONFLICT UPDATE</> is never supported, since the specification is
+ mandatory there). When planning an <command>INSERT</>,
+ <function>PlanForeignModify</> should reject these cases.
+ </para>
+
</sect1>
</chapter>
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index b0dfd5f..ea58211 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -854,6 +854,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>CONFLICT</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>CONNECT</token></entry>
<entry></entry>
<entry>reserved</entry>
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0d6867..5e310d7 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -326,6 +326,30 @@
</para>
<para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</> clause is
+ another special case. In Read Committed mode, the implementation will
+ either insert or update each row proposed for insertion, with either one of
+ those two outcomes guaranteed. This is a useful guarantee for many
+ use-cases, but it implies that further liberties must be taken with
+ snapshot isolation. Should a conflict originate in another transaction
+ whose effects are not visible to the <command>INSERT</command>, the
+ <command>UPDATE</command> may affect that row, even though it may be the
+ case that <emphasis>no</> version of that row is conventionally visible to
+ the command. In the same vein, if the secondary search condition of the
+ command (an explicit <literal>WHERE</> clause) is supplied, it is only
+ evaluated on the most recent row version, which is not necessarily the
+ version conventionally visible to the command (if indeed there is a row
+ version conventionally visible to the command at all).
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT IGNORE</> clause may
+ have insertion not proceed for a row due to the outcome of another
+ transaction whose effects are not visible to the <command>INSERT</command>
+ snapshot. Again, this is only the case in Read Committed mode.
+ </para>
+
+ <para>
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands on the same rows it is trying to update, but it
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 69a0885..59a5945 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2607,7 +2607,11 @@ END;
<para>
This example uses exception handling to perform either
- <command>UPDATE</> or <command>INSERT</>, as appropriate:
+ <command>UPDATE</> or <command>INSERT</>, as appropriate. It is
+ recommended that applications use <command>INSERT</> with
+ <literal>ON CONFLICT UPDATE</> rather than actually emulating this
+ pattern. This example serves only to illustrate use of
+ <application>PL/pgSQL</application> control flow structures:
<programlisting>
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
@@ -3771,9 +3775,11 @@ RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
<command>INSERT</> and <command>UPDATE</> operations, the return value
should be <varname>NEW</>, which the trigger function may modify to
support <command>INSERT RETURNING</> and <command>UPDATE RETURNING</>
- (this will also affect the row value passed to any subsequent triggers).
- For <command>DELETE</> operations, the return value should be
- <varname>OLD</>.
+ (this will also affect the row value passed to any subsequent triggers,
+ or passed to a special <varname>EXCLUDED</> alias reference within
+ an <command>INSERT</> statement with an <literal>ON CONFLICT UPDATE</>
+ clause). For <command>DELETE</> operations, the return
+ value should be <varname>OLD</>.
</para>
<para>
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fa39661 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -69,6 +69,14 @@
</para>
<para>
+ Note that <filename>postgres_fdw</> currently lacks support for
+ <command>INSERT</command> statements with an <literal>ON CONFLICT
+ UPDATE</> clause. However, the <literal>ON CONFLICT IGNORE</>
+ clause is supported, provided a unique index inference specification
+ is omitted.
+ </para>
+
+ <para>
It is generally recommended that the columns of a foreign table be declared
with exactly the same data types, and collations if applicable, as the
referenced columns of the remote table. Although <filename>postgres_fdw</>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..a198182 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2998,9 +2998,16 @@ CommandComplete (B)
<literal>INSERT <replaceable>oid</replaceable>
<replaceable>rows</replaceable></literal>, where
<replaceable>rows</replaceable> is the number of rows
- inserted. <replaceable>oid</replaceable> is the object ID
- of the inserted row if <replaceable>rows</replaceable> is 1
- and the target table has OIDs;
+ inserted. However, if and only if <literal>ON CONFLICT
+ UPDATE</> is specified, then the tag is <literal>UPSERT
+ <replaceable>oid</replaceable>
+ <replaceable>rows</replaceable></literal>, where
+ <replaceable>rows</replaceable> is the number of rows inserted
+ <emphasis>or updated</emphasis>.
+ <replaceable>oid</replaceable> is the object ID of the
+ inserted row if <replaceable>rows</replaceable> is 1 and the
+ target table has OIDs, and (for the <literal>UPSERT</literal>
+ tag), the row was actually inserted rather than updated;
otherwise <replaceable>oid</replaceable> is 0.
</para>
diff --git a/doc/src/sgml/ref/alter_policy.sgml b/doc/src/sgml/ref/alter_policy.sgml
index 796035e..86bda92 100644
--- a/doc/src/sgml/ref/alter_policy.sgml
+++ b/doc/src/sgml/ref/alter_policy.sgml
@@ -93,8 +93,11 @@ ALTER POLICY <replaceable class="parameter">name</replaceable> ON <replaceable c
The USING expression for the policy. This expression will be added as a
security-barrier qualification to queries which use the table
automatically. If multiple policies are being applied for a given
- table then they are all combined and added using OR. The USING
- expression applies to records which are being retrieved from the table.
+ table then they are all combined and added using OR (except as noted in
+ the <xref linkend="sql-createpolicy"> documentation for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</literal>).
+ The USING expression applies to records which are being retrieved from the
+ table.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 8ef8556..fcfcb02 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -63,7 +63,8 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
Policies can be applied for specific commands or for specific roles. The
default for newly created policies is that they apply for all commands and
roles, unless otherwise specified. If multiple policies apply to a given
- query, they will be combined using OR.
+ query, they will be combined using OR (except as noted for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</literal>).
</para>
<para>
@@ -237,6 +238,19 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
as it only ever applies in cases where records are being added to the
relation.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>INSERT</literal> policy WITH
+ CHECK expression also passes for both any existing tuple in the target
+ table that necessitates that the <literal>UPDATE</literal> path be
+ taken, and the final tuple added back into the relation.
+ <literal>INSERT</literal> policies are separately combined using
+ <literal>OR</literal>, and this distinct set of policy expressions must
+ always pass, regardless of whether any or all <literal>UPDATE</literal>
+ policies also pass (in the same tuple check). However, successfully
+ inserted tuples are not subject to <literal>UPDATE</literal> policy
+ enforcement.
+ </para>
</listitem>
</varlistentry>
@@ -245,18 +259,28 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
<listitem>
<para>
Using <literal>UPDATE</literal> for a policy means that it will apply
- to <literal>UPDATE</literal> commands. As <literal>UPDATE</literal>
- involves pulling an existing record and then making changes to some
- portion (but possibly not all) of the record, the
- <literal>UPDATE</literal> policy accepts both a USING expression and
- a WITH CHECK expression. The USING expression will be used to
- determine which records the <literal>UPDATE</literal> command will
- see to operate against, while the <literal>WITH CHECK</literal>
- expression defines what rows are allowed to be added back into the
- relation (similar to the <literal>INSERT</literal> policy).
- Any rows whose resulting values do not pass the
- <literal>WITH CHECK</literal> expression will cause an ERROR and the
- entire command will be aborted.
+ to <literal>UPDATE</literal> commands (or auxiliary <literal>ON
+ CONFLICT UPDATE</literal> clauses of <literal>INSERT</literal>
+ commands). As <literal>UPDATE</literal> involves pulling an existing
+ record and then making changes to some portion (but possibly not all)
+ of the record, the <literal>UPDATE</literal> policy accepts both a
+ USING expression and a WITH CHECK expression. The USING expression
+ will be used to determine which records the <literal>UPDATE</literal>
+ command will see to operate against, while the <literal>WITH
+ CHECK</literal> expression defines what rows are allowed to be added
+ back into the relation (similar to the <literal>INSERT</literal>
+ policy). Any rows whose resulting values do not pass the <literal>WITH
+ CHECK</literal> expression will cause an ERROR and the entire command
+ will be aborted.
+ </para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>UPDATE</literal> policy
+ USING expression always be treated as a WITH CHECK
+ expression. This <literal>UPDATE</literal> policy must
+ always pass, regardless of whether any
+ <literal>INSERT</literal> policy also passes in the same
+ tuple check.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml
index 677766a..9b5c740 100644
--- a/doc/src/sgml/ref/create_rule.sgml
+++ b/doc/src/sgml/ref/create_rule.sgml
@@ -136,7 +136,11 @@ CREATE [ OR REPLACE ] RULE <replaceable class="parameter">name</replaceable> AS
<para>
The event is one of <literal>SELECT</literal>,
<literal>INSERT</literal>, <literal>UPDATE</literal>, or
- <literal>DELETE</literal>.
+ <literal>DELETE</literal>. Note that an
+ <command>INSERT</command> containing an <literal>ON
+ CONFLICT</literal> clause is unsupported. Consider using an
+ updatable view instead, which has limited support for
+ <literal>ON CONFLICT IGNORE</literal> only.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 299cce8..a9c1124 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -708,7 +708,10 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<literal>EXCLUDE</>, and
<literal>REFERENCES</> (foreign key) constraints accept this
clause. <literal>NOT NULL</> and <literal>CHECK</> constraints are not
- deferrable.
+ deferrable. Note that constraints that were created with this
+ clause cannot be used as arbiters of whether or not to take the
+ alternative path with an <command>INSERT</command> statement
+ that includes an <literal>ON CONFLICT UPDATE</> clause.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index 29b815c..26a0986 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -76,7 +76,10 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
executes once for any given operation, regardless of how many rows
it modifies (in particular, an operation that modifies zero rows
will still result in the execution of any applicable <literal>FOR
- EACH STATEMENT</literal> triggers).
+ EACH STATEMENT</literal> triggers). Note that since
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is considered an <command>INSERT</command> statement, no
+ <command>UPDATE</command> statement level trigger will be fired.
</para>
<para>
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 5dadab1..599c1cb 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -286,8 +286,9 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
<para>
Simple views are automatically updatable: the system will allow
<command>INSERT</>, <command>UPDATE</> and <command>DELETE</> statements
- to be used on the view in the same way as on a regular table. A view is
- automatically updatable if it satisfies all of the following conditions:
+ to be used on the view in the same way as on a regular table (aside from
+ the limitations on ON CONFLICT noted below). A view is automatically
+ updatable if it satisfies all of the following conditions:
<itemizedlist>
<listitem>
@@ -383,6 +384,34 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
not need any permissions on the underlying base relations (see
<xref linkend="rules-privileges">).
</para>
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT</> clause
+ is only supported on updatable views under specific circumstances.
+ If a set of columns/expressions has been provided with which to
+ infer a unique index to consider as the arbiter of whether the
+ statement ultimately takes an alternative path - if a would-be
+ duplicate violation in some particular unique index is tacitly
+ taken as provoking an alternative <command>UPDATE</command> or
+ <literal>IGNORE</> path - then updatable views are not supported.
+ Since this specification is already mandatory for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</>,
+ this implies that only the <literal>ON CONFLICT IGNORE</> variant
+ is supported, and only when there is no such specification. For
+ example:
+ </para>
+ <para>
+<programlisting>
+-- Unsupported:
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'foo') ON CONFLICT (key)
+ UPDATE SET val = EXCLUDED.val;
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'bar') ON CONFLICT (key)
+ IGNORE;
+
+-- Supported (note the omission of "key" column):
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'baz') ON CONFLICT
+ IGNORE;
+</programlisting>
+ </para>
</refsect2>
</refsect1>
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a3cccb9..40b7566 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -24,6 +24,14 @@ PostgreSQL documentation
[ WITH [ RECURSIVE ] <replaceable class="parameter">with_query</replaceable> [, ...] ]
INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) [, ...] | <replaceable class="PARAMETER">query</replaceable> }
+ [ ON CONFLICT [ ( { <replaceable class="parameter">column_name_index</replaceable> | ( <replaceable class="parameter">expression_index</replaceable> ) } [, ...] [ WHERE <replaceable class="PARAMETER">index_condition</replaceable> ] ) ]
+ { IGNORE | UPDATE
+ SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } |
+ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] )
+ } [, ...]
+ [ WHERE <replaceable class="PARAMETER">condition</replaceable> ]
+ }
+ ]
[ RETURNING * | <replaceable class="parameter">output_expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
</synopsis>
</refsynopsisdiv>
@@ -32,9 +40,15 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<title>Description</title>
<para>
- <command>INSERT</command> inserts new rows into a table.
- One can insert one or more rows specified by value expressions,
- or zero or more rows resulting from a query.
+ <command>INSERT</command> inserts new rows into a table. One can
+ insert one or more rows specified by value expressions, or zero or
+ more rows resulting from a query. An alternative path
+ (<literal>IGNORE</literal> or <literal>UPDATE</literal>) can
+ optionally be specified, to be taken in the event of detecting that
+ proceeding with insertion would result in a conflict (i.e. a
+ conflicting tuple already exists). The alternative path is
+ considered individually for each row proposed for insertion, and is
+ taken (or not taken) once per row.
</para>
<para>
@@ -59,25 +73,214 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</para>
<para>
+ The optional <literal>ON CONFLICT</> clause specifies a path to
+ take as an alternative to raising a conflict related error.
+ <literal>ON CONFLICT IGNORE</> simply avoids inserting any
+ individual row when it is determined that a conflict related error
+ would otherwise need to be raised. <literal>ON CONFLICT UPDATE</>
+ has the system take an <command>UPDATE</command> path in respect of
+ such rows instead. <literal>ON CONFLICT UPDATE</> guarantees an
+ atomic <command>INSERT</command> or <command>UPDATE</command>
+ outcome - provided there is no incidental error, one of those two
+ outcomes is guaranteed, even under high concurrency.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> optionally accepts a
+ <literal>WHERE</> clause <replaceable>condition</>. When provided,
+ the statement only proceeds with updating if the
+ <replaceable>condition</> is satisfied. Otherwise, unlike a
+ conventional <command>UPDATE</command>, the row is still locked for
+ update. Note that the <replaceable>condition</> is evaluated last,
+ after a conflict has been identified as a candidate to update.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> is effectively an auxiliary query of
+ its parent <command>INSERT</command>. Two aliases are visible to
+ the auxiliary query only - <varname>TARGET</> and
+ <varname>EXCLUDED</>. The first alias is just a standard alias for
+ the target relation in the context of the auxiliary query, while
+ the second alias refers to rows originally proposed for insertion.
+ Both aliases can be used in the auxiliary query targetlist and
+ <literal>WHERE</> clause. This allows expressions (in particular,
+ assignments) to reference rows originally proposed for insertion.
+ Note that the effects of all per-row <literal>BEFORE INSERT</>
+ triggers are carried forward. This is particularly useful for
+ multi-insert <literal>ON CONFLICT UPDATE</> statements; when
+ inserting or updating multiple rows, constants or parameter values
+ need only appear once.
+ </para>
+
+ <para>
+ There are several restrictions on the <literal>ON CONFLICT
+ UPDATE</> clause that do not apply to <command>UPDATE</command>
+ statements. Subqueries may not appear in either the
+ <command>UPDATE</command> targetlist or its <literal>WHERE</>
+ clause (although simple multi-assignment expressions are
+ supported). <literal>WHERE CURRENT OF</> cannot be used. In
+ general, only columns in the target table, and excluded values
+ originally proposed for insertion may be referenced. Operators and
+ functions may be used freely, though.
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is a <quote>deterministic</quote> statement. This means
+ that the command will not be allowed to affect any single existing
+ row more than once; a cardinality violation error will be raised
+ when this situation arises. Rows proposed for insertion should not
+ duplicate each other in terms of attributes constrained by the
+ conflict-arbitrating unique index. Note that the ordinary rules
+ for unique indexes with regard to null apply analogously to whether
+ or not an arbitrating unique index indicates if the alternative
+ path should be taken. This means that when a null value appears in
+ any uniquely constrained tuple's attribute in an
+ <command>INSERT</command> statement with <literal>ON CONFLICT
+ UPDATE</literal>, rows proposed for insertion will never take the
+ alternative path (provided that a <literal>BEFORE ROW
+ INSERT</literal> trigger does not make null values non-null before
+ insertion); the statement will always insert, assuming there is no
+ unrelated error. Note that merely locking a row (by having it not
+ satisfy the <literal>WHERE</> clause <replaceable>condition</>)
+ does not count towards whether or not the row has been affected
+ multiple times (and whether or not a cardinality violation error is
+ raised). However, the implementation checks for cardinality
+ violations after locking the row, and before updating (or
+ considering updating), so a cardinality violation may be raised
+ despite the fact that the row would not otherwise have gone on to
+ be updated if and only if the existing row was updated by the
+ <literal>ON CONFLICT UPDATE</literal> command at least once
+ already.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> requires a <emphasis>unique index
+ inference</emphasis> specification, which consists of one or more
+ <replaceable class="PARAMETER">column_name_index</replaceable>
+ columns and/or <replaceable
+ class="PARAMETER">expression_index</replaceable> expressions on
+ columns, appearing between parentheses. These are used to infer a
+ single unique index to limit pre-checking for conflicts to (if no
+ appropriate index is available, an error is raised). A subset of
+ the table to limit the check for conflicts to can optionally also
+ be specified using <replaceable
+ class="PARAMETER">index_condition</replaceable>. Note that any
+ available unique index need only cover at least that subset in
+ order to arbitrate taking the alternative path; it need not
+ match exactly, and so a non-partial unique index that otherwise
+ matches is applicable. <literal>ON CONFLICT IGNORE</> makes an
+ inference specification optional; omitting the specification
+ indicates a total indifference to where any conflict could occur,
+ which isn't always appropriate. At times, it may be desirable for
+ <literal>ON CONFLICT IGNORE</> to <emphasis>not</emphasis> suppress
+ a conflict related error associated with an index where that isn't
+ explicitly anticipated. Note that <literal>ON CONFLICT UPDATE</>
+ assignment may result in a uniqueness violation, just as with a
+ conventional <command>UPDATE</command>.
+ </para>
+
+ <para>
+ Columns and/or expressions appearing in a unique index inference
+ specification must match all the columns/expressions of some
+ existing unique index on <replaceable
+ class="PARAMETER">table_name</replaceable> - there can be no
+ columns/expressions from the unique index that do not appear in the
+ inference specification, nor can there be any columns/expressions
+ appearing in the inference specification that do not appear in the
+ unique index definition. However, the order of the
+ columns/expressions in the index definition, or whether or not the
+ index definition specified <literal>NULLS FIRST</> or
+ <literal>NULLS LAST</>, or the internal sort order of each column
+ (whether <literal>DESC</> or <literal>ASC</> were specified) are
+ all irrelevant. Deferred unique constraints are not supported as
+ arbiters of whether an alternative <literal>ON CONFLICT</> path
+ should be taken.
+ </para>
+
+ <para>
+ The definition of a conflict for the purposes of <literal>ON
+ CONFLICT</> is somewhat subtle, although the exact definition is
+ seldom of great interest. A conflict is either a unique violation
+ from a unique constraint (or unique index), or an exclusion
+ violation from an exclusion constraint. Only unique indexes can be
+ inferred with a unique index inference specification, which is
+ required for the <command>UPDATE</command> variant, so in effect
+ only unique constraints (and unique indexes) are supported by the
+ <command>UPDATE</command> variant. In contrast to the rules around
+ certain other SQL clauses, like the <literal>DISTINCT</literal>
+ clause, the definition of a duplicate (a conflict) is based on
+ whatever unique indexes happen to be defined on columns on the
+ table. This means that if a user-defined type has multiple sort
+ orders, and the "equals" operator of any of those available sort
+ orders happens to be inconsistent (which goes against an unenforced
+ convention of <productname>PostgreSQL</productname>), the exact
+ behavior depends on the choice of operator class when the unique
+ index was created initially, and not any other consideration such
+ as the default operator class for the type of each indexed column.
+ If there are multiple unique indexes available that seem like
+ equally suitable candidates, but with inconsistent definitions of
+ "equals", then the system chooses whatever it estimates to be the
+ cheapest one to use as an arbiter of taking the alternative
+ <command>UPDATE</command>/<literal>IGNORE</literal> path.
+ </para>
+
+ <para>
+ The optional <replaceable
+ class="PARAMETER">index_condition</replaceable> can be used to
+ allow the inference specification to infer that a partial unique
+ index can be used. Any unique index that otherwise satisfies the
+ inference specification, while also covering at least all the rows
+ in the table covered by <replaceable
+ class="PARAMETER">index_condition</replaceable> may be used. It is
+ recommended that the partial index predicate of the unique index
+ intended to be used as the arbiter of taking the alternative path
+ be matched exactly, but this is not required. Note that an error
+ will be raised if an arbiter unique index is chosen that does not
+ cover the tuple or tuples ultimately proposed for insertion.
+ However, an overly specific <replaceable
+ class="PARAMETER">index_condition</replaceable> does not imply that
+ arbitrating conflicts will be limited to the subset of rows covered
+ by the inferred unique index corresponding to <replaceable
+ class="PARAMETER">index_condition</replaceable>.
+ </para>
+
+ <para>
The optional <literal>RETURNING</> clause causes <command>INSERT</>
- to compute and return value(s) based on each row actually inserted.
+ to compute and return value(s) based on each row actually inserted
+ (or updated, if an <literal>ON CONFLICT UPDATE</> clause was used).
This is primarily useful for obtaining values that were supplied by
defaults, such as a serial sequence number. However, any expression
using the table's columns is allowed. The syntax of the
<literal>RETURNING</> list is identical to that of the output list
- of <command>SELECT</>.
+ of <command>SELECT</>. Only rows that were successfully inserted
+ or updated will be returned. If a row was locked but not updated
+ because an <literal>ON CONFLICT UPDATE</> <literal>WHERE</> clause
+ did not pass, the row will not be returned. Since
+ <literal>RETURNING</> is not part of the <command>UPDATE</>
+ auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
+ (<varname>TARGET</> and <varname>EXCLUDED</>) may not be
+ referenced; only the row as it exists after updating (or
+ inserting) is returned.
</para>
<para>
You must have <literal>INSERT</literal> privilege on a table in
- order to insert into it. If a column list is specified, you only
- need <literal>INSERT</literal> privilege on the listed columns.
- Use of the <literal>RETURNING</> clause requires <literal>SELECT</>
- privilege on all columns mentioned in <literal>RETURNING</>.
- If you use the <replaceable
- class="PARAMETER">query</replaceable> clause to insert rows from a
- query, you of course need to have <literal>SELECT</literal> privilege on
- any table or column used in the query.
+ order to insert into it, as well as <literal>UPDATE</literal>
+ privilege if and only if <literal>ON CONFLICT UPDATE</>
+ is specified. If a column list is specified, you only need
+ <literal>INSERT</literal> privilege on the listed columns.
+ Similarly, when <literal>ON CONFLICT UPDATE</> is specified, you
+ only need <literal>UPDATE</> privilege on the column(s) that are
+ listed to be updated, as well as <literal>SELECT</> privilege on any column
+ whose values are read in the <literal>ON CONFLICT UPDATE</>
+ expressions or <replaceable>condition</>. Use of the
+ <literal>RETURNING</> clause requires <literal>SELECT</> privilege
+ on all columns mentioned in <literal>RETURNING</>. If you use the
+ <replaceable class="PARAMETER">query</replaceable> clause to insert
+ rows from a query, you of course need to have
+ <literal>SELECT</literal> privilege on any table or column used in
+ the query.
</para>
</refsect1>
@@ -121,7 +324,54 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
The name of a column in the table named by <replaceable class="PARAMETER">table_name</replaceable>.
The column name can be qualified with a subfield name or array
subscript, if needed. (Inserting into only some fields of a
- composite column leaves the other fields null.)
+ composite column leaves the other fields null.) When
+ referencing a column with <literal>ON CONFLICT UPDATE</>, do not
+ include the table's name in the specification of a target
+ column. For example, <literal>INSERT ... ON CONFLICT UPDATE tab
+ SET TARGET.col = 1</> is invalid.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name_index</replaceable></term>
+ <listitem>
+ <para>
+ The name of a <replaceable
+ class="PARAMETER">table_name</replaceable> column (with several
+ columns potentially named). These are used to infer a
+ particular unique index defined on <replaceable
+ class="PARAMETER">table_name</replaceable>. This requires
+ <literal>ON CONFLICT UPDATE</> and <literal>ON CONFLICT
+ IGNORE</> to assume that all expected sources of uniqueness
+ violations originate within the columns/rows constrained by the
+ unique index. When this is omitted (which is forbidden with
+ the <literal>ON CONFLICT UPDATE</> variant), the system checks
+ for sources of uniqueness violations ahead of time in all unique
+ indexes. Otherwise, only a single specified unique index is
+ checked ahead of time, and uniqueness violation errors can
+ appear for conflicts originating in any other unique index. If
+ a unique index cannot be inferred, an error is raised.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">expression_index</replaceable></term>
+ <listitem>
+ <para>
+ Equivalent to <replaceable
+ class="PARAMETER">column_name_index</replaceable>, but used to
+ infer a particular expression index instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">index_condition</replaceable></term>
+ <listitem>
+ <para>
+ Used to allow inference of partial unique indexes.
</para>
</listitem>
</varlistentry>
@@ -167,12 +417,25 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</varlistentry>
<varlistentry>
+ <term><replaceable class="PARAMETER">condition</replaceable></term>
+ <listitem>
+ <para>
+ An expression that returns a value of type <type>boolean</type>.
+ Only rows for which this expression returns <literal>true</>
+ will be updated, although all rows will be locked when the
+ <literal>ON CONFLICT UPDATE</> path is taken.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+
<term><replaceable class="PARAMETER">output_expression</replaceable></term>
<listitem>
<para>
An expression to be computed and returned by the <command>INSERT</>
- command after each row is inserted. The expression can use any
- column names of the table named by <replaceable class="PARAMETER">table_name</replaceable>.
+ command after each row is inserted or updated. The
+ expression can use any column names of the table named by
+ <replaceable class="PARAMETER">table_name</replaceable>.
Write <literal>*</> to return all columns of the inserted row(s).
</para>
</listitem>
@@ -198,20 +461,29 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<screen>
INSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
</screen>
+ However, in the event of an <literal>ON CONFLICT UPDATE</> clause
+ (but <emphasis>not</emphasis> in the event of an <literal>ON
+ CONFLICT IGNORE</> clause), the command tag reports the number of
+ rows inserted or updated together, of the form
+<screen>
+UPSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
+</screen>
The <replaceable class="parameter">count</replaceable> is the number
of rows inserted. If <replaceable class="parameter">count</replaceable>
is exactly one, and the target table has OIDs, then
<replaceable class="parameter">oid</replaceable> is the
- <acronym>OID</acronym> assigned to the inserted row. Otherwise
- <replaceable class="parameter">oid</replaceable> is zero.
+ <acronym>OID</acronym>
+ assigned to the inserted row (but not if there is only a single
+ updated row). Otherwise <replaceable
+ class="parameter">oid</replaceable> is zero.
</para>
<para>
If the <command>INSERT</> command contains a <literal>RETURNING</>
clause, the result will be similar to that of a <command>SELECT</>
statement containing the columns and values defined in the
- <literal>RETURNING</> list, computed over the row(s) inserted by the
- command.
+ <literal>RETURNING</> list, computed over the row(s) inserted or
+ updated by the command.
</para>
</refsect1>
@@ -311,7 +583,63 @@ WITH upd AS (
RETURNING *
)
INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
-</programlisting></para>
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Assumes a unique
+ index has been defined that constrains values appearing in the
+ <literal>did</literal> column. Note that an <varname>EXCLUDED</>
+ expression is used to reference values originally proposed for
+ insertion:
+<programlisting>
+ INSERT INTO distributors (did, dname)
+ VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc')
+ ON CONFLICT (did) UPDATE SET dname = EXCLUDED.dname
+</programlisting>
+ </para>
+ <para>
+ Insert a distributor, or do nothing for rows proposed for insertion
+ when an existing, excluded row (a row with a matching constrained
+ column or columns, after <literal>BEFORE</> row insert triggers fire) exists.
+ Example assumes a unique index has been defined that constrains
+ values appearing in the <literal>did</literal> column (although
+ since the <literal>IGNORE</> variant was used, the specification of
+ columns to infer a unique index from is not mandatory):
+<programlisting>
+ INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH')
+ ON CONFLICT (did) IGNORE
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Example assumes
+ a unique index has been defined that constrains values appearing in
+ the <literal>did</literal> column. A <literal>WHERE</> clause is
+ used to limit the rows actually updated (any existing row not
+ updated will still be locked, though):
+<programlisting>
+ -- Don't update existing distributors based in a certain ZIP code
+ INSERT INTO distributors (did, dname) VALUES (8, 'Anvil Distribution')
+ ON CONFLICT (did) UPDATE
+ SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'
+ WHERE TARGET.zipcode != '21201'
+</programlisting>
+ </para>
+ <para>
+ Insert new distributor if possible; otherwise
+ <literal>IGNORE</literal>. Example assumes a unique index has been
+ defined that constrains values appearing in the
+ <literal>did</literal> column on a subset of rows where the
+ <literal>is_active</literal> boolean column evaluates to
+ <literal>true</literal>:
+<programlisting>
+ -- This statement could infer a partial unique index on did
+ -- with a predicate of WHERE is_active, but it could also
+ -- just use a regular unique constraint on did if that was
+ -- all that was available.
+ INSERT INTO distributors (did, dname) VALUES (9, 'Antwerp Design')
+ ON CONFLICT (did WHERE is_active) IGNORE
+</programlisting>
+ </para>
</refsect1>
<refsect1>
@@ -321,7 +649,8 @@ INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
<command>INSERT</command> conforms to the SQL standard, except that
the <literal>RETURNING</> clause is a
<productname>PostgreSQL</productname> extension, as is the ability
- to use <literal>WITH</> with <command>INSERT</>.
+ to use <literal>WITH</> with <command>INSERT</>, and the ability to
+ specify an alternative path with <literal>ON CONFLICT</>.
Also, the case in
which a column name list is omitted, but not all the columns are
filled from the <literal>VALUES</> clause or <replaceable>query</>,
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
index 7c31871..1e0a2f8 100644
--- a/doc/src/sgml/ref/set_constraints.sgml
+++ b/doc/src/sgml/ref/set_constraints.sgml
@@ -69,7 +69,11 @@ SET CONSTRAINTS { ALL | <replaceable class="parameter">name</replaceable> [, ...
<para>
Currently, only <literal>UNIQUE</>, <literal>PRIMARY KEY</>,
<literal>REFERENCES</> (foreign key), and <literal>EXCLUDE</>
- constraints are affected by this setting.
+ constraints are affected by this setting. Note that constraints
+ that were created with this clause cannot be used as arbiters of
+ whether or not to take the alternative path with an
+ <command>INSERT</command> statement that includes an <literal>ON
+ CONFLICT UPDATE</> clause.
<literal>NOT NULL</> and <literal>CHECK</> constraints are
always checked immediately when a row is inserted or modified
(<emphasis>not</> at the end of the statement).
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f94aea1..5141690 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -40,14 +40,17 @@
On tables and foreign tables, triggers can be defined to execute either
before or after any <command>INSERT</command>, <command>UPDATE</command>,
or <command>DELETE</command> operation, either once per modified row,
- or once per <acronym>SQL</acronym> statement.
- <command>UPDATE</command> triggers can moreover be set to fire only if
- certain columns are mentioned in the <literal>SET</literal> clause of the
- <command>UPDATE</command> statement.
- Triggers can also fire for <command>TRUNCATE</command> statements.
- If a trigger event occurs, the trigger's function is called at the
- appropriate time to handle the event. Foreign tables do not support the
- TRUNCATE statement at all.
+ or once per <acronym>SQL</acronym> statement. If an
+ <command>INSERT</command> contains an <literal>ON CONFLICT UPDATE</>
+ clause, it is possible that the effects of both a BEFORE insert trigger
+ and a BEFORE update trigger are applied to the same row, if a reference to
+ an <varname>EXCLUDED</> column appears. <command>UPDATE</command>
+ triggers can moreover be set to fire only if certain columns are
+ mentioned in the <literal>SET</literal> clause of the
+ <command>UPDATE</command> statement. Triggers can also fire for
+ <command>TRUNCATE</command> statements. If a trigger event occurs,
+ the trigger's function is called at the appropriate time to handle the
+ event. Foreign tables do not support the TRUNCATE statement at all.
</para>
<para>
@@ -119,6 +122,36 @@
</para>
<para>
+ If an <command>INSERT</command> contains an <literal>ON CONFLICT
+ UPDATE</> clause, it is possible that the effects of all row-level
+ <literal>BEFORE</> <command>INSERT</command> triggers and all
+ row-level BEFORE <command>UPDATE</command> triggers can both be
+ applied in a way that is apparent from the final state of the updated
+ row, if an <varname>EXCLUDED</> column is referenced. There need not
+ be an <varname>EXCLUDED</> column reference for both sets of BEFORE
+ row-level triggers to execute, though. The possibility of surprising
+ outcomes should be considered when there are both <literal>BEFORE</>
+ <command>INSERT</command> and <literal>BEFORE</>
+ <command>UPDATE</command> row-level triggers that both affect a row
+ being inserted/updated (this can still be problematic when the
+ modifications are more or less equivalent, if they are not also
+ idempotent). Note that statement-level <command>UPDATE</command>
+ triggers are executed when <literal>ON CONFLICT UPDATE</> is
+ specified, regardless of whether or not any rows were affected by
+ the <command>UPDATE</command>. An <command>INSERT</command> with
+ an <literal>ON CONFLICT UPDATE</> clause will execute
+ statement-level <literal>BEFORE</> <command>INSERT</command>
+ triggers first, then statement-level <literal>BEFORE</>
+ <command>UPDATE</command> triggers, followed by statement-level
+ <literal>AFTER</> <command>UPDATE</command> triggers and finally
+ statement-level <literal>AFTER</> <command>INSERT</command>
+ triggers. <literal>ON CONFLICT UPDATE</> is not supported on
+ views (only <literal>ON CONFLICT IGNORE</> is supported on
+ updatable views); therefore, unpredictable interactions with
+ <literal>INSTEAD OF</> triggers are not possible.
+ </para>
+
+ <para>
Trigger functions invoked by per-statement triggers should always
return <symbol>NULL</symbol>. Trigger functions invoked by per-row
triggers can return a table row (a value of
--
1.9.1
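
To illustrate the unique index inference rule documented in the insert.sgml changes above, here is a toy Python model. It is only a sketch and not PostgreSQL code: the `UniqueIndex` class and `candidate_arbiters` function are hypothetical names invented for this illustration. It assumes that the inferred columns/expressions must equal the index's key set exactly, that key order is irrelevant, that deferred constraints never qualify, and (as a deliberate simplification) that a partial index qualifies only when its predicate text equals the given `index_condition`, whereas the documentation only requires that the index cover at least that subset of rows.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class UniqueIndex:
    """Hypothetical stand-in for a unique index definition."""
    name: str
    keys: Sequence[str]              # columns/expressions, in index order
    predicate: Optional[str] = None  # partial index predicate; None if non-partial
    deferred: bool = False           # deferred constraints cannot arbitrate


def candidate_arbiters(indexes: Sequence[UniqueIndex],
                       inferred_keys: Sequence[str],
                       index_condition: Optional[str] = None) -> List[str]:
    """Return names of indexes that the inference specification could match."""
    wanted = frozenset(inferred_keys)
    matches = []
    for idx in indexes:
        if idx.deferred:
            continue
        # Every key must appear on both sides; order does not matter.
        if frozenset(idx.keys) != wanted:
            continue
        # A non-partial index covers every row, so it always qualifies.
        # Simplification: a partial index must match the condition textually.
        if idx.predicate is not None and idx.predicate != index_condition:
            continue
        matches.append(idx.name)
    return matches
```

With a two-column index on (did, region), a partial index on did with predicate `is_active`, and a plain index on did, the specification `(region, did)` matches only the first, while `(did WHERE is_active)` matches both of the others. That mirrors the statement that a non-partial unique index that otherwise matches is applicable.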
Attachment: 0007-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch (text/x-patch; charset=US-ASCII)
From 830af00d957e9a92a72cbdd613573cc05bcd0512 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:16:11 -0700
Subject: [PATCH 7/8] Internal documentation for INSERT ... ON CONFLICT {UPDATE
| IGNORE}
Includes documentation for executor README. A high-level handling of
approach #2 to value locking also appears there, since in contrast with
design #1, that is something that lives in the head of the executor.
---
src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 8afa1e3..0c351c5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -200,3 +200,52 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
that only the first result row of an SRF counts, because all subsequent
rows will result in attempts to re-update an already updated target row.
This is historical behavior and seems not worth changing.)
+
+Speculative insertion
+---------------------
+
+Speculative insertion is a process that the executor manages for the
+benefit of INSERT...ON CONFLICT UPDATE... . The basic idea is that
+values within AMs (that do not currently exist) are "speculatively
+locked". If a consensus to insert emerges among all unique indexes,
+we proceed with physical index tuple insertion for each unique index
+in turn, releasing value locks as each physical insertion is
+performed. Otherwise, we must UPDATE the existing value (or IGNORE).
+"Value locks" are implemented using special "speculative heap tuples",
+that represent an attempt to lock values (with special handling for
+race conditions).
+
+"Speculative insertion" is prepared to release "value locks" when a
+conflict occurs. This prevents "unprincipled deadlocks". In essence,
+we cannot allow other xacts to wait on our speculatively-inserted
+tuple as if it was a properly inserted tuple. They'd have to wait
+until xact end, which might be too long, while also implying
+"unprincipled deadlocks". We are prepared for conflicts both when
+"value locking", and when row locking.
+
+When we UPDATE, value locks are released before an opportunistic
+attempt at locking a conclusively visible conflicting tuple occurs. If
+this process fails, we retry. We may retry indefinitely. Failing to
+release value locks serves no practical purpose, since they don't
+prevent many types of conflicts that the UPDATE case must care about,
+and is actively harmful, since it will result in unprincipled
+deadlocking under high concurrency.
+
+The representation of the UPDATE query tree is as a separate query
+tree, auxiliary to the main INSERT query tree, and its plan is not
+formally a subplan of the parent INSERT's. Rather, the plan's state
+is used selectively by its parent.
+
+Having successfully locked a definitively visible tuple, we update it,
+applying the EvalPlanQual() query execution mechanism to the latest
+(as just determined by an amcanunique AM) conclusively visible, now
+locked tuple. Earlier versions are not evaluated against our qual,
+and we never directly walk the update chain in the event of the tuple
+being deleted/updated (which is conceptually a conflict). The process
+simply restarts without making useful progress in the present
+iteration. It is sometimes necessary to UPDATE a row where no row
+version is visible, so it seems inconsistent to require that earlier
+versions (including a version that may exist that is visible to our
+command's MVCC snapshot) must satisfy the qual just because there
+happened to be a version visible, where otherwise no evaluation would
+occur.
--
1.9.1
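
The retry behaviour that the README text above describes can be sketched with a toy, single-process Python model. This is not the executor's code: `speculative_upsert`, `try_lock_row` and `on_conflict` are hypothetical names, the table is a plain dict, and concurrency is simulated entirely by the `try_lock_row` callback, which returns False when the conflicting row was concurrently updated or deleted. The sketch only shows the control flow: insert if there is no conflict, otherwise try to lock the existing row and update it, and restart from scratch if locking fails.

```python
from typing import Callable, Dict, Hashable


def speculative_upsert(table: Dict[Hashable, str],
                       key: Hashable,
                       proposed: str,
                       on_conflict: Callable[[str, str], str],
                       try_lock_row: Callable[[Hashable], bool]) -> str:
    """Toy model of the insert-or-update loop; returns 'inserted' or 'updated'."""
    while True:  # the README notes that we may retry indefinitely
        if key not in table:
            # No conflict was found, so the speculative insertion stands.
            table[key] = proposed
            return "inserted"
        # Conflict: in the real design the value lock is released here,
        # before the opportunistic attempt to lock the existing row.
        if try_lock_row(key):
            table[key] = on_conflict(table[key], proposed)
            return "updated"
        # The row changed underneath us; restart without walking any
        # update chain, making no useful progress in this iteration.
```

If `try_lock_row` fails once because the row was deleted in the meantime, the next iteration finds no conflict and inserts, which is the "simply restarts" case described above.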
Attachment: 0006-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 05f039373613f85856c43f4dbcbd8bb0bb5c632f Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:11:15 -0700
Subject: [PATCH 6/8] Tests for INSERT ... ON CONFLICT {UPDATE | IGNORE}
Add dedicated isolation tests for both UPDATE and IGNORE variants,
illustrating the "MVCC violation" that allows a READ COMMITTED
transaction's UPDATE to succeed in updating a tuple with no version
visible to its command's MVCC snapshot. Add regression tests, which for
the most part are intended to exercise interactions with other features
(e.g. updatable views, inheritance, triggers, RLS).
Add a few general purpose smoke tests too, testing everything from
EXPLAIN output to unique index inference (expression indexes, partial
indexes, etc).
---
contrib/postgres_fdw/expected/postgres_fdw.out | 7 +
contrib/postgres_fdw/sql/postgres_fdw.sql | 3 +
.../isolation/expected/insert-conflict-ignore.out | 23 ++
.../expected/insert-conflict-update-2.out | 23 ++
.../expected/insert-conflict-update-3.out | 26 +++
.../isolation/expected/insert-conflict-update.out | 23 ++
src/test/isolation/isolation_schedule | 4 +
.../isolation/specs/insert-conflict-ignore.spec | 41 ++++
.../isolation/specs/insert-conflict-update-2.spec | 41 ++++
.../isolation/specs/insert-conflict-update-3.spec | 69 ++++++
.../isolation/specs/insert-conflict-update.spec | 40 ++++
src/test/regress/expected/insert_conflict.out | 242 +++++++++++++++++++++
src/test/regress/expected/privileges.out | 7 +-
src/test/regress/expected/rowsecurity.out | 96 ++++++++
src/test/regress/expected/rules.out | 21 ++
src/test/regress/expected/subselect.out | 22 ++
src/test/regress/expected/triggers.out | 102 ++++++++-
src/test/regress/expected/updatable_views.out | 4 +
src/test/regress/expected/update.out | 27 +++
src/test/regress/expected/with.out | 74 +++++++
src/test/regress/input/constraints.source | 5 +
src/test/regress/output/constraints.source | 15 +-
src/test/regress/parallel_schedule | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/insert_conflict.sql | 192 ++++++++++++++++
src/test/regress/sql/privileges.sql | 5 +-
src/test/regress/sql/rowsecurity.sql | 74 +++++++
src/test/regress/sql/rules.sql | 14 ++
src/test/regress/sql/subselect.sql | 14 ++
src/test/regress/sql/triggers.sql | 69 +++++-
src/test/regress/sql/updatable_views.sql | 2 +
src/test/regress/sql/update.sql | 14 ++
src/test/regress/sql/with.sql | 37 ++++
33 files changed, 1330 insertions(+), 8 deletions(-)
create mode 100644 src/test/isolation/expected/insert-conflict-ignore.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-2.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-3.out
create mode 100644 src/test/isolation/expected/insert-conflict-update.out
create mode 100644 src/test/isolation/specs/insert-conflict-ignore.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-2.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-3.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update.spec
create mode 100644 src/test/regress/expected/insert_conflict.out
create mode 100644 src/test/regress/sql/insert_conflict.sql
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 583cce7..5133386 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2327,6 +2327,13 @@ INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
ERROR: duplicate key value violates unique constraint "t1_pkey"
DETAIL: Key ("C 1")=(11) already exists.
CONTEXT: Remote SQL command: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
ERROR: new row for relation "T 1" violates check constraint "c2positive"
DETAIL: Failing row contains (1111, -2, null, null, null, null, ft1 , null).
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83e8fa7..e01d34e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -372,6 +372,9 @@ UPDATE ft2 SET c2 = c2 + 600 WHERE c1 % 10 = 8 AND c1 < 1200 RETURNING *;
ALTER TABLE "S 1"."T 1" ADD CONSTRAINT c2positive CHECK (c2 >= 0);
INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
UPDATE ft1 SET c2 = -c2 WHERE c1 = 1; -- c2positive
diff --git a/src/test/isolation/expected/insert-conflict-ignore.out b/src/test/isolation/expected/insert-conflict-ignore.out
new file mode 100644
index 0000000..e6cc2a1
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-ignore.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: ignore1 ignore2 c1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step c1: COMMIT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore1
+step c2: COMMIT;
+
+starting permutation: ignore1 ignore2 a1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step a1: ABORT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-2.out b/src/test/isolation/expected/insert-conflict-update-2.out
new file mode 100644
index 0000000..6a5ddfe
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-2.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-3.out b/src/test/isolation/expected/insert-conflict-update-3.out
new file mode 100644
index 0000000..29dd8b0
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-3.out
@@ -0,0 +1,26 @@
+Parsed test spec with 2 sessions
+
+starting permutation: update2 insert1 c2 select1surprise c1
+step update2: UPDATE colors SET is_active = true WHERE key = 1;
+step insert1:
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key; <waiting ...>
+step c2: COMMIT;
+step insert1: <... completed>
+key color is_active
+
+1 Red f
+2 Green f
+3 Blue f
+step select1surprise: SELECT * FROM colors ORDER BY key;
+key color is_active
+
+1 Brown t
+2 Green f
+3 Blue f
+step c1: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update.out b/src/test/isolation/expected/insert-conflict-update.out
new file mode 100644
index 0000000..6976124
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index c055a53..50948a2 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -16,6 +16,10 @@ test: fk-deadlock2
test: eval-plan-qual
test: lock-update-delete
test: lock-update-traversal
+test: insert-conflict-ignore
+test: insert-conflict-update
+test: insert-conflict-update-2
+test: insert-conflict-update-3
test: delete-abort-savept
test: delete-abort-savept-2
test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/insert-conflict-ignore.spec b/src/test/isolation/specs/insert-conflict-ignore.spec
new file mode 100644
index 0000000..fde43b3
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-ignore.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT IGNORE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions during INSERT...ON CONFLICT IGNORE.
+#
+# The convention here is that session 1 always ends up inserting, and session 2
+# always ends up ignoring.
+
+setup
+{
+ CREATE TABLE ints (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE ints;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore1" { INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore2" { INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; }
+step "select2" { SELECT * FROM ints; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# Regular case where one session block-waits on another to determine if it
+# should proceed with an insert or ignore.
+permutation "ignore1" "ignore2" "c1" "select2" "c2"
+permutation "ignore1" "ignore2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-2.spec b/src/test/isolation/specs/insert-conflict-update-2.spec
new file mode 100644
index 0000000..3e6e944
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-2.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test shows a plausible scenario in which the user might wish to UPDATE a
+# value that is also constrained by the unique index that is the arbiter of
+# whether the alternative path should be taken.
+
+setup
+{
+ CREATE TABLE upsert (key text not null, payload text);
+ CREATE UNIQUE INDEX ON upsert(lower(key));
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. The user can still usefully UPDATE
+# a column constrained by a unique index, as the example illustrates.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-3.spec b/src/test/isolation/specs/insert-conflict-update-3.spec
new file mode 100644
index 0000000..94ae3df
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-3.spec
@@ -0,0 +1,69 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# Other INSERT...ON CONFLICT UPDATE isolation tests illustrate the "MVCC
+# violation" added to facilitate the feature, whereby a
+# not-visible-to-our-snapshot tuple can be updated by our command all the same.
+# This is generally needed to provide a guarantee of a successful INSERT or
+# UPDATE in READ COMMITTED mode. This MVCC violation is quite distinct from
+# the putative "MVCC violation" that has existed in PostgreSQL for many years,
+# the EvalPlanQual() mechanism, because that mechanism always starts from a
+# tuple that is visible to the command's MVCC snapshot. This test illustrates
+# a slightly distinct user-visible consequence of the same MVCC violation
+# generally associated with INSERT...ON CONFLICT UPDATE. The impact of the
+# MVCC violation goes a little beyond updating MVCC-invisible tuples.
+#
+# With INSERT...ON CONFLICT UPDATE, the UPDATE predicate is only evaluated
+# once, on this conclusively-locked tuple, and not any other version of the
+# same tuple. It is therefore possible (in READ COMMITTED mode) that the
+# predicate "fails to be satisfied" according to the command's MVCC snapshot.
+# It might simply be that there is no row version visible, but it's also
+# possible that there is some row version visible, but only as a version that
+# doesn't satisfy the predicate. If, however, the conclusively-locked version
+# satisfies the predicate, that's good enough, and the tuple is updated. The
+# MVCC-snapshot-visible row version is denied the opportunity to prevent the
+# UPDATE from taking place, because we don't walk the UPDATE chain in the usual
+# way.
+
+setup
+{
+ CREATE TABLE colors (key int4 PRIMARY KEY, color text, is_active boolean);
+ INSERT INTO colors (key, color, is_active) VALUES(1, 'Red', false);
+ INSERT INTO colors (key, color, is_active) VALUES(2, 'Green', false);
+ INSERT INTO colors (key, color, is_active) VALUES(3, 'Blue', false);
+}
+
+teardown
+{
+ DROP TABLE colors;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" {
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key;}
+step "select1surprise" { SELECT * FROM colors ORDER BY key; }
+step "c1" { COMMIT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "update2" { UPDATE colors SET is_active = true WHERE key = 1; }
+step "c2" { COMMIT; }
+
+# Perhaps surprisingly, the session 1 MVCC-snapshot-visible tuple (the tuple
+# with the pre-populated color 'Red') is denied the opportunity to prevent the
+# UPDATE from taking place -- only the conclusively-locked tuple version
+# matters, and so the tuple with key value 1 was updated to 'Brown' (but not
+# the tuple with key value 2, since nothing changed there):
+permutation "update2" "insert1" "c2" "select1surprise" "c1"
diff --git a/src/test/isolation/specs/insert-conflict-update.spec b/src/test/isolation/specs/insert-conflict-update.spec
new file mode 100644
index 0000000..6529a0c
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update.spec
@@ -0,0 +1,40 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions.
+
+setup
+{
+ CREATE TABLE upsert (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. Notably, this entails updating a
+# tuple while there is no version of that tuple visible to the updating
+# session's snapshot. This is permitted only in READ COMMITTED mode.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
new file mode 100644
index 0000000..bd35585
--- /dev/null
+++ b/src/test/regress/expected/insert_conflict.out
@@ -0,0 +1,242 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+ QUERY PLAN
+---------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+(3 rows)
+
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+ QUERY PLAN
+---------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+ Filter: (fruit <> 'Cawesh'::text)
+(4 rows)
+
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+ QUERY PLAN
+----------------------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+ Filter: ((excluded.fruit) <> 'Elderberry'::text)
+(4 rows)
+
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+ QUERY PLAN
+--------------------------------------------------
+ [ +
+ { +
+ "Plan": { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Insert", +
+ "Relation Name": "insertconflicttest", +
+ "Alias": "insertconflicttest", +
+ "Arbiter Index": "key_index", +
+ "Plans": [ +
+ { +
+ "Node Type": "Result", +
+ "Parent Relationship": "Member" +
+ }, +
+ { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Conflict Update", +
+ "Parent Relationship": "Member", +
+ "Relation Name": "insertconflicttest",+
+ "Alias": "insertconflicttest", +
+ "Filter": "(fruit <> 'Lime'::text)" +
+ } +
+ ] +
+ } +
+ } +
+ ]
+(1 row)
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+ERROR: ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from
+LINE 1: ...nsert into insertconflicttest values (1, 'Apple') on conflic...
+ ^
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+ERROR: invalid reference to FROM-clause entry for table "insertconflicttest"
+LINE 1: ...(1, 'Apple') on conflict (key) update set fruit = insertconf...
+ ^
+HINT: Perhaps you meant to reference the table alias "excluded".
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index key_index;
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index comp_key_index;
+--
+-- Partial index tests, no inference predicate specificied
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_key_index;
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_comp_key_index;
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+ERROR: duplicate key value violates unique constraint "fruit_index"
+DETAIL: Key (fruit)=(Peach) already exists.
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+drop index key_index;
+drop index fruit_index;
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+ERROR: partial arbiter unique index has predicate that does not cover tuple proposed for insertion
+DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index "partial_key_index".
+HINT: ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire.
+drop index partial_key_index;
+-- Cleanup
+drop table insertconflicttest;
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+create table capitals (
+ state char(2)
+) inherits (cities);
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+-- Tests proper for inheritance:
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+-- Succeeds:
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+-- clean up
+drop table capitals;
+drop table cities;
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 5359dd8..213076b 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -269,7 +269,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
ERROR: permission denied for relation atest2
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -367,6 +367,11 @@ UPDATE atest5 SET one = 8; -- fail
ERROR: permission denied for relation atest5
UPDATE atest5 SET three = 5, one = 2; -- fail
ERROR: permission denied for relation atest5
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+ERROR: permission denied for relation atest5
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
+ERROR: permission denied for relation atest5
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
GRANT SELECT (one,two,blue) ON atest6 TO regressuser4;
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 1bb3132..2fe2631 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -1180,6 +1180,102 @@ NOTICE: f_leak => yyyyyy
(3 rows)
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though):
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+DETAIL: Failing row contains (22, 11, 2, rls_regress_user2, mediocre novel).
+-- INSERT path is taken here, so UPDATE targelist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------
+ 33 | 22 | 1 | rls_regress_user1 | okay science fiction
+(1 row)
+
+-- Update path will now taken for same query, so UPDATE targelist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+DETAIL: Failing row contains (33, 22, 1, rls_regress_user3, okay science fiction).
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+SET SESSION AUTHORIZATION rls_regress_user1;
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+ERROR: new row violates WITH CHECK OPTION for "document"
+DETAIL: Failing row contains (2, 11, 2, rls_regress_user2, my first novel).
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------
+ 2 | 11 | 2 | rls_regress_user1 | my first novel
+(1 row)
+
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+------------------
+ 78 | 11 | 1 | rls_regress_user1 | some other novel
+(1 row)
+
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+DETAIL: Failing row contains (78, 33, 1, rls_regress_user1, some other novel).
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+ERROR: new row violates WITH CHECK OPTION for "document"
+DETAIL: Failing row contains (78, 11, 1, rls_regress_user2, some other novel).
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------------------
+ 88 | 33 | 1 | rls_regress_user1 | technology book, can only insert
+(1 row)
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 80c3351..ce016c0 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1123,6 +1123,10 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
SELECT * FROM shoelace_obsolete WHERE sl_avail = 0;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
+ERROR: ON CONFLICT is not supported with rules
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
------------+----------+------------+--------+----------+-----------
@@ -2352,6 +2356,23 @@ DETAIL: Key (id3a, id3c)=(1, 13) is not present in table "rule_and_refint_t2".
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
+ERROR: relation "shoelace" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
where (((rule_and_refint_t3.id3a = new.id3a)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index b14410f..9ba3a44 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -639,6 +639,28 @@ from
(0 rows)
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 1: ...conflict (key) update set val = 'unsupported ' || (select f1...
+ ^
+select * from upsert;
+ key | val
+-----+-----
+ 1 | val
+(1 row)
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: on conflict (key) update set val = (select u from aa)
+ ^
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
create temp table outer_7597 (f1 int4, f2 int4);
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index f1a5fde..77dfa06 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -274,7 +274,7 @@ drop sequence ttdummy_seq;
-- tests for per-statement triggers
--
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
BEGIN
@@ -291,6 +291,14 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
--
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
+NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+NOTICE: trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
INSERT INTO main_table DEFAULT VALUES;
@@ -305,6 +313,8 @@ NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, lev
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
@@ -1731,3 +1741,93 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', new.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (1,black)
+WARNING: after insert (new): (1,black)
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (2,red)
+WARNING: before insert (new, modified): (3,"red trig modified")
+WARNING: after insert (new): (3,"red trig modified")
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (3,orange)
+WARNING: before update (old): (3,"red trig modified")
+WARNING: before update (new): (3,"updated red trig modified")
+WARNING: after update (old): (3,"updated red trig modified")
+WARNING: after update (new): (3,"updated red trig modified")
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (4,green)
+WARNING: before insert (new, modified): (5,"green trig modified")
+WARNING: after insert (new): (5,"green trig modified")
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (5,purple)
+WARNING: before update (old): (5,"green trig modified")
+WARNING: before update (new): (5,"updated green trig modified")
+WARNING: after update (old): (5,"updated green trig modified")
+WARNING: after update (new): (5,"updated green trig modified")
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (6,white)
+WARNING: before insert (new, modified): (7,"white trig modified")
+WARNING: after insert (new): (7,"white trig modified")
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (7,pink)
+WARNING: before update (old): (7,"white trig modified")
+WARNING: before update (new): (7,"updated white trig modified")
+WARNING: after update (old): (7,"updated white trig modified")
+WARNING: after update (new): (7,"updated white trig modified")
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (8,yellow)
+WARNING: before insert (new, modified): (9,"yellow trig modified")
+WARNING: after insert (new): (9,"yellow trig modified")
+select * from upsert;
+ key | color
+-----+-----------------------------
+ 1 | black
+ 3 | updated red trig modified
+ 5 | updated green trig modified
+ 7 | updated white trig modified
+ 9 | yellow trig modified
+(5 rows)
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
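
The row-level trigger ordering exercised by the test above can be modelled outside the server. What follows is a minimal Python sketch, not part of the patch: `before_insert`, `upsert`, and the dict-backed table are hypothetical names used only for illustration. It assumes, as the expected output suggests, that the BEFORE INSERT trigger runs on the proposed row first and that the conflict check is then made against the (possibly modified) key.

```python
def before_insert(key, color):
    """Mimic the INSERT branch of upsert_before_func: even keys are bumped by one."""
    if key % 2 == 0:
        return key + 1, color + " trig modified"
    return key, color


def upsert(table, key, color):
    """Apply one INSERT ... ON CONFLICT (key) UPDATE to a dict-backed table.

    Returns which path was taken ("insert" or "update").
    """
    # Assumption: the BEFORE INSERT trigger fires before the conflict check.
    key, color = before_insert(key, color)
    if key in table:
        # Conflict: take the UPDATE path, which reads the existing (TARGET) row.
        table[key] = "updated " + table[key]
        return "update"
    table[key] = color
    return "insert"


table = {}
proposed = [(1, "black"), (2, "red"), (3, "orange"), (4, "green"),
            (5, "purple"), (6, "white"), (7, "pink"), (8, "yellow")]
paths = [upsert(table, k, c) for k, c in proposed]
```

Under these assumptions, `table` ends up with the same five rows as the `select * from upsert` output in the test, and `paths` alternates between the INSERT and UPDATE paths in the same way the WARNING lines do.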
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 80c5706..22b5bc1 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -215,6 +215,10 @@ INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
DETAIL: View columns that are not columns of their base relation are not updatable.
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
+ERROR: relation "rw_view15" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index 1de2a86..58714ac 100644
--- a/src/test/regress/expected/update.out
+++ b/src/test/regress/expected/update.out
@@ -147,4 +147,31 @@ SELECT a, b, char_length(c) FROM update_test;
42 | 12 | 10000
(4 rows)
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+-- fail, update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a NOT IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE EXISTS(SELECT b FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ALL(SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ANY(SELECT a FROM update_test);
+ ^
DROP TABLE update_test;
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index 06b372b..81d664e 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -1806,6 +1806,80 @@ SELECT * FROM y;
-400
(22 rows)
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+ k | v | a
+---+--------+---
+ 0 | insert | 0
+ 0 | insert | 0
+(2 rows)
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+ k | v
+----+------------------
+ 0 | insert
+ 1 | 1 v, now update
+ 2 | insert
+ 3 | insert
+ 4 | 4 v, now update
+ 5 | insert
+ 6 | insert
+ 7 | 7 v, now update
+ 8 | insert
+ 9 | insert
+ 10 | 10 v, now update
+ 11 | insert
+ 12 | insert
+ 13 | 13 v, now update
+ 14 | insert
+ 15 | insert
+ 16 | 16 v, now update
+(17 rows)
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ...ICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a ...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+DROP TABLE z;
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
INSERT INTO y SELECT generate_series(1, 3);
diff --git a/src/test/regress/input/constraints.source b/src/test/regress/input/constraints.source
index 8ec0054..46bce36 100644
--- a/src/test/regress/input/constraints.source
+++ b/src/test/regress/input/constraints.source
@@ -292,6 +292,11 @@ INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+
SELECT '' AS five, * FROM UNIQUE_TBL;
DROP TABLE UNIQUE_TBL;
diff --git a/src/test/regress/output/constraints.source b/src/test/regress/output/constraints.source
index 0d32a9eab..add3f0c 100644
--- a/src/test/regress/output/constraints.source
+++ b/src/test/regress/output/constraints.source
@@ -421,16 +421,23 @@ INSERT INTO UNIQUE_TBL VALUES (4, 'four');
INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+ERROR: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
+HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
SELECT '' AS five, * FROM UNIQUE_TBL;
- five | i | t
-------+---+-------
+ five | i | t
+------+---+--------------------
| 1 | one
| 2 | two
| 4 | four
- | 5 | one
| | six
| | seven
-(6 rows)
+ | 5 | five-upsert-update
+ | 6 | six-upsert-insert
+(7 rows)
DROP TABLE UNIQUE_TBL;
CREATE TABLE UNIQUE_TBL (i int, t text,
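
The "should fail" case in the constraints test can be read as a rule that a single command may not affect the same row twice. The Python sketch below models that reading; it is an illustration under that assumption, not the executor logic from the patch. `upsert_command` and `CardinalityViolation` are hypothetical names, and the rows with a NULL key are left out to keep the model small.

```python
class CardinalityViolation(Exception):
    """Raised when one command would affect the same row twice."""


def upsert_command(table, rows, update_value):
    """Run one multi-row INSERT ... ON CONFLICT (i) UPDATE SET t = update_value.

    `table` maps the unique key i to t. The command is all-or-nothing: on
    error the table is left untouched, as a failed statement would leave it.
    """
    working = dict(table)   # scratch copy, committed only on success
    touched = set()         # keys this command has already inserted or updated
    for key, value in rows:
        if key in touched:
            raise CardinalityViolation(f"key {key} proposed twice in one command")
        touched.add(key)
        working[key] = update_value if key in working else value
    table.clear()
    table.update(working)


unique_tbl = {1: "one", 2: "two", 4: "four", 5: "one"}
upsert_command(unique_tbl, [(5, "five-upsert-insert")], "five-upsert-update")
upsert_command(unique_tbl, [(6, "six-upsert-insert")], "six-upsert-update")
try:
    upsert_command(unique_tbl, [(1, "a"), (2, "b"), (2, "b")], "fails")
    failed = False
except CardinalityViolation:
    failed = True
```

In this model the third command fails on the repeated key 2 and leaves rows 1 and 2 unchanged, which is consistent with the expected output above showing `one` and `two` still in place.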
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e0ae2f2..528d3b7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -36,6 +36,7 @@ test: geometry horology regex oidjoins type_sanity opr_sanity
# These four each depend on the previous one
# ----------
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7f762bd..b7c8f53 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -50,6 +50,7 @@ test: oidjoins
test: type_sanity
test: opr_sanity
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql
new file mode 100644
index 0000000..472d4ab
--- /dev/null
+++ b/src/test/regress/sql/insert_conflict.sql
@@ -0,0 +1,192 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+
+drop index key_index;
+
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index comp_key_index;
+
+--
+-- Partial index tests, no inference predicate specified
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index expr_key_index;
+
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+
+drop index expr_comp_key_index;
+
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('Peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index key_index;
+drop index fruit_index;
+
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+
+drop index partial_key_index;
+
+-- Cleanup
+drop table insertconflicttest;
+
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+
+create table capitals (
+ state char(2)
+) inherits (cities);
+
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+
+-- Tests proper for inheritance:
+
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+
+-- Succeeds:
+
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+
+-- clean up
+drop table capitals;
+drop table cities;
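
The "inference succeeds" / "inference fails" groupings in insert_conflict.sql suggest a simple matching rule for non-partial indexes: the inference specification, taken as a set (order and duplicates ignored), must equal the key set of some unique index. The Python sketch below encodes that reading. It is an assumption drawn from the test cases, not the planner code: `normalize` and `infer_arbiter` are hypothetical names, expressions are compared as text where the real implementation would compare expression trees, and predicate implication for partial indexes is not modelled (a partial index is simply never chosen when no predicate is given).

```python
def normalize(elems):
    """Ignore order and duplicates; compare expressions as whitespace-free text."""
    return frozenset(e.replace(" ", "").lower() for e in elems)


def infer_arbiter(indexes, conflict_elems):
    """Return the names of non-partial unique indexes whose key set equals the
    inference specification. An empty list means inference fails.

    `indexes` maps an index name to (key elements, predicate or None).
    """
    wanted = normalize(conflict_elems)
    return [name for name, (keys, predicate) in indexes.items()
            if predicate is None and normalize(keys) == wanted]


single = {"key_index": (["key"], None)}
comp = {"comp_key_index": (["key", "fruit"], None)}
partial = {"part_comp_key_index": (["key", "fruit"], "key < 5")}
expr = {"expr_comp_key_index": (["key", "lower(fruit)"], None)}

results = {
    "single, (key, key, key)": infer_arbiter(single, ["key", "key", "key"]),
    "single, (key, fruit)": infer_arbiter(single, ["key", "fruit"]),
    "comp, (fruit, key)": infer_arbiter(comp, ["fruit", "key"]),
    "comp, (key)": infer_arbiter(comp, ["key"]),
    "partial, (key, fruit)": infer_arbiter(partial, ["key", "fruit"]),
    "expr, (lower(fruit), key) x2": infer_arbiter(
        expr, ["lower(fruit)", "key", "lower(fruit)", "key"]),
    "expr, (key, upper(fruit))": infer_arbiter(expr, ["key", "upper(fruit)"]),
}
```

Each entry in `results` is a non-empty list exactly where the corresponding test statement sits under an "inference succeeds" comment, and an empty list where it sits under "inference fails".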
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index a0ff953..b25596a 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -194,7 +194,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -245,6 +245,9 @@ INSERT INTO atest5 VALUES (5,5,5); -- fail
UPDATE atest5 SET three = 10; -- ok
UPDATE atest5 SET one = 8; -- fail
UPDATE atest5 SET three = 5, one = 2; -- fail
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index ed7adbf..92734a3 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -436,6 +436,80 @@ DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though):
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+-- Update path will now be taken for same query, so UPDATE targetlist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+
+SET SESSION AUTHORIZATION rls_regress_user1;
+
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+-- Doesn't fail, even though the inserted row doesn't satisfy the WITH CHECK
+-- option that originated as a barrier/USING() qual from the UPDATE policy. The
+-- UPDATE path *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index 1e15f84..7cb5f39 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -680,6 +680,9 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
SELECT * FROM shoelace_candelete;
@@ -844,6 +847,17 @@ insert into rule_and_refint_t3 values (1, 12, 11, 'row3');
insert into rule_and_refint_t3 values (1, 12, 12, 'row4');
insert into rule_and_refint_t3 values (1, 11, 13, 'row5');
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 4be2e40..2be9cb7 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -374,6 +374,20 @@ from
int4_tbl i4 on dummy = i4.f1;
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+
+select * from upsert;
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index 0ea2c31..323ca1a 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -208,7 +208,7 @@ drop sequence ttdummy_seq;
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
5 10
@@ -237,6 +237,12 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
@@ -246,6 +252,9 @@ UPDATE main_table SET a = a + 1 WHERE b < 30;
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
+
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
30 40
@@ -1173,3 +1182,61 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', old.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+
+select * from upsert;
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 60c7e29..48dd9a9 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -69,6 +69,8 @@ DELETE FROM rw_view14 WHERE a=3; -- should be OK
-- Partially updatable view
INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
UPDATE rw_view15 SET upper='ROW 3' WHERE a=3; -- should fail
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
index e71128c..903f3fb 100644
--- a/src/test/regress/sql/update.sql
+++ b/src/test/regress/sql/update.sql
@@ -74,4 +74,18 @@ UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+
+-- fail, update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+
DROP TABLE update_test;
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index c716369..8d49384 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -795,6 +795,43 @@ SELECT * FROM t LIMIT 10;
SELECT * FROM y;
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+
+DROP TABLE z;
+
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
--
1.9.1
Attachment: 0005-RLS-support-for-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From 6e90a49480228f09dd587d1a52793a5b0f7e0f5f Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 6 Jan 2015 16:32:21 -0800
Subject: [PATCH 5/8] RLS support for ON CONFLICT UPDATE
Row-Level Security policies may apply to UPDATE commands or INSERT
commands only. UPDATE RLS policies can have both USING() security
barrier quals, and CHECK options (INSERT RLS policies may only have
CHECK options, though). It is necessary to carefully consider the
behavior of RLS policies in the context of INSERT with ON CONFLICT
UPDATE, since ON CONFLICT UPDATE is more or less a new top-level
command, conceptually quite different to two separate statements (an
INSERT and an UPDATE).
The approach taken is to "bunch together" both sets of policies, and to
enforce them in three different places against three different slots
(that is, at three different stages of query processing in the executor).
Note that UPDATE policy USING() barrier quals are always treated as
CHECK options. It is thought that silently failing when USING() barrier
quals are not satisfied is a more surprising outcome, even if it is
closer to the existing behavior of UPDATE statements. This is because
the user's intent to UPDATE one particular row based on simple criteria
is quite clear with ON CONFLICT UPDATE.
The three places where RLS policies are enforced are:
* Against row actually inserted, after insertion proceeds successfully
(INSERT-applicable policies only).
* Against row in target table that caused conflict. The implementation
is careful not to leak the contents of that row in diagnostic
messages (INSERT-applicable *and* UPDATE-applicable policies).
* Against the version of the row added to the relation after
ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
policies).
Documentation and tests follow in later commits.
---
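The three enforcement points listed in the commit message can be sketched in
standalone C. This is only an illustration under stated assumptions: ToyCmd,
ToyWco and toy_count_enforced() are hypothetical stand-ins for CmdType,
WithCheckOption and the loop in ExecWithCheckOptions(), and the real function
raises an error on a failed qual rather than counting. No PostgreSQL headers
are used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CmdType; only the two values needed here. */
typedef enum { TOY_CMD_INSERT, TOY_CMD_UPDATE } ToyCmd;

/* Hypothetical stand-in for WithCheckOption with its new commandType field. */
typedef struct
{
    ToyCmd      commandType;    /* which command's policy produced this option */
    const char *name;           /* label, for illustration only */
} ToyWco;

/*
 * Count the options that one enforcement call would evaluate.  Mirrors the
 * patch's "if (onlyInsert && wco->commandType == CMD_UPDATE) continue;".
 */
static int
toy_count_enforced(const ToyWco *wcos, size_t n, bool onlyInsert)
{
    int enforced = 0;

    assert(wcos != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        /* UPDATE-related options are skipped for a successfully inserted tuple */
        if (onlyInsert && wcos[i].commandType == TOY_CMD_UPDATE)
            continue;
        enforced++;
    }
    return enforced;
}
```

With one INSERT-derived option and two UPDATE-derived options, the call for the
inserted tuple (onlyInsert = true) evaluates one option, while the calls for the
existing conflicting tuple and for the tuple produced by ExecUpdate()
(onlyInsert = false) each evaluate all three. The separate "detail" argument in
the patch only controls whether the failing row is printed, so it is not
modelled here.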
src/backend/executor/execMain.c | 16 ++++--
src/backend/executor/nodeModifyTable.c | 53 ++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/nodes/outfuncs.c | 1 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/rewrite/rewriteHandler.c | 2 +
src/backend/rewrite/rowsecurity.c | 94 +++++++++++++++++++++++++++++-----
src/include/executor/executor.h | 3 +-
src/include/nodes/parsenodes.h | 1 +
10 files changed, 153 insertions(+), 20 deletions(-)
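As a reading aid for the rowsecurity.c changes, here is a minimal standalone
sketch of how process_policies() is intended to sort policy quals into lists
once onConflict is set. It is an assumption-laden model, not the patch's code:
ToyPolicyCmd, ToyPolicy, ToyBuckets and toy_partition() are hypothetical names,
and the lists are reduced to counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the policy's command (policy->cmd in the patch). */
typedef enum { TOY_POLICY_INSERT, TOY_POLICY_UPDATE } ToyPolicyCmd;

/* Hypothetical stand-in for RowSecurityPolicy: only records which quals exist. */
typedef struct
{
    ToyPolicyCmd cmd;
    bool         has_using;         /* models policy->qual != NULL */
    bool         has_with_check;    /* models policy->with_check_qual != NULL */
} ToyPolicy;

/* Sizes of the three lists built while walking the policies. */
typedef struct
{
    int quals;
    int with_check_quals;
    int conflict_update_quals;
} ToyBuckets;

static ToyBuckets
toy_partition(const ToyPolicy *policies, size_t n, bool onConflict)
{
    ToyBuckets b = {0, 0, 0};

    assert(policies != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        /* UPDATE policies of an ON CONFLICT UPDATE query get their own list */
        bool aux_update = onConflict && policies[i].cmd == TOY_POLICY_UPDATE;

        if (policies[i].has_using)
        {
            if (aux_update)
                b.conflict_update_quals++;
            else
                b.quals++;
        }
        if (policies[i].has_with_check)
        {
            if (aux_update)
                b.conflict_update_quals++;
            else
                b.with_check_quals++;
        }
    }
    return b;
}
```

For an INSERT policy with only a CHECK option plus an UPDATE policy with both a
USING() qual and a CHECK option, onConflict = true leaves one entry in
with_check_quals and two in conflict_update_quals. The latter is what ends up
as the separate CMD_UPDATE WithCheckOption.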
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 6f0c5ab..2ef5dcd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1676,7 +1676,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
*/
void
ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate)
+ TupleTableSlot *slot, bool detail,
+ bool onlyInsert, EState *estate)
{
ExprContext *econtext;
ListCell *l1,
@@ -1699,6 +1700,15 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
ExprState *wcoExpr = (ExprState *) lfirst(l2);
/*
+ * INSERT ... ON CONFLICT UPDATE callers may require that not all WITH
+ * CHECK OPTIONs associated with resultRelInfo are enforced at all
+ * stages of query processing. (UPDATE-related policies are not
+ * enforced in respect of a successfully inserted tuple).
+ */
+ if (onlyInsert && wco->commandType == CMD_UPDATE)
+ continue;
+
+ /*
* WITH CHECK OPTION checks are intended to ensure that the new tuple
* is visible (in the case of a view) or that it passes the
* 'with-check' policy (in the case of row security).
@@ -1712,10 +1722,10 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
(errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
errmsg("new row violates WITH CHECK OPTION for \"%s\"",
wco->viewname),
- errdetail("Failing row contains %s.",
+ detail? errdetail("Failing row contains %s.",
ExecBuildSlotValueDescription(slot,
RelationGetDescr(resultRelInfo->ri_RelationDesc),
- 64))));
+ 64)):0));
}
}
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1603c45..90236ce 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -458,7 +458,8 @@ vlock:
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, spec == SPEC_INSERT,
+ estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -952,7 +953,7 @@ lreplace:;
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, false, estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -1148,6 +1149,54 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+ /*
+ * For RLS with ON CONFLICT UPDATE, security quals are always
+ * treated as WITH CHECK options, even when there were separate
+ * security quals and explicit WITH CHECK options (ordinarily,
+ * security quals are only treated as WITH CHECK options when there
+ * are no explicit WITH CHECK options). Also, CHECK OPTIONs
+ * (originating either explicitly, or implicitly as security quals)
+ * for both UPDATE and INSERT policies (or ALL policies) are
+ * checked (as CHECK OPTIONs) at three different points for three
+ * distinct but related tuples/slots in the context of ON CONFLICT
+ * UPDATE. There are three relevant ExecWithCheckOptions() calls:
+ *
+ * * After successful insertion, within ExecInsert(), against the
+ * inserted tuple. This only includes INSERT-applicable policies.
+ *
+ * * Here, after row locking but before calling ExecUpdate(), on
+ * the existing tuple in the target relation (which we cannot leak
+ * details of). This is conceptually like a security barrier qual
+ * for the purposes of the auxiliary update, although unlike
+ * regular updates that require security barrier quals we prefer to
+ * raise an error (by treating the security barrier quals as CHECK
+ * OPTIONS) rather than silently not affect rows, because the
+ * intent to update seems clear and unambiguous for ON CONFLICT
+ * UPDATE. This includes both INSERT-applicable and
+ * UPDATE-applicable policies.
+ *
+ * * On the final tuple created by the update within ExecUpdate (if
+ * any). This is also subject to INSERT policy enforcement, unlike
+ * conventional ExecUpdate() calls for UPDATE statements -- it
+ * includes both INSERT-applicable and UPDATE-applicable policies.
+ */
+ if (resultRelInfo->ri_WithCheckOptions != NIL)
+ {
+ TupleTableSlot *opts;
+
+ /* Construct temp slot for locked tuple from target */
+ opts = MakeSingleTupleTableSlot(slot->tts_tupleDescriptor);
+ ExecStoreTuple(copyTuple, opts, InvalidBuffer, false);
+
+ /*
+ * Check, but without leaking contents of tuple; user only
+ * supplied one conflicting value or composition of values, and
+ * not the entire tuple.
+ */
+ ExecWithCheckOptions(resultRelInfo, opts, false, false,
+ estate);
+ }
+
if (!TupIsNull(slot))
*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
planSlot, &onConflict->mt_epqstate,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df611d2..5c091e1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2074,6 +2074,7 @@ _copyWithCheckOption(const WithCheckOption *from)
COPY_STRING_FIELD(viewname);
COPY_NODE_FIELD(qual);
+ COPY_SCALAR_FIELD(commandType);
COPY_SCALAR_FIELD(cascaded);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 24e58fa..4057c27 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2384,6 +2384,7 @@ _equalWithCheckOption(const WithCheckOption *a, const WithCheckOption *b)
{
COMPARE_STRING_FIELD(viewname);
COMPARE_NODE_FIELD(qual);
+ COMPARE_SCALAR_FIELD(commandType);
COMPARE_SCALAR_FIELD(cascaded);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 34e9163..d077882 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2336,6 +2336,7 @@ _outWithCheckOption(StringInfo str, const WithCheckOption *node)
WRITE_STRING_FIELD(viewname);
WRITE_NODE_FIELD(qual);
+ WRITE_ENUM_FIELD(commandType, CmdType);
WRITE_BOOL_FIELD(cascaded);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b471bbf..30b0eca 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -272,6 +272,7 @@ _readWithCheckOption(void)
READ_STRING_FIELD(viewname);
READ_NODE_FIELD(qual);
+ READ_ENUM_FIELD(commandType, CmdType);
READ_BOOL_FIELD(cascaded);
READ_DONE();
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 3db5165..a0df1e8 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1763,6 +1763,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs, bool forUpdatePushedDown)
List *quals = NIL;
wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->commandType = parsetree->commandType;
quals = lcons(wco->qual, quals);
activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs);
@@ -2931,6 +2932,7 @@ rewriteTargetView(Query *parsetree, Relation view)
wco->viewname = pstrdup(RelationGetRelationName(view));
wco->qual = NULL;
wco->cascaded = cascaded;
+ wco->commandType = viewquery->commandType;
parsetree->withCheckOptions = lcons(wco,
parsetree->withCheckOptions);
diff --git a/src/backend/rewrite/rowsecurity.c b/src/backend/rewrite/rowsecurity.c
index 35790a9..d09e482 100644
--- a/src/backend/rewrite/rowsecurity.c
+++ b/src/backend/rewrite/rowsecurity.c
@@ -55,12 +55,14 @@
#include "utils/syscache.h"
#include "tcop/utility.h"
-static List *pull_row_security_policies(CmdType cmd, Relation relation,
- Oid user_id);
+static List *pull_row_security_policies(CmdType cmd, bool onConflict,
+ Relation relation, Oid user_id);
static void process_policies(List *policies, int rt_index,
Expr **final_qual,
Expr **final_with_check_qual,
- bool *hassublinks);
+ bool *hassublinks,
+ Expr **spec_with_check_eval,
+ bool onConflict);
static bool check_role_for_policy(ArrayType *policy_roles, Oid user_id);
/*
@@ -87,6 +89,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
Expr *rowsec_with_check_expr = NULL;
Expr *hook_expr = NULL;
Expr *hook_with_check_expr = NULL;
+ Expr *hook_spec_with_check_expr = NULL;
List *rowsec_policies;
List *hook_policies = NIL;
@@ -148,8 +151,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Grab the built-in policies which should be applied to this relation. */
rel = heap_open(rte->relid, NoLock);
- rowsec_policies = pull_row_security_policies(root->commandType, rel,
- user_id);
+ rowsec_policies = pull_row_security_policies(root->commandType,
+ root->specClause == SPEC_INSERT,
+ rel, user_id);
/*
* Check if this is only the default-deny policy.
@@ -167,7 +171,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Now that we have our policies, build the expressions from them. */
process_policies(rowsec_policies, rt_index, &rowsec_expr,
- &rowsec_with_check_expr, &hassublinks);
+ &rowsec_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
/*
* Also, allow extensions to add their own policies.
@@ -197,7 +203,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Build the expression from any policies returned. */
process_policies(hook_policies, rt_index, &hook_expr,
- &hook_with_check_expr, &hassublinks);
+ &hook_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
}
/*
@@ -229,6 +237,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) rowsec_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
@@ -243,6 +252,23 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) hook_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
+ root->withCheckOptions = lcons(wco, root->withCheckOptions);
+ }
+
+ /*
+ * Also add the expression, if any, returned from the extension that
+ * applies to auxiliary UPDATE within ON CONFLICT UPDATE.
+ */
+ if (hook_spec_with_check_expr)
+ {
+ WithCheckOption *wco;
+
+ wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->viewname = RelationGetRelationName(rel);
+ wco->qual = (Node *) hook_spec_with_check_expr;
+ wco->cascaded = false;
+ wco->commandType = CMD_UPDATE;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
}
@@ -288,7 +314,8 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
*
*/
static List *
-pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
+pull_row_security_policies(CmdType cmd, bool onConflict, Relation relation,
+ Oid user_id)
{
List *policies = NIL;
ListCell *item;
@@ -322,7 +349,9 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
if (policy->cmd == ACL_INSERT_CHR
&& check_role_for_policy(policy->roles, user_id))
policies = lcons(policy, policies);
- break;
+ if (!onConflict)
+ break;
+ /* FALL THRU */
case CMD_UPDATE:
if (policy->cmd == ACL_UPDATE_CHR
&& check_role_for_policy(policy->roles, user_id))
@@ -384,26 +413,41 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
*/
static void
process_policies(List *policies, int rt_index, Expr **qual_eval,
- Expr **with_check_eval, bool *hassublinks)
+ Expr **with_check_eval, bool *hassublinks,
+ Expr **spec_with_check_eval, bool onConflict)
{
ListCell *item;
List *quals = NIL;
List *with_check_quals = NIL;
+ List *conflict_update_quals = NIL;
/*
* Extract the USING and WITH CHECK quals from each of the policies
- * and add them to our lists.
+ * and add them to our lists. CONFLICT UPDATE quals are always treated
+ * as CHECK OPTIONS.
*/
foreach(item, policies)
{
RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(item);
if (policy->qual != NULL)
- quals = lcons(copyObject(policy->qual), quals);
+ {
+ if (!onConflict || policy->cmd != ACL_UPDATE_CHR)
+ quals = lcons(copyObject(policy->qual), quals);
+ else
+ conflict_update_quals = lcons(copyObject(policy->qual), conflict_update_quals);
+ }
if (policy->with_check_qual != NULL)
- with_check_quals = lcons(copyObject(policy->with_check_qual),
- with_check_quals);
+ {
+ if (!onConflict || policy->cmd != ACL_UPDATE_CHR)
+ with_check_quals = lcons(copyObject(policy->with_check_qual),
+ with_check_quals);
+ else
+ conflict_update_quals =
+ lcons(copyObject(policy->with_check_qual),
+ conflict_update_quals);
+ }
if (policy->hassublinks)
*hassublinks = true;
@@ -420,6 +464,10 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
/*
* If we end up with only USING quals, then use those as
* WITH CHECK quals also.
+ *
+ * For the INSERT with ON CONFLICT UPDATE case, we always enforce that the
+ * UPDATE's USING quals are treated like WITH CHECK quals, enforced against
+ * the target relation's tuple in multiple places.
*/
if (with_check_quals == NIL)
with_check_quals = copyObject(quals);
@@ -453,6 +501,24 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
else
*with_check_eval = (Expr*) linitial(with_check_quals);
+ /*
+ * For INSERT with ON CONFLICT UPDATE, *both* sets of WITH CHECK options
+ * (from any INSERT policy and any UPDATE policy) are enforced.
+ *
+ * These are handled separately because enforcement of each type of WITH
+ * CHECK option is based on the point in query processing of INSERT ... ON
+ * CONFLICT UPDATE. The INSERT path does not enforce UPDATE related CHECK
+ * OPTIONs.
+ */
+ if (conflict_update_quals != NIL)
+ {
+ if (list_length(conflict_update_quals) > 1)
+ *spec_with_check_eval = makeBoolExpr(AND_EXPR,
+ conflict_update_quals, -1);
+ else
+ *spec_with_check_eval = (Expr*) linitial(conflict_update_quals);
+ }
+
return;
}
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 9400801..6c535da 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -195,7 +195,8 @@ extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
extern void ExecConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
extern void ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate);
+ TupleTableSlot *slot, bool detail, bool onlyInsert,
+ EState *estate);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti);
extern ExecAuxRowMark *ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist);
extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 9ae3bb5..6447f45 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -868,6 +868,7 @@ typedef struct WithCheckOption
NodeTag type;
char *viewname; /* name of view that specified the WCO */
Node *qual; /* constraint qual to check */
+ CmdType commandType; /* select|insert|update|delete */
bool cascaded; /* true = WITH CASCADED CHECK OPTION */
} WithCheckOption;
--
1.9.1
Attachment: 0004-Project-updates-from-ON-CONFLICT-UPDATE-RETURNING.patch (text/x-patch; charset=US-ASCII)
From ad503df1334b1aa1ce6db0cf1cb7e322ae6a7850 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 21 Nov 2014 16:59:54 -0800
Subject: [PATCH 4/8] Project updates from ON CONFLICT UPDATE RETURNING
This establishes that an INSERT with an ON CONFLICT UPDATE clause
processes all slots that are ultimately affected, regardless of whether
or not the alternative ON CONFLICT UPDATE path was taken. However, if
an ON CONFLICT UPDATE's WHERE clause is not satisfied in respect of some
slot/tuple, the post-update tuple is not projected (although the row is
still locked, just as before).
Also, for ON CONFLICT UPDATE variant INSERTs (but not ON CONFLICT IGNORE
variant INSERTs), the number of rows affected in total is reported by
the command tag using the new "UPSERT" command identifier, which
otherwise matches the format of the existing "INSERT" command tag.
There is no precedent for a top level command that uses a different
command tag identifier according to whether or not some clause was used,
but doing so seems appropriate, since client programs are expected to
have an interest in whether or not some number of rows projected by
RETURNING may have been updated, and in any case indicating that the
rows were affected by an "INSERT" when they may not have been inserted
is simply misleading. However, there is still no principled method for
client programs to distinguish between INSERT ... ON CONFLICT UPDATE
projected tuples generated by being inserted or by being updated. This
is thought not to matter, since the use of INSERT with ON CONFLICT
UPDATE indicates that either outcome is equivalent.
---
src/backend/executor/nodeModifyTable.c | 30 ++++++++++++++++++++++--------
src/backend/tcop/pquery.c | 16 +++++++++++++---
src/bin/psql/common.c | 5 ++++-
3 files changed, 39 insertions(+), 12 deletions(-)
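The command tag behaviour described in the commit message can be illustrated
with a small standalone sketch. The names toy_insert_tag() and
toy_tag_is_upsert() are hypothetical; is_upsert stands in for the patch's
"pstate->spec == SPEC_INSERT" test, and the caller-supplied buffer length
replaces COMPLETION_TAG_BUFSIZE. The second helper is merely one way a client
could tell the two tags apart, not something the patch itself provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a completion tag in the "<TAG> <oid> <rows>" shape used by the
 * CMD_INSERT branch, choosing the tag name by whether ON CONFLICT UPDATE
 * was used.  Returns snprintf()'s result.
 */
static int
toy_insert_tag(char *buf, size_t buflen, bool is_upsert,
               unsigned int last_oid, unsigned int processed)
{
    assert(buf != NULL && buflen > 0);
    return snprintf(buf, buflen, "%s %u %u",
                    is_upsert ? "UPSERT" : "INSERT",
                    last_oid, processed);
}

/* Hypothetical client-side check: does the tag say rows may have been updated? */
static bool
toy_tag_is_upsert(const char *tag)
{
    assert(tag != NULL);
    return strncmp(tag, "UPSERT ", 7) == 0;
}
```

The tag only says that some of the affected rows may have been updated. As the
commit message notes, it does not identify which projected rows were inserted
and which were updated.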
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 05c78c9..1603c45 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -59,7 +59,9 @@ static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
TupleTableSlot *planSlot,
TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
- EState *estate);
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning);
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -413,6 +415,8 @@ vlock:
if (conflict)
{
+ TupleTableSlot *returning = NULL;
+
/*
* Lock and consider updating in the SPEC_INSERT case. For the
* SPEC_IGNORE case, it's still necessary to verify that the tuple
@@ -423,12 +427,20 @@ vlock:
planSlot,
slot,
onConflict,
- estate))
+ estate,
+ canSetTag,
+ &returning))
goto vlock;
else if (spec == SPEC_IGNORE)
ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
- return NULL;
+ /*
+ * RETURNING may have been processed already -- ExecUpdate() projects
+ * it when the target ResultRelInfo indicates that this is required.
+ * Inserted and updated tuples are projected alike for ON CONFLICT
+ * UPDATE with RETURNING.
+ */
+ return returning;
}
}
@@ -967,7 +979,9 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
TupleTableSlot *planSlot,
TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
- EState *estate)
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning)
{
Relation relation = resultRelInfo->ri_RelationDesc;
HeapTupleData tuple;
@@ -1135,9 +1149,9 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
if (!TupIsNull(slot))
- ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot,
- &onConflict->mt_epqstate, onConflict->ps.state,
- false);
+ *returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
+ planSlot, &onConflict->mt_epqstate,
+ onConflict->ps.state, canSetTag);
ReleaseBuffer(buffer);
@@ -1149,7 +1163,7 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
/* must provide our own instrumentation support */
if (onConflict->ps.instrument)
- InstrStopNode(onConflict->ps.instrument, 0);
+ InstrStopNode(onConflict->ps.instrument, *returning ? 1:0);
return true;
case HeapTupleUpdated:
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 9c14e8a..41c4191 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -189,7 +189,8 @@ ProcessQuery(PlannedStmt *plan,
*/
if (completionTag)
{
- Oid lastOid;
+ Oid lastOid;
+ ModifyTableState *pstate;
switch (queryDesc->operation)
{
@@ -198,12 +199,16 @@ ProcessQuery(PlannedStmt *plan,
"SELECT %u", queryDesc->estate->es_processed);
break;
case CMD_INSERT:
+ pstate = (((ModifyTableState *) queryDesc->planstate));
+ Assert(IsA(pstate, ModifyTableState));
+
if (queryDesc->estate->es_processed == 1)
lastOid = queryDesc->estate->es_lastoid;
else
lastOid = InvalidOid;
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
- "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
+ "%s %u %u", pstate->spec == SPEC_INSERT? "UPSERT":"INSERT",
+ lastOid, queryDesc->estate->es_processed);
break;
case CMD_UPDATE:
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
@@ -1356,7 +1361,10 @@ PortalRunMulti(Portal portal, bool isTopLevel,
* 0" here because technically there is no query of the matching tag type,
* and printing a non-zero count for a different query type seems wrong,
* e.g. an INSERT that does an UPDATE instead should not print "0 1" if
- * one row was updated. See QueryRewrite(), step 3, for details.
+ * one row was updated (unless the ON CONFLICT UPDATE, or "UPSERT" variant
+ * of INSERT was used to update the row, where it's logically a direct
+ * effect of the top level command). See QueryRewrite(), step 3, for
+ * details.
*/
if (completionTag && completionTag[0] == '\0')
{
@@ -1366,6 +1374,8 @@ PortalRunMulti(Portal portal, bool isTopLevel,
sprintf(completionTag, "SELECT 0 0");
else if (strcmp(completionTag, "INSERT") == 0)
strcpy(completionTag, "INSERT 0 0");
+ else if (strcmp(completionTag, "UPSERT") == 0)
+ strcpy(completionTag, "UPSERT 0 0");
else if (strcmp(completionTag, "UPDATE") == 0)
strcpy(completionTag, "UPDATE 0");
else if (strcmp(completionTag, "DELETE") == 0)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 275bdcc..9302e41 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -894,9 +894,12 @@ PrintQueryResults(PGresult *results)
success = StoreQueryTuple(results);
else
success = PrintQueryTuples(results);
- /* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
+ /*
+ * if it's INSERT/UPSERT/UPDATE/DELETE RETURNING, also print status
+ */
cmdstatus = PQcmdStatus(results);
if (strncmp(cmdstatus, "INSERT", 6) == 0 ||
+ strncmp(cmdstatus, "UPSERT", 6) == 0 ||
strncmp(cmdstatus, "UPDATE", 6) == 0 ||
strncmp(cmdstatus, "DELETE", 6) == 0)
PrintQueryStatus(results);
--
1.9.1
Attachment: 0003-EXCLUDED-expressions-within-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From 74c98a421bf8ee8a3232406f8ac78e80083a60f2 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Thu, 18 Sep 2014 19:08:27 -0700
Subject: [PATCH 3/8] EXCLUDED expressions within ON CONFLICT UPDATE
EXCLUDED.* (which previously appeared as EXCLUDED(), and CONFLICTING()
before that) is an "internal" primnode expression which enables
referencing of rejected-for-insertion tuples within both the targetlist
and predicate of the UPDATE portion of an INSERT ... ON CONFLICT UPDATE
query. The expression is invoked using an alias-like syntax (more on
how this works later). The fact that a dedicated expression is used
(rather than a dedicated range table entry involved in query
optimization) is an implementation detail.
This additional support is particularly useful for ON CONFLICT queries
that propose multiple tuples for insertion, since it isn't otherwise
possible to succinctly decide which actual values to update each column
with (in the event of taking the update path in respect of a given
slot).
The effects of BEFORE INSERT row triggers on the slot/tuple proposed for
insertion are carried. This seems logical, since it might be the case
that the rejected values would not have been rejected had some BEFORE
INSERT trigger been disabled. On the other hand, the potential hazards
around equivalent modifications occurring when both INSERT and UPDATE
BEFORE triggers are fired for the same slot/tuple should be considered
by client applications. It's possible to imagine a use case in which
this behavior is surprising and undesirable -- essentially the same
non-idempotent modification may occur twice. (It might also be the case
that BEFORE trigger related side-effects undesirably occur twice, but
writing BEFORE triggers with external side-effects is already considered
a questionable practice for several reasons (consider commit 6868ed74),
and besides, the implementation cannot reasonably prevent this, as noted
in nodeModifyTable.c comments added by the main ON CONFLICT commit).
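
To make the double-modification hazard concrete, here is a minimal sketch in plain C. It is a toy model, not executor code: ToyTuple, toy_before_row_trigger(), toy_insert_path() and toy_update_path() are hypothetical names invented for illustration, and the "trigger" simply increments a counter column. It assumes the UPDATE is of the form SET counter = EXCLUDED.counter.

```c
#include <assert.h>

/* Toy tuple; a hypothetical stand-in, not a PostgreSQL structure. */
typedef struct ToyTuple
{
	int			key;
	int			counter;
} ToyTuple;

/* A non-idempotent BEFORE ROW trigger body, used for INSERT and UPDATE. */
static void
toy_before_row_trigger(ToyTuple *tup)
{
	tup->counter += 1;
}

/* No conflict: the trigger fires once and the tuple is stored as modified. */
static int
toy_insert_path(ToyTuple proposed)
{
	toy_before_row_trigger(&proposed);	/* BEFORE INSERT */
	return proposed.counter;
}

/*
 * Conflict: BEFORE INSERT has already modified the proposed tuple, and those
 * effects are carried into EXCLUDED.  The new tuple version takes its counter
 * from EXCLUDED, and BEFORE UPDATE then modifies it a second time.
 */
static int
toy_update_path(ToyTuple proposed)
{
	ToyTuple	newtup;

	toy_before_row_trigger(&proposed);	/* BEFORE INSERT */
	newtup = proposed;					/* SET counter = EXCLUDED.counter */
	toy_before_row_trigger(&newtup);	/* BEFORE UPDATE */
	return newtup.counter;
}
```

Under these assumptions, a proposed counter of 10 is stored as 11 on the insert path but as 12 on the update path, which is the surprise that client applications need to watch for.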
In this revision, the raw grammar does not generate an ExcludedExpr.
Parse analysis of ON CONFLICT UPDATE is made to add a new relation RTE
to the auxiliary sub_pstate parser state (an alias for the target).
This makes parse analysis build a query tree that is more or less
consistent with there actually being an EXCLUDED relation. Then, as
part of query rewrite, immediately after normalizing the UPDATE
targetlist, Vars referencing the pseudo-relation (using the EXCLUDED
alias) are replaced with ExcludedExpr that references Vars in the target
relation itself.
Speculative insertion/the executor arranges to rig the Vars and
UPDATE-related/EPQ scan planstate's expression context such that values
will actually originate from the rejected tuple's slot (driven, as
always for the UPDATE's execution, by the parent INSERT ModifyTable
node, changed once per slot proposed for insertion as appropriate).
This whole mechanism is somewhat similar to the handling of trigger WHEN
clauses, where a similar dance must also occur within the executor.
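
The three stages described above (rewrite, expression initialization, evaluation) can be sketched as a self-contained toy model in C. None of these types are the real PostgreSQL nodes: ToyVar, ToyExcludedExpr, ToySlot, ToyExprContext and the TOY_* constants are hypothetical stand-ins, and error reporting is reduced to a return code. The sketch only mirrors the shape of the mechanism.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; NOT PostgreSQL's real node or executor definitions. */
enum
{
	TOY_EXCLUDED_VARNO = 1,		/* the EXCLUDED alias RTE */
	TOY_TARGET_VARNO = 2,		/* the target relation RTE */
	TOY_INNER_VAR = -1			/* "read from the inner slot" */
};

typedef struct ToyVar
{
	int			varno;
	int			varattno;		/* 1-based; <= 0 is whole-row or system */
} ToyVar;

typedef struct ToyExcludedExpr
{
	ToyVar	   *arg;
} ToyExcludedExpr;

typedef struct ToySlot
{
	const int  *values;
	int			natts;
} ToySlot;

typedef struct ToyExprContext
{
	const ToySlot *scantuple;	/* existing, conflicting tuple */
	const ToySlot *innertuple;	/* tuple rejected for insertion */
} ToyExprContext;

/* Rewrite: wrap a Var on the EXCLUDED alias, repointing it at the target. */
static int
toy_rewrite_excluded(ToyVar *var, ToyExcludedExpr *out)
{
	if (var->varno != TOY_EXCLUDED_VARNO || var->varattno <= 0)
		return -1;				/* not EXCLUDED, or whole-row/system column */
	var->varno = TOY_TARGET_VARNO;
	out->arg = var;
	return 0;
}

/* Expression initialization: rig the nested Var to use the inner slot. */
static void
toy_init_excluded(ToyExcludedExpr *expr)
{
	expr->arg->varno = TOY_INNER_VAR;
}

/* Evaluation: pick the slot by varno, then fetch the attribute. */
static int
toy_eval_var(const ToyVar *var, const ToyExprContext *econtext)
{
	const ToySlot *slot = (var->varno == TOY_INNER_VAR)
		? econtext->innertuple : econtext->scantuple;

	assert(slot != NULL);
	assert(var->varattno >= 1 && var->varattno <= slot->natts);
	return slot->values[var->varattno - 1];
}

/* ExcludedExpr evaluation is just evaluation of the nested Var. */
static int
toy_eval_excluded(const ToyExcludedExpr *expr, const ToyExprContext *econtext)
{
	return toy_eval_var(expr->arg, econtext);
}
```

In this model the parent INSERT would assign innertuple once per rejected slot before the UPDATE expressions are evaluated, so an ordinary target Var and an EXCLUDED reference to the same column read from different slots.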
Note that pg_stat_statements does not fingerprint ExcludedExpr, because
it cannot appear in the post-parse-analysis, pre-rewrite Query tree.
(pg_stat_statements does not fingerprint every primnode anyway, mostly
because some are only expected in utility statements). Other existing
Node handling sites that don't expect to see primnodes that appear only
after rewriting (ExcludedExpr may be in its own subcategory here in that
it is the only such non-utility related Node) do not have an
ExcludedExpr case added either.
---
src/backend/executor/execQual.c | 54 +++++++++++++++++++++
src/backend/executor/nodeModifyTable.c | 32 ++++++++++++
src/backend/nodes/copyfuncs.c | 16 ++++++
src/backend/nodes/equalfuncs.c | 11 +++++
src/backend/nodes/nodeFuncs.c | 38 +++++++++++++++
src/backend/nodes/outfuncs.c | 11 +++++
src/backend/nodes/readfuncs.c | 15 ++++++
src/backend/optimizer/plan/setrefs.c | 6 +++
src/backend/parser/analyze.c | 22 ++++++++-
src/backend/rewrite/rewriteHandler.c | 89 ++++++++++++++++++++++++++++++++++
src/backend/utils/adt/ruleutils.c | 39 +++++++++++++++
src/include/nodes/execnodes.h | 10 ++++
src/include/nodes/nodes.h | 2 +
src/include/nodes/primnodes.h | 47 ++++++++++++++++++
14 files changed, 391 insertions(+), 1 deletion(-)
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 0e7400f..57d726e 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -182,6 +182,9 @@ static Datum ExecEvalArrayCoerceExpr(ArrayCoerceExprState *astate,
bool *isNull, ExprDoneCond *isDone);
static Datum ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
bool *isNull, ExprDoneCond *isDone);
+static Datum ExecEvalExcluded(ExcludedExprState *excludedExpr,
+ ExprContext *econtext, bool *isNull,
+ ExprDoneCond *isDone);
/* ----------------------------------------------------------------
@@ -4338,6 +4341,33 @@ ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
return 0; /* keep compiler quiet */
}
+/* ----------------------------------------------------------------
+ * ExecEvalExcluded
+ * ----------------------------------------------------------------
+ */
+static Datum
+ExecEvalExcluded(ExcludedExprState *excludedExpr, ExprContext *econtext,
+ bool *isNull, ExprDoneCond *isDone)
+{
+ /*
+ * ExcludedExpr is essentially an expression that adapts its single Var
+ * argument to refer to the expression context inner slot's tuple, which is
+ * reserved for the purpose of referencing EXCLUDED.* tuples within ON
+ * CONFLICT UPDATE auxiliary queries' EPQ expression context (ON CONFLICT
+ * UPDATE makes special use of the EvalPlanQual() mechanism to update).
+ *
+ * nodeModifyTable.c assigns its own table slot in the auxiliary queries'
+ * EPQ expression state (originating in the parent INSERT node) on the
+ * assumption that it may only be used by ExcludedExpr, and on the
+ * assumption that the inner slot is not otherwise useful. This occurs in
+ * advance of the expression evaluation for UPDATE (which calls here are
+ * part of) once per slot proposed for insertion, and works because of
+ * restrictions on the structure of ON CONFLICT UPDATE auxiliary queries.
+ *
+ * Just evaluate nested Var.
+ */
+ return ExecEvalScalarVar(excludedExpr->arg, econtext, isNull, isDone);
+}
/*
* ExecEvalExprSwitchContext
@@ -5065,6 +5095,30 @@ ExecInitExpr(Expr *node, PlanState *parent)
state = (ExprState *) makeNode(ExprState);
state->evalfunc = ExecEvalCurrentOfExpr;
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExprState *cstate = makeNode(ExcludedExprState);
+ Var *contained = (Var*) excludedexpr->arg;
+
+ /*
+ * varno forced to INNER_VAR -- see remarks within
+ * ExecLockUpdateTuple().
+ *
+ * We rely on the assumption that the only place that
+ * ExcludedExpr may appear is where EXCLUDED Var references
+ * originally appeared after parse analysis. The rewriter
+ * replaces these with ExcludedExpr that reference the
+ * corresponding Var within the ON CONFLICT UPDATE target RTE.
+ */
+ Assert(IsA(contained, Var));
+
+ contained->varno = INNER_VAR;
+ cstate->arg = ExecInitExpr((Expr *) contained, parent);
+ state = (ExprState *) cstate;
+ state->evalfunc = (ExprStateEvalFunc) ExecEvalExcluded;
+ }
+ break;
case T_TargetEntry:
{
TargetEntry *tle = (TargetEntry *) node;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d03604c..05c78c9 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -57,6 +57,7 @@
static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
ItemPointer conflictTid,
TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
EState *estate);
@@ -420,6 +421,7 @@ vlock:
if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
&conflictTid,
planSlot,
+ slot,
onConflict,
estate))
goto vlock;
@@ -963,6 +965,7 @@ static bool
ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
ItemPointer conflictTid,
TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
EState *estate)
{
@@ -973,6 +976,7 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
HTSU_Result test;
Buffer buffer;
TupleTableSlot *slot;
+ ExprContext *econtext;
/*
* XXX We don't have the TID of the conflicting tuple if the index
@@ -1094,12 +1098,40 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
/*
+ * Save EPQ expression context. Auxiliary plan's scan node (which
+ * would have been just initialized by EvalPlanQualBegin() on the
+ * first time through here per query) cannot fail to provide one.
+ */
+ econtext = onConflict->mt_epqstate.planstate->ps_ExprContext;
+
+ /*
* UPDATE affects the same ResultRelation as INSERT in the context
* of ON CONFLICT UPDATE, so parent's target rti is used
*/
EvalPlanQualSetTuple(&onConflict->mt_epqstate,
resultRelInfo->ri_RangeTableIndex, copyTuple);
+ /*
+ * Make available rejected tuple for referencing within UPDATE
+ * expression (that is, make available a slot with the rejected
+ * tuple, possibly already modified by BEFORE INSERT row triggers).
+ *
+ * This is for the benefit of any ExcludedExpr that may appear
+ * within UPDATE's targetlist or WHERE clause. The EXCLUDED tuple
+ * may be referenced as an ExcludedExpr, which exists purely for our
+ * benefit. The nested ExcludedExpr's Var will necessarily have an
+ * INNER_VAR varno on the assumption that the inner slot of the EPQ
+ * scan plan state's expression context will contain the EXCLUDED
+ * heaptuple slot (that is, on the assumption that during
+ * expression evaluation, the ecxt_innertuple will be assigned the
+ * insertSlot by this codepath, in advance of expression
+ * evaluation).
+ *
+ * See handling of ExcludedExpr within rewriteHandler.c and
+ * execQual.c.
+ */
+ econtext->ecxt_innertuple = insertSlot;
+
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
if (!TupIsNull(slot))
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6c1a7f1..df611d2 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1779,6 +1779,19 @@ _copyCurrentOfExpr(const CurrentOfExpr *from)
}
/*
+ * _copyExcludedExpr
+ */
+static ExcludedExpr *
+_copyExcludedExpr(const ExcludedExpr *from)
+{
+ ExcludedExpr *newnode = makeNode(ExcludedExpr);
+
+ COPY_NODE_FIELD(arg);
+
+ return newnode;
+}
+
+/*
* _copyTargetEntry
*/
static TargetEntry *
@@ -4287,6 +4300,9 @@ copyObject(const void *from)
case T_CurrentOfExpr:
retval = _copyCurrentOfExpr(from);
break;
+ case T_ExcludedExpr:
+ retval = _copyExcludedExpr(from);
+ break;
case T_TargetEntry:
retval = _copyTargetEntry(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 4127269..24e58fa 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -681,6 +681,14 @@ _equalCurrentOfExpr(const CurrentOfExpr *a, const CurrentOfExpr *b)
}
static bool
+_equalExcludedExpr(const ExcludedExpr *a, const ExcludedExpr *b)
+{
+ COMPARE_NODE_FIELD(arg);
+
+ return true;
+}
+
+static bool
_equalTargetEntry(const TargetEntry *a, const TargetEntry *b)
{
COMPARE_NODE_FIELD(expr);
@@ -2720,6 +2728,9 @@ equal(const void *a, const void *b)
case T_CurrentOfExpr:
retval = _equalCurrentOfExpr(a, b);
break;
+ case T_ExcludedExpr:
+ retval = _equalExcludedExpr(a, b);
+ break;
case T_TargetEntry:
retval = _equalTargetEntry(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 4107cc9..a9e1e13 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -235,6 +235,13 @@ exprType(const Node *expr)
case T_CurrentOfExpr:
type = BOOLOID;
break;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ type = exprType((Node *) n->arg);
+ }
+ break;
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -469,6 +476,12 @@ exprTypmod(const Node *expr)
return ((const CoerceToDomainValue *) expr)->typeMod;
case T_SetToDefault:
return ((const SetToDefault *) expr)->typeMod;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ return ((const Var *) n->arg)->vartypmod;
+ }
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
default:
@@ -894,6 +907,9 @@ exprCollation(const Node *expr)
case T_CurrentOfExpr:
coll = InvalidOid; /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ coll = exprCollation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -1089,6 +1105,12 @@ exprSetCollation(Node *expr, Oid collation)
case T_CurrentOfExpr:
Assert(!OidIsValid(collation)); /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ {
+ Var *v = (Var *) ((ExcludedExpr *) expr)->arg;
+ v->varcollid = collation;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
break;
@@ -1487,6 +1509,10 @@ exprLocation(const Node *expr)
/* just use argument's location */
loc = exprLocation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_ExcludedExpr:
+ /* just use nested expr's location */
+ loc = exprLocation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
@@ -1916,6 +1942,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_ExcludedExpr:
+ return walker(((ExcludedExpr *) node)->arg, context);
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -2632,6 +2660,16 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExpr *newnode;
+
+ FLATCOPY(newnode, excludedexpr, ExcludedExpr);
+ MUTATE(newnode->arg, newnode->arg, Node *);
+ return (Node *) newnode;
+ }
+ break;
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index a32fbaa..34e9163 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1429,6 +1429,14 @@ _outCurrentOfExpr(StringInfo str, const CurrentOfExpr *node)
}
static void
+_outExcludedExpr(StringInfo str, const ExcludedExpr *node)
+{
+ WRITE_NODE_TYPE("EXCLUDED");
+
+ WRITE_NODE_FIELD(arg);
+}
+
+static void
_outTargetEntry(StringInfo str, const TargetEntry *node)
{
WRITE_NODE_TYPE("TARGETENTRY");
@@ -3069,6 +3077,9 @@ _outNode(StringInfo str, const void *obj)
case T_CurrentOfExpr:
_outCurrentOfExpr(str, obj);
break;
+ case T_ExcludedExpr:
+ _outExcludedExpr(str, obj);
+ break;
case T_TargetEntry:
_outTargetEntry(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 9f6570f..b471bbf 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1132,6 +1132,19 @@ _readCurrentOfExpr(void)
}
/*
+ * _readExcludedExpr
+ */
+static ExcludedExpr *
+_readExcludedExpr(void)
+{
+ READ_LOCALS(ExcludedExpr);
+
+ READ_NODE_FIELD(arg);
+
+ READ_DONE();
+}
+
+/*
* _readTargetEntry
*/
static TargetEntry *
@@ -1396,6 +1409,8 @@ parseNodeString(void)
return_value = _readSetToDefault();
else if (MATCH("CURRENTOFEXPR", 13))
return_value = _readCurrentOfExpr();
+ else if (MATCH("EXCLUDED", 8))
+ return_value = _readExcludedExpr();
else if (MATCH("TARGETENTRY", 11))
return_value = _readTargetEntry();
else if (MATCH("RANGETBLREF", 11))
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 3368173..9e73d6c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -792,6 +792,12 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
}
else
{
+ /*
+ * Decrement rtoffset, to compensate for dummy RTE left by
+ * EXCLUDED.* alias. Auxiliary plan will have same
+ * resultRelation from flattened RTE as its parent.
+ */
+ rtoffset -= PRS2_OLD_VARNO;
splan->onConflictPlan = (Plan *) set_plan_refs(root,
(Plan *) splan->onConflictPlan,
rtoffset);
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index caaa44c..e0ec207 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -779,7 +779,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
UpdateStmt *pupd;
Query *dqry;
ParseState *sub_pstate = make_parsestate(pstate);
- RangeTblEntry *subTarget;
+ RangeTblEntry *subTarget, *exclRte;
pupd = (UpdateStmt *) stmt->confClause->updatequery;
@@ -788,6 +788,26 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* Assign same target relation as parent InsertStmt */
pupd->relation = stmt->relation;
+ pupd->relation->alias = makeAlias("target", NIL);
+
+ /*
+ * Create EXCLUDED alias for target relation. This can be used to
+ * reference the tuple originally proposed for insertion from
+ * within the ON CONFLICT UPDATE auxiliary query.
+ *
+ * NOTE: 'EXCLUDED' will always have a varno equal to 1 (at least
+ * until rewriting, where the RTE is effectively discarded).
+ */
+ exclRte = addRangeTableEntryForRelation(sub_pstate,
+ pstate->p_target_relation,
+ makeAlias("excluded", NIL),
+ false, false);
+
+ /*
+ * Add RTE. Vars referencing the alias are rewritten to reference
+ * "target", nested within an ExcludedExpr.
+ */
+ addRTEtoQuery(sub_pstate, exclRte, false, true, true);
/*
* The optimizer is not prepared to accept a subquery RTE for a
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 5ab0cba..3db5165 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -43,6 +43,12 @@ typedef struct acquireLocksOnSubLinks_context
bool for_execute; /* AcquireRewriteLocks' forExecute param */
} acquireLocksOnSubLinks_context;
+typedef struct excluded_replace_context
+{
+ int varno; /* varno of EXCLUDED.* Vars */
+ int rvarno; /* replace varno */
+} excluded_replace_context;
+
static bool acquireLocksOnSubLinks(Node *node,
acquireLocksOnSubLinks_context *context);
static Query *rewriteRuleAction(Query *parsetree,
@@ -71,6 +77,10 @@ static Query *fireRIRrules(Query *parsetree, List *activeRIRs,
bool forUpdatePushedDown);
static bool view_has_instead_trigger(Relation view, CmdType event);
static Bitmapset *adjust_view_column_set(Bitmapset *cols, List *targetlist);
+static Node *excluded_replace_vars(Node *expr,
+ excluded_replace_context *context);
+static Node *excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context);
/*
@@ -3099,6 +3109,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
if (parsetree->specClause == SPEC_INSERT)
{
Query *qry;
+ excluded_replace_context context;
/*
* While user-defined rules will never be applied in the
@@ -3107,6 +3118,35 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
qry = (Query *) parsetree->onConflict;
rewriteTargetListIU(qry, rt_entry_relation, NULL);
+
+ /*
+ * Replace OLD Vars (associated with the EXCLUDED.* alias) with
+ * first (and only) "real" relation RTE in rtable. This allows
+ * the implementation to treat EXCLUDED.* as an alias for the
+ * target relation, which is useful during parse analysis,
+ * while ultimately having those references rewritten as
+ * special ExcludedExpr references to the corresponding Var in
+ * the target RTE.
+ *
+ * This is necessary because while we want a join-like syntax
+ * for aesthetic reasons, the resemblance is superficial. In
+ * fact, execution of the ModifyTable node (and its direct
+ * child auxiliary query) manages tupleslot state directly, and
+ * is directly tasked with making available the appropriate
+ * tupleslot to the expression context.
+ *
+ * This is a kludge, but appears necessary, since the slot made
+ * available for referencing via ExcludedExpr is in fact the
+ * slot just excluded from insertion by speculative insertion
+ * (with the effects of BEFORE ROW INSERT triggers carried).
+ * An ad-hoc method for making the excluded tuple available
+ * within the auxiliary expression context is appropriate.
+ */
+ context.varno = PRS2_OLD_VARNO;
+ context.rvarno = PRS2_OLD_VARNO + 1;
+
+ parsetree->onConflict =
+ excluded_replace_vars(parsetree->onConflict, &context);
}
}
else if (event == CMD_UPDATE)
@@ -3428,3 +3468,52 @@ QueryRewrite(Query *parsetree)
return results;
}
+
+/*
+ * Apply pullup variable replacement throughout an expression tree
+ *
+ * Returns modified tree, with Vars of user-specified varno replaced by
+ * ExcludedExprs whose Vars reference rvarno.
+ */
+static Node *
+excluded_replace_vars(Node *expr, excluded_replace_context *context)
+{
+ /*
+ * Don't recurse into subqueries; they're forbidden in auxiliary ON
+ * CONFLICT query
+ */
+ return replace_rte_variables(expr,
+ context->varno, 0,
+ excluded_replace_vars_callback,
+ (void *) context,
+ NULL);
+}
+
+static Node *
+excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context)
+{
+ excluded_replace_context *rcon = (excluded_replace_context *) context->callback_arg;
+ ExcludedExpr *n = makeNode(ExcludedExpr);
+
+ /* Replace with an enclosing ExcludedExpr */
+ var->varno = rcon->rvarno;
+ n->arg = (Node *) var;
+
+ /*
+ * Would have to adjust varlevelsup if referenced item is from higher query
+ * (should not happen)
+ */
+ Assert(var->varlevelsup == 0);
+
+ if (var->varattno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference system column using EXCLUDED.* alias")));
+
+ if (var->varattno == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference whole-row using EXCLUDED.* alias")));
+
+ return (Node*) n;
+}
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index dd748ac..84f344a 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5618,6 +5618,24 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
return NULL;
}
+ else if (var->varno == INNER_VAR)
+ {
+ /* Assume an EXCLUDED variable */
+ rte = rt_fetch(PRS2_OLD_VARNO, dpns->rtable);
+
+ /*
+ * Sanity check: EXCLUDED.* Vars should only appear in auxiliary ON
+ * CONFLICT UPDATE queries. Assert that rte and planstate are
+ * consistent with that.
+ */
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(IsA(dpns->planstate, SeqScanState) ||
+ IsA(dpns->planstate, ResultState));
+
+ refname = "excluded";
+ colinfo = deparse_columns_fetch(PRS2_OLD_VARNO, dpns);
+ attnum = var->varattno;
+ }
else
{
elog(ERROR, "bogus varno: %d", var->varno);
@@ -6358,6 +6376,7 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags)
case T_CoerceToDomainValue:
case T_SetToDefault:
case T_CurrentOfExpr:
+ case T_ExcludedExpr:
/* single words: always simple */
return true;
@@ -7583,6 +7602,26 @@ get_rule_expr(Node *node, deparse_context *context,
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ Var *variable = (Var *) excludedexpr->arg;
+ bool save_varprefix;
+
+ /*
+ * Force parentheses because our caller probably assumed our
+ * Var is a simple expression.
+ */
+ appendStringInfoChar(buf, '(');
+ save_varprefix = context->varprefix;
+ /* Ensure EXCLUDED.* prefix is always visible */
+ context->varprefix = true;
+ get_rule_expr((Node *) variable, context, true);
+ context->varprefix = save_varprefix;
+ appendStringInfoChar(buf, ')');
+ }
+ break;
+
case T_List:
{
char *sep;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 19b5e29..0274ebc 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -973,6 +973,16 @@ typedef struct DomainConstraintState
ExprState *check_expr; /* for CHECK, a boolean expression */
} DomainConstraintState;
+/* ----------------
+ * ExcludedExprState node
+ * ----------------
+ */
+typedef struct ExcludedExprState
+{
+ ExprState xprstate;
+ ExprState *arg; /* the argument */
+} ExcludedExprState;
+
/* ----------------------------------------------------------------
* Executor State Trees
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cac6b15..ca568a2 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -168,6 +168,7 @@ typedef enum NodeTag
T_CoerceToDomainValue,
T_SetToDefault,
T_CurrentOfExpr,
+ T_ExcludedExpr,
T_TargetEntry,
T_RangeTblRef,
T_JoinExpr,
@@ -207,6 +208,7 @@ typedef enum NodeTag
T_NullTestState,
T_CoerceToDomainState,
T_DomainConstraintState,
+ T_ExcludedExprState,
/*
* TAGS FOR PLANNER NODES (relation.h)
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1d06f42..21c39dc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1147,6 +1147,53 @@ typedef struct CurrentOfExpr
int cursor_param; /* refcursor parameter number, or 0 */
} CurrentOfExpr;
+/*
+ * ExcludedExpr - an EXCLUDED.* expression
+ *
+ * During parse analysis of ON CONFLICT UPDATE auxiliary queries, a dummy
+ * EXCLUDED range table entry is generated, which is actually just an alias for
+ * the target relation. This is useful during parse analysis, allowing the
+ * parser to produce simple error messages, for example. There is the
+ * appearance of a join within the auxiliary ON CONFLICT UPDATE, superficially
+ * similar to a join in an UPDATE ... FROM; this is a limited, ad-hoc join
+ * though, as the executor needs to tightly control the referenced tuple/slot
+ * through which update evaluation references excluded values originally
+ * proposed for insertion. Note that EXCLUDED.* values carry forward the
+ * effects of BEFORE ROW INSERT triggers.
+ *
+ * To implement a limited "join" for ON CONFLICT UPDATE auxiliary queries,
+ * during the rewrite stage, Vars referencing the alias EXCLUDED.* RTE are
+ * swapped with ExcludedExprs, which also contain Vars; their Vars are
+ * equivalent, but reference the target instead. The ExcludedExpr Var actually
+ * evaluates against varno INNER_VAR during expression evaluation (and not a
+ * varno INDEX_VAR associated with an entry in the flattened range table
+ * representing the target, which is necessarily being scanned whenever an
+ * ExcludedExpr is evaluated) while still being logically associated with the
+ * target. The Var is only rigged to reference the inner slot during
+ * ExcludedExpr initialization. The executor closely controls the evaluation
+ * expression, installing the EXCLUDED slot actually excluded from insertion
+ * into the inner slot of the child/auxiliary evaluation context in an ad-hoc
+ * fashion, which, after ExcludedExpr initialization, is expected (i.e. it is
+ * expected during ExcludedExpr evaluation that the parent insert will make
+ * each excluded tuple available in the inner slot in turn). ExcludedExprs are
+ * only ever evaluated during special speculative insertion related EPQ
+ * expression evaluation, purely for the benefit of auxiliary UPDATE
+ * expressions.
+ *
+ * Aside from representing a logical choke point for this special expression
+ * evaluation, having a dedicated primnode also prevents the optimizer from
+ * considering various optimizations that might otherwise be attempted.
+ * Obviously there is no useful join optimization possible within the auxiliary
+ * query, and an ExcludedExpr based post-rewrite query tree representation is a
+ * convenient way of preventing that, as well as related inapplicable
+ * optimizations concerning the equivalence of Vars.
+ */
+typedef struct ExcludedExpr
+{
+ Expr xpr;
+ Node *arg; /* argument (Var) */
+} ExcludedExpr;
+
/*--------------------
* TargetEntry -
* a target entry (used in query target lists)
--
1.9.1
Attachment: 0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 1e29ca5e2f189d4104dfc87304f2bc702fccd29e Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:01:32 -0700
Subject: [PATCH 2/8] Support INSERT ... ON CONFLICT {UPDATE | IGNORE}
This non-standard INSERT clause allows DML statement authors to specify
that in the event of any of the tuples being inserted
duplicating an existing tuple in terms of a value or set of values
constrained by a unique index, an alternative path may be taken. The
statement may alternatively IGNORE the tuple being inserted without
raising an error, or go to UPDATE the existing tuple whose value is
duplicated by a value within one single tuple proposed for insertion.
The implementation loops until either an insert or an UPDATE/IGNORE
occurs. No existing tuple may be affected more than once per INSERT.
This is implemented using a new infrastructure called "speculative
insertion". (The approach to "Value locking" presented here follows
design #2, as described on the value locking Postgres Wiki page).
Alternatively, we may go to UPDATE, using the EvalPlanQual() mechanism
to execute a special auxiliary plan.
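
The control flow of that loop can be sketched as a small single-process toy in C. This is only an illustration under stated assumptions: ToyTable, toy_find(), toy_lock() and toy_upsert() are hypothetical names, the vanish_once flag merely simulates a concurrent delete of the conflicting row between conflict detection and locking, and no real value locking, MVCC, or EvalPlanQual() processing is modelled.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ROWS 8

/* Toy stand-ins; hypothetical, not the patch's real data structures. */
typedef enum { TOY_IGNORE, TOY_UPDATE } ToySpec;
typedef enum { TOY_INSERTED, TOY_UPDATED, TOY_IGNORED, TOY_FULL } ToyOutcome;

typedef struct ToyRow
{
	int			key;
	int			val;
	int			live;
} ToyRow;

typedef struct ToyTable
{
	ToyRow		rows[TOY_MAX_ROWS];
	int			nrows;
	int			vanish_once;	/* simulate one concurrent delete */
} ToyTable;

/* Find a live row with the given unique key, or NULL. */
static ToyRow *
toy_find(ToyTable *t, int key)
{
	for (int i = 0; i < t->nrows; i++)
		if (t->rows[i].live && t->rows[i].key == key)
			return &t->rows[i];
	return NULL;
}

/* Try to lock the conflicting row; fails if it vanished in the meantime. */
static int
toy_lock(ToyTable *t, ToyRow *row)
{
	if (t->vanish_once)
	{
		t->vanish_once = 0;
		row->live = 0;
		return 0;
	}
	return 1;
}

/* Loop until an insert, or an UPDATE/IGNORE, actually occurs. */
static ToyOutcome
toy_upsert(ToyTable *t, int key, int val, ToySpec spec, int *retries)
{
	*retries = 0;
	for (;;)
	{
		ToyRow	   *conflict = toy_find(t, key);

		if (conflict == NULL)
		{
			if (t->nrows == TOY_MAX_ROWS)
				return TOY_FULL;
			t->rows[t->nrows++] = (ToyRow) {key, val, 1};
			return TOY_INSERTED;
		}
		if (spec == TOY_IGNORE)
			return TOY_IGNORED;
		if (!toy_lock(t, conflict))
		{
			(*retries)++;		/* conflict went away: start over */
			continue;
		}
		conflict->val = val;	/* take the UPDATE path */
		return TOY_UPDATED;
	}
}
```

The retry branch plays the role of the "goto vlock" in the real code: when the row that caused the conflict cannot be locked, nothing has been decided yet and the whole attempt restarts.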
READ COMMITTED isolation level is permitted to UPDATE a tuple even where
no version is visible to the command's MVCC snapshot. Similarly, any
query predicate associated with the UPDATE portion of the new statement
need only satisfy an already locked, conclusively committed and visible
conflict tuple. When the predicate isn't satisfied, the tuple is still
locked, which implies that at READ COMMITTED, a tuple may be locked
without any version being visible to the command's MVCC snapshot.
Users specify a single unique index to take the alternative path on,
which is inferred from a set of user-supplied column names (or
expressions). This is mandatory for the ON CONFLICT UPDATE variant,
which should address concerns about spuriously taking an incorrect
alternative ON CONFLICT path (i.e. the wrong unique index is used for
arbitration of whether or not to take the alternative path) due to there
being more than one would-be unique violation. Previous revisions of
the patch didn't mandate this. However, we may still IGNORE based on
the first would-be unique violation detected, on the assumption that it
doesn't particularly matter where it originated from for that variant
(iff the user didn't make a point of indicating his or her intent).
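The arbitration rule above can be sketched as a standalone function that
mirrors the uniqueness-check selection this patch adds to
ExecInsertIndexTuples(). The field and parameter names (indisunique,
indimmediate, noDupErr, arbiterIdx) follow the patch, but ToyIndex, the
local Oid typedef, the trimmed-down IndexUniqueCheck enum and
choose_unique_check() are simplified stand-ins introduced for
illustration only, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for PostgreSQL definitions (assumptions). */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef enum
{
    UNIQUE_CHECK_NO,        /* not a unique index: no check at all */
    UNIQUE_CHECK_YES,       /* enforce uniqueness now, error on duplicate */
    UNIQUE_CHECK_PARTIAL    /* only report a possible duplicate */
} IndexUniqueCheck;

typedef struct
{
    Oid  indexrelid;
    bool indisunique;
    bool indimmediate;      /* false for a deferrable unique constraint */
} ToyIndex;

/*
 * Decide how strictly one index should check uniqueness for an insertion.
 * noDupErr is true for speculative insertion (ON CONFLICT UPDATE/IGNORE);
 * arbiterIdx is the inferred unique index, or InvalidOid when none was
 * specified (permitted only for the IGNORE variant).
 */
IndexUniqueCheck
choose_unique_check(const ToyIndex *index, bool noDupErr, Oid arbiterIdx)
{
    if (!index->indisunique)
        return UNIQUE_CHECK_NO;

    /* No arbiter given: any unique index may trigger the alternative path */
    if (noDupErr && arbiterIdx == InvalidOid)
        return UNIQUE_CHECK_PARTIAL;

    /* Arbiter given: only that index may trigger the alternative path */
    if (noDupErr && arbiterIdx == index->indexrelid)
        return UNIQUE_CHECK_PARTIAL;

    if (index->indimmediate)
        return UNIQUE_CHECK_YES;

    /* Deferred unique constraint: detect now, recheck later */
    return UNIQUE_CHECK_PARTIAL;
}
```

With an arbiter index specified, a duplicate in any other unique index
still gets the ordinary UNIQUE_CHECK_YES treatment and so raises a
unique violation as usual; only the arbiter can divert execution to the
alternative path.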
The auxiliary ModifyTable plan used by the UPDATE portion of the new
statement is not formally a subplan of its parent INSERT ModifyTable
plan. Rather, it's an independently planned subquery, whose execution
is tightly driven by its parent. Special auxiliary state pertaining to
the auxiliary UPDATE is tracked by its parent through all stages of
query execution.
The implementation imposes some restrictions on child auxiliary UPDATE
plans, which make the plans comport with their parent to the extent
required during the executor stage. One user-visible consequence of
this is that the special auxiliary UPDATE query cannot have subselects
within its targetlist or WHERE clause. UPDATEs may not reference any
other table, and UPDATE FROM is disallowed. INSERT's RETURNING clause
projects tuples successfully inserted (in a later commit, it is made to
project tuples inserted and updated, though).
---
contrib/pg_stat_statements/pg_stat_statements.c | 5 +
contrib/postgres_fdw/deparse.c | 7 +-
contrib/postgres_fdw/postgres_fdw.c | 16 +-
contrib/postgres_fdw/postgres_fdw.h | 2 +-
src/backend/access/heap/heapam.c | 97 ++++-
src/backend/access/nbtree/nbtinsert.c | 32 +-
src/backend/catalog/index.c | 52 ++-
src/backend/commands/constraint.c | 7 +-
src/backend/commands/copy.c | 5 +-
src/backend/commands/explain.c | 87 ++++-
src/backend/executor/execMain.c | 14 +-
src/backend/executor/execUtils.c | 244 +++++++++++--
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeModifyTable.c | 453 +++++++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 39 ++
src/backend/nodes/equalfuncs.c | 32 ++
src/backend/nodes/nodeFuncs.c | 36 ++
src/backend/nodes/outfuncs.c | 7 +
src/backend/nodes/readfuncs.c | 4 +
src/backend/optimizer/path/indxpath.c | 56 +++
src/backend/optimizer/path/tidpath.c | 8 +-
src/backend/optimizer/plan/createplan.c | 16 +-
src/backend/optimizer/plan/planner.c | 50 +++
src/backend/optimizer/plan/setrefs.c | 25 +-
src/backend/optimizer/plan/subselect.c | 6 +
src/backend/optimizer/util/plancat.c | 222 +++++++++++-
src/backend/parser/analyze.c | 100 +++++-
src/backend/parser/gram.y | 74 +++-
src/backend/parser/parse_agg.c | 7 +
src/backend/parser/parse_clause.c | 163 +++++++++
src/backend/parser/parse_expr.c | 3 +
src/backend/rewrite/rewriteHandler.c | 26 ++
src/backend/storage/ipc/procarray.c | 96 +++++
src/backend/storage/lmgr/lmgr.c | 68 ++++
src/backend/utils/adt/lockfuncs.c | 1 +
src/backend/utils/time/tqual.c | 45 +++
src/include/access/heapam.h | 3 +-
src/include/access/heapam_xlog.h | 2 +
src/include/executor/executor.h | 19 +-
src/include/nodes/execnodes.h | 9 +
src/include/nodes/nodes.h | 14 +
src/include/nodes/parsenodes.h | 38 +-
src/include/nodes/plannodes.h | 3 +
src/include/optimizer/paths.h | 1 +
src/include/optimizer/plancat.h | 2 +
src/include/optimizer/planmain.h | 3 +-
src/include/parser/kwlist.h | 2 +
src/include/parser/parse_clause.h | 2 +
src/include/parser/parse_node.h | 1 +
src/include/storage/lmgr.h | 5 +
src/include/storage/lock.h | 10 +
src/include/storage/proc.h | 10 +
src/include/storage/procarray.h | 7 +
src/include/utils/snapshot.h | 11 +
54 files changed, 2125 insertions(+), 131 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 2629bfc..d5f3c81 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2198,6 +2198,11 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
JumbleRangeTable(jstate, query->rtable);
JumbleExpr(jstate, (Node *) query->jointree);
JumbleExpr(jstate, (Node *) query->targetList);
+ APP_JUMB(query->specClause);
+ JumbleExpr(jstate, (Node *) query->arbiterExpr);
+ JumbleExpr(jstate, query->arbiterWhere);
+ if (query->onConflict)
+ JumbleQuery(jstate, (Query *) query->onConflict);
JumbleExpr(jstate, (Node *) query->returningList);
JumbleExpr(jstate, (Node *) query->groupClause);
JumbleExpr(jstate, query->havingQual);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 59cb053..ca51586 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -847,8 +847,8 @@ appendWhereClause(StringInfo buf,
void
deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
- List **retrieved_attrs)
+ List *targetAttrs, bool ignore,
+ List *returningList, List **retrieved_attrs)
{
AttrNumber pindex;
bool first;
@@ -892,6 +892,9 @@ deparseInsertSql(StringInfo buf, PlannerInfo *root,
else
appendStringInfoString(buf, " DEFAULT VALUES");
+ if (ignore)
+ appendStringInfoString(buf, " ON CONFLICT IGNORE");
+
deparseReturningList(buf, root, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_insert_after_row,
returningList, retrieved_attrs);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..1539899 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1167,6 +1167,7 @@ postgresPlanForeignModify(PlannerInfo *root,
List *targetAttrs = NIL;
List *returningList = NIL;
List *retrieved_attrs = NIL;
+ bool ignore = false;
initStringInfo(&sql);
@@ -1201,7 +1202,7 @@ postgresPlanForeignModify(PlannerInfo *root,
int col;
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -1218,6 +1219,17 @@ postgresPlanForeignModify(PlannerInfo *root,
if (plan->returningLists)
returningList = (List *) list_nth(plan->returningLists, subplan_index);
+ if (root->parse->arbiterExpr)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT unique index inference")));
+ else if (plan->spec == SPEC_INSERT)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT UPDATE")));
+ else if (plan->spec == SPEC_IGNORE)
+ ignore = true;
+
/*
* Construct the SQL command string.
*/
@@ -1225,7 +1237,7 @@ postgresPlanForeignModify(PlannerInfo *root,
{
case CMD_INSERT:
deparseInsertSql(&sql, root, resultRelation, rel,
- targetAttrs, returningList,
+ targetAttrs, ignore, returningList,
&retrieved_attrs);
break;
case CMD_UPDATE:
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..3763a57 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -60,7 +60,7 @@ extern void appendWhereClause(StringInfo buf,
List **params);
extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
+ List *targetAttrs, bool ignore, List *returningList,
List **retrieved_attrs);
extern void deparseUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 21e9d06..3a9d40b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
*
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
* Note that these options will be applied when inserting into the heap's
* TOAST table, too, if the tuple requires any out-of-line data.
*
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
END_CRIT_SECTION();
+ /*
+ * Let others know that we speculatively inserted this tuple, before
+ * releasing the buffer lock.
+ */
+ if (options & HEAP_INSERT_SPECULATIVE)
+ SetSpeculativeInsertionTid(relation->rd_node, &heaptup->t_self);
+
UnlockReleaseBuffer(buffer);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we "super-delete" a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id. Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
*/
HTSU_Result
heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd)
+ HeapUpdateFailureData *hufd, bool killspeculative)
{
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ if (!killspeculative)
+ {
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ }
+ else
+ {
+ if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+ tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+ elog(ERROR, "attempted to super-delete a tuple from other CID");
+ result = HeapTupleMayBeUpdated;
+ }
+
if (result == HeapTupleInvisible)
{
@@ -2823,12 +2850,15 @@ l1:
* using our own TransactionId below, since some other backend could
* incorporate our XID into a MultiXact immediately afterwards.)
*/
- MultiXactIdSetOldestMember();
+ if (!killspeculative)
+ {
+ MultiXactIdSetOldestMember();
- compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
- tp.t_data->t_infomask, tp.t_data->t_infomask2,
- xid, LockTupleExclusive, true,
- &new_xmax, &new_infomask, &new_infomask2);
+ compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
+ tp.t_data->t_infomask, tp.t_data->t_infomask2,
+ xid, LockTupleExclusive, true,
+ &new_xmax, &new_infomask, &new_infomask2);
+ }
START_CRIT_SECTION();
@@ -2855,8 +2885,23 @@ l1:
tp.t_data->t_infomask |= new_infomask;
tp.t_data->t_infomask2 |= new_infomask2;
HeapTupleHeaderClearHotUpdated(tp.t_data);
- HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
- HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ /*
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
+ * instead of setting xmax, to make the tuple clearly invisible to
+ * everyone. In particular, we want HeapTupleSatisfiesDirty() to regard
+ * the tuple as dead, so that another backend inserting a duplicate key
+ * value won't unnecessarily wait for our transaction to finish.
+ */
+ if (!killspeculative)
+ {
+ HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
+ HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ }
+ else
+ {
+ HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ }
+
/* Make sure there is no forward chain link in t_ctid */
tp.t_data->t_ctid = tp.t_self;
@@ -2872,7 +2917,11 @@ l1:
if (RelationIsAccessibleInLogicalDecoding(relation))
log_heap_new_cid(relation, &tp);
- xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
+ xlrec.flags = 0;
+ if (all_visible_cleared)
+ xlrec.flags |= XLOG_HEAP_ALL_VISIBLE_CLEARED;
+ if (killspeculative)
+ xlrec.flags |= XLOG_HEAP_KILLED_SPECULATIVE_TUPLE;
xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask,
tp.t_data->t_infomask2);
xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
@@ -2977,7 +3026,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
result = heap_delete(relation, tid,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd, false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -4070,14 +4119,16 @@ get_mxact_status_for_lock(LockTupleMode mode, bool is_update)
*
* Function result may be:
* HeapTupleMayBeUpdated: lock was successfully acquired
+ * HeapTupleInvisible: lock failed because tuple instantaneously invisible
* HeapTupleSelfUpdated: lock failed because tuple updated by self
* HeapTupleUpdated: lock failed because tuple updated by other xact
* HeapTupleWouldBlock: lock couldn't be acquired and wait_policy is skip
*
- * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
- * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
- * (the last only for HeapTupleSelfUpdated, since we
- * cannot obtain cmax from a combocid generated by another transaction).
+ * In the failure cases other than HeapTupleInvisible, the routine fills
+ * *hufd with the tuple's t_ctid, t_xmax (resolving a possible MultiXact,
+ * if necessary), and t_cmax (the last only for HeapTupleSelfUpdated,
+ * since we cannot obtain cmax from a combocid generated by another
+ * transaction).
* See comments for struct HeapUpdateFailureData for additional info.
*
* See README.tuplock for a thorough explanation of this mechanism.
@@ -4115,8 +4166,15 @@ l3:
if (result == HeapTupleInvisible)
{
- UnlockReleaseBuffer(*buffer);
- elog(ERROR, "attempted to lock invisible tuple");
+ LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+
+ /*
+ * This is possible, but only when locking a tuple for speculative
+ * insertion. We return this value here rather than throwing an error
+ * in order to give that case the opportunity to throw a more specific
+ * error.
+ */
+ return HeapTupleInvisible;
}
else if (result == HeapTupleBeingUpdated)
{
@@ -7326,7 +7384,10 @@ heap_xlog_delete(XLogReaderState *record)
HeapTupleHeaderClearHotUpdated(htup);
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ else
+ HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 7db8a96..5062279 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -51,7 +51,8 @@ static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
Relation heapRel, Buffer buf, OffsetNumber offset,
ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique);
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken);
static void _bt_findinsertloc(Relation rel,
Buffer *bufptr,
OffsetNumber *offsetptr,
@@ -159,17 +160,27 @@ top:
*/
if (checkUnique != UNIQUE_CHECK_NO)
{
- TransactionId xwait;
+ TransactionId xwait;
+ uint32 speculativeToken;
offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey,
- checkUnique, &is_unique);
+ checkUnique, &is_unique, &speculativeToken);
if (TransactionIdIsValid(xwait))
{
/* Have to wait for the other guy ... */
_bt_relbuf(rel, buf);
- XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+ /*
+ * If it's a speculative insertion, wait for it to finish (ie.
+ * to go ahead with the insertion, or kill the tuple). Otherwise
+ * wait for the transaction to finish as usual.
+ */
+ if (speculativeToken)
+ SpeculativeInsertionWait(xwait, speculativeToken);
+ else
+ XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+
/* start over... */
_bt_freestack(stack);
goto top;
@@ -211,9 +222,12 @@ top:
* also point to end-of-page, which means that the first tuple to check
* is the first tuple on the next page.
*
- * Returns InvalidTransactionId if there is no conflict, else an xact ID
- * we must wait for to see if it commits a conflicting tuple. If an actual
- * conflict is detected, no return --- just ereport().
+ * Returns InvalidTransactionId if there is no conflict, else an xact ID we
+ * must wait for to see if it commits a conflicting tuple. If an actual
+ * conflict is detected, no return --- just ereport(). If an xact ID is
+ * returned, and the conflicting tuple still has a speculative insertion in
+ * progress, *speculativeToken is set to non-zero, and the caller can wait for
+ * the verdict on the insertion using SpeculativeInsertionWait().
*
* However, if checkUnique == UNIQUE_CHECK_PARTIAL, we always return
* InvalidTransactionId because we don't want to wait. In this case we
@@ -223,7 +237,8 @@ top:
static TransactionId
_bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
Buffer buf, OffsetNumber offset, ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique)
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int natts = rel->rd_rel->relnatts;
@@ -340,6 +355,7 @@ _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
if (nbuf != InvalidBuffer)
_bt_relbuf(rel, nbuf);
/* Tell _bt_doinsert to wait... */
+ *speculativeToken = SnapshotDirty.speculativeToken;
return xwait;
}
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 9bb9deb..b9c5c81 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1659,8 +1659,50 @@ BuildIndexInfo(Relation index)
ii->ii_ExclusionStrats = NULL;
}
+ /*
+ * fetch info for checking unique constraints. (this is currently only used
+ * by ExecCheckIndexConstraints(), for INSERT ... ON CONFLICT UPDATE, which
+ * must support "speculative insertion". In regular insertions, the index
+ * AM handles the unique check itself. Might make sense to do this lazily,
+ * only when needed)
+ */
+ if (indexStruct->indisunique)
+ {
+ int ncols = index->rd_rel->relnatts;
+
+ if (index->rd_rel->relam != BTREE_AM_OID)
+ elog(ERROR, "only b-tree indexes are supported for unique index checks");
+
+ ii->ii_UniqueOps = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueProcs = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueStrats = (uint16 *) palloc(sizeof(uint16) * ncols);
+
+ /*
+ * We have to look up the operator's strategy number. This
+ * provides a cross-check that the operator does match the index.
+ */
+ /* We need the func OIDs and strategy numbers too */
+ for (i = 0; i < ncols; i++)
+ {
+ ii->ii_UniqueStrats[i] = BTEqualStrategyNumber;
+ ii->ii_UniqueOps[i] =
+ get_opfamily_member(index->rd_opfamily[i],
+ index->rd_opcintype[i],
+ index->rd_opcintype[i],
+ ii->ii_UniqueStrats[i]);
+ ii->ii_UniqueProcs[i] = get_opcode(ii->ii_UniqueOps[i]);
+ }
+ ii->ii_Unique = true;
+ }
+ else
+ {
+ ii->ii_UniqueOps = NULL;
+ ii->ii_UniqueProcs = NULL;
+ ii->ii_UniqueStrats = NULL;
+ ii->ii_Unique = false;
+ }
+
/* other info */
- ii->ii_Unique = indexStruct->indisunique;
ii->ii_ReadyForInserts = IndexIsReady(indexStruct);
/* initialize index-build state to default */
@@ -2606,10 +2648,10 @@ IndexCheckExclusion(Relation heapRelation,
/*
* Check that this tuple has no conflicts.
*/
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- &(heapTuple->t_self), values, isnull,
- estate, true, false);
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &(heapTuple->t_self),
+ values, isnull, estate, true,
+ false, true, NULL);
}
heap_endscan(scan);
diff --git a/src/backend/commands/constraint.c b/src/backend/commands/constraint.c
index 561d8fa..d5ab12f 100644
--- a/src/backend/commands/constraint.c
+++ b/src/backend/commands/constraint.c
@@ -170,9 +170,10 @@ unique_key_recheck(PG_FUNCTION_ARGS)
* For exclusion constraints we just do the normal check, but now it's
* okay to throw error.
*/
- check_exclusion_constraint(trigdata->tg_relation, indexRel, indexInfo,
- &(new_row->t_self), values, isnull,
- estate, false, false);
+ check_exclusion_or_unique_constraint(trigdata->tg_relation, indexRel,
+ indexInfo, &(new_row->t_self),
+ values, isnull, estate, false,
+ false, true, NULL);
}
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cf95aa8..be4f76e 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2426,7 +2426,8 @@ CopyFrom(CopyState cstate)
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false,
+ InvalidOid);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, tuple,
@@ -2533,7 +2534,7 @@ CopyFromInsertBatch(CopyState cstate, EState *estate, CommandId mycid,
ExecStoreTuple(bufferedTuples[i], myslot, InvalidBuffer, false);
recheckIndexes =
ExecInsertIndexTuples(myslot, &(bufferedTuples[i]->t_self),
- estate);
+ estate, false, InvalidOid);
ExecARInsertTriggers(estate, resultRelInfo,
bufferedTuples[i],
recheckIndexes);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 8a0be5d..924b361 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -97,7 +97,8 @@ static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
static void ExplainScanTarget(Scan *plan, ExplainState *es);
static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
-static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es);
+static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors);
static void ExplainMemberNodes(List *plans, PlanState **planstates,
List *ancestors, ExplainState *es);
static void ExplainSubPlans(List *plans, List *ancestors,
@@ -755,6 +756,9 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
ExplainPreScanMemberNodes(((ModifyTable *) plan)->plans,
((ModifyTableState *) planstate)->mt_plans,
rels_used);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainPreScanNode(((ModifyTableState *) planstate)->onConflict,
+ rels_used);
break;
case T_Append:
ExplainPreScanMemberNodes(((Append *) plan)->appendplans,
@@ -856,6 +860,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
const char *custom_name = NULL;
int save_indent = es->indent;
bool haschildren;
+ bool suppresschildren = false;
+ ModifyTable *mtplan;
switch (nodeTag(plan))
{
@@ -864,13 +870,33 @@ ExplainNode(PlanState *planstate, List *ancestors,
break;
case T_ModifyTable:
sname = "ModifyTable";
- switch (((ModifyTable *) plan)->operation)
+ mtplan = (ModifyTable *) plan;
+ switch (mtplan->operation)
{
case CMD_INSERT:
pname = operation = "Insert";
break;
case CMD_UPDATE:
- pname = operation = "Update";
+ if (mtplan->spec == SPEC_NONE)
+ {
+ pname = operation = "Update";
+ }
+ else
+ {
+ Assert(mtplan->spec == SPEC_UPDATE);
+
+ pname = operation = "Conflict Update";
+
+ /*
+ * Do not display child sequential scan/result node.
+ * Quals from child will be directly attributed to
+ * ModifyTable node, since we prefer to avoid
+ * displaying scan node to users, as it is merely an
+ * implementation detail; it is never executed in the
+ * conventional way.
+ */
+ suppresschildren = true;
+ }
break;
case CMD_DELETE:
pname = operation = "Delete";
@@ -1450,7 +1476,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate, es);
break;
case T_ModifyTable:
- show_modifytable_info((ModifyTableState *) planstate, es);
+ show_modifytable_info((ModifyTableState *) planstate, es,
+ ancestors);
break;
case T_Hash:
show_hash_info((HashState *) planstate, es);
@@ -1578,7 +1605,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate->subPlan;
if (haschildren)
{
- ExplainOpenGroup("Plans", "Plans", false, es);
+ if (!suppresschildren)
+ ExplainOpenGroup("Plans", "Plans", false, es);
/* Pass current PlanState as head of ancestors list for children */
ancestors = lcons(planstate, ancestors);
}
@@ -1601,9 +1629,13 @@ ExplainNode(PlanState *planstate, List *ancestors,
switch (nodeTag(plan))
{
case T_ModifyTable:
- ExplainMemberNodes(((ModifyTable *) plan)->plans,
- ((ModifyTableState *) planstate)->mt_plans,
- ancestors, es);
+ if (((ModifyTable *) plan)->spec != SPEC_UPDATE)
+ ExplainMemberNodes(((ModifyTable *) plan)->plans,
+ ((ModifyTableState *) planstate)->mt_plans,
+ ancestors, es);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainNode(((ModifyTableState *) planstate)->onConflict,
+ ancestors, "Member", NULL, es);
break;
case T_Append:
ExplainMemberNodes(((Append *) plan)->appendplans,
@@ -1641,7 +1673,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (haschildren)
{
ancestors = list_delete_first(ancestors);
- ExplainCloseGroup("Plans", "Plans", false, es);
+
+ if (!suppresschildren)
+ ExplainCloseGroup("Plans", "Plans", false, es);
}
/* in text format, undo whatever indentation we added */
@@ -2119,6 +2153,15 @@ ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
rti = linitial_int(plan->resultRelations);
ExplainTargetRel((Plan *) plan, rti, es);
+
+ if (plan->arbiterIndex != InvalidOid)
+ {
+ char *indexname = get_rel_name(plan->arbiterIndex);
+
+ /* nothing to do for text format explains */
+ if (es->format != EXPLAIN_FORMAT_TEXT && indexname != NULL)
+ ExplainPropertyText("Arbiter Index", indexname, es);
+ }
}
/*
@@ -2154,6 +2197,12 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
if (es->verbose)
namespace = get_namespace_name(get_rel_namespace(rte->relid));
objecttag = "Relation Name";
+
+ /*
+ * ON CONFLICT's "TARGET" alias will not appear in output for
+ * auxiliary ModifyTable as its alias, because target
+ * resultRelation is shared between parent and auxiliary queries
+ */
break;
case T_FunctionScan:
{
@@ -2232,7 +2281,8 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
* Show extra information for a ModifyTable node
*/
static void
-show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
+show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors)
{
FdwRoutine *fdwroutine = mtstate->resultRelInfo->ri_FdwRoutine;
@@ -2254,6 +2304,23 @@ show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
0,
es);
}
+ else if (mtstate->spec == SPEC_UPDATE)
+ {
+ PlanState *ps = (*mtstate->mt_plans);
+
+ /*
+ * Seqscan node is always used, unless optimizer determined that
+ * predicate precludes ever updating, in which case a simple Result
+ * node is possible
+ */
+ Assert(IsA(ps->plan, SeqScan) || IsA(ps->plan, Result));
+
+ /* Attribute child scan node's qual to ModifyTable node */
+ show_scan_qual(ps->plan->qual, "Filter", ps, ancestors, es);
+
+ if (ps->plan->qual)
+ show_instrumentation_count("Rows Removed by Filter", 1, ps, es);
+ }
}
/*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f6a379f..6f0c5ab 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2010,7 +2010,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* the latest version of the row was deleted, so we need do
* nothing. (Should be safe to examine xmin without getting
* buffer's content lock, since xmin never changes in an existing
- * tuple.)
+ * non-promise tuple, and there is no reason to lock a promise
+ * tuple until it is clear that it has been fulfilled.)
*/
if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple.t_data),
priorXmax))
@@ -2091,11 +2092,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* case, so as to avoid the "Halloween problem" of
* repeated update attempts. In the latter case it might
* be sensible to fetch the updated tuple instead, but
- * doing so would require changing heap_lock_tuple as well
- * as heap_update and heap_delete to not complain about
- * updating "invisible" tuples, which seems pretty scary.
- * So for now, treat the tuple as deleted and do not
- * process.
+ * doing so would require changing heap_update and
+ * heap_delete to not complain about updating "invisible"
+ * tuples, which seems pretty scary (heap_lock_tuple will
+ * not complain, but few callers expect HeapTupleInvisible,
+ * and we're not one of them). So for now, treat the tuple
+ * as deleted and do not process.
*/
ReleaseBuffer(buffer);
return NULL;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 32697dd..ad15dcf 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -990,7 +990,8 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
*
* This returns a list of index OIDs for any unique or exclusion
* constraints that are deferred and that had
- * potential (unconfirmed) conflicts.
+ * potential (unconfirmed) conflicts. (if noDupErr == true, the
+ * same is done for non-deferred constraints)
*
* CAUTION: this must not be called for a HOT update.
* We can't defend against that here for lack of info.
@@ -1000,7 +1001,9 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
List *
ExecInsertIndexTuples(TupleTableSlot *slot,
ItemPointer tupleid,
- EState *estate)
+ EState *estate,
+ bool noDupErr,
+ Oid arbiterIdx)
{
List *result = NIL;
ResultRelInfo *resultRelInfo;
@@ -1070,7 +1073,18 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
/* Skip this index-update if the predicate isn't satisfied */
if (!ExecQual(predicate, econtext, false))
+ {
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ ereport(ERROR,
+ (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ errmsg("partial arbiter unique index has predicate that does not cover tuple proposed for insertion"),
+ errdetail("ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index \"%s\".",
+ RelationGetRelationName(indexRelation)),
+ errhint("ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire."),
+ errtableconstraint(heapRelation,
+ RelationGetRelationName(indexRelation))));
continue;
+ }
}
/*
@@ -1092,9 +1106,16 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* For a deferrable unique index, we tell the index AM to just detect
* possible non-uniqueness, and we add the index OID to the result
* list if further checking is needed.
+ *
+ * For a speculative insertion (ON CONFLICT UPDATE/IGNORE), just detect
+ * possible non-uniqueness, and tell the caller if it failed.
*/
if (!indexRelation->rd_index->indisunique)
checkUnique = UNIQUE_CHECK_NO;
+ else if (noDupErr && arbiterIdx == InvalidOid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
+ else if (noDupErr && arbiterIdx == indexRelation->rd_index->indexrelid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
else if (indexRelation->rd_index->indimmediate)
checkUnique = UNIQUE_CHECK_YES;
else
@@ -1112,8 +1133,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* If the index has an associated exclusion constraint, check that.
* This is simpler than the process for uniqueness checks since we
* always insert first and then check. If the constraint is deferred,
- * we check now anyway, but don't throw error on violation; instead
- * we'll queue a recheck event.
+ * we check now anyway, but don't throw error on violation or wait for
+ * a conclusive outcome from a concurrent insertion; instead we'll
+ * queue a recheck event. Similarly, noDupErr callers (speculative
+ * inserters) will recheck later, and wait for a conclusive outcome
+ * then.
*
* An index for an exclusion constraint can't also be UNIQUE (not an
* essential property, we just don't allow it in the grammar), so no
@@ -1121,13 +1145,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*/
if (indexInfo->ii_ExclusionOps != NULL)
{
- bool errorOK = !indexRelation->rd_index->indimmediate;
+ bool violationOK = (!indexRelation->rd_index->indimmediate ||
+ noDupErr);
satisfiesConstraint =
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- tupleid, values, isnull,
- estate, false, errorOK);
+ check_exclusion_or_unique_constraint(heapRelation,
+ indexRelation, indexInfo,
+ tupleid, values, isnull,
+ estate, false,
+ violationOK, false, NULL);
}
if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1135,7 +1161,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
!satisfiesConstraint)
{
/*
- * The tuple potentially violates the uniqueness or exclusion
+ * The tuple potentially violates the unique index or exclusion
* constraint, so make a note of the index so that we can re-check
* it later.
*/
@@ -1146,18 +1172,150 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
return result;
}
+/* ----------------------------------------------------------------
+ * ExecCheckIndexConstraints
+ *
+ * This routine checks if a tuple violates any unique or
+ * exclusion constraints. If no conflict, returns true.
+ * Otherwise returns false, and the TID of the conflicting
+ * tuple is returned in *conflictTid.
+ *
+ * Note that this doesn't lock the values in any way, so it's
+ * possible that a conflicting tuple is inserted immediately
+ * after this returns, and a later insert with the same values
+ * still conflicts. But this can be used for a pre-check before
+ * insertion.
+ * ----------------------------------------------------------------
+ */
+bool
+ExecCheckIndexConstraints(TupleTableSlot *slot,
+ EState *estate, ItemPointer conflictTid,
+ Oid arbiterIdx)
+{
+ ResultRelInfo *resultRelInfo;
+ int i;
+ int numIndices;
+ RelationPtr relationDescs;
+ Relation heapRelation;
+ IndexInfo **indexInfoArray;
+ ExprContext *econtext;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ ItemPointerData invalidItemPtr;
+ bool checkedIndex = false;
+
+ ItemPointerSetInvalid(&invalidItemPtr);
+
+ /*
+ * Get information from the result relation info structure.
+ */
+ resultRelInfo = estate->es_result_relation_info;
+ numIndices = resultRelInfo->ri_NumIndices;
+ relationDescs = resultRelInfo->ri_IndexRelationDescs;
+ indexInfoArray = resultRelInfo->ri_IndexRelationInfo;
+ heapRelation = resultRelInfo->ri_RelationDesc;
+
+ /*
+ * We will use the EState's per-tuple context for evaluating predicates
+ * and index expressions (creating it if it's not already there).
+ */
+ econtext = GetPerTupleExprContext(estate);
+
+ /* Arrange for econtext's scan tuple to be the tuple under test */
+ econtext->ecxt_scantuple = slot;
+
+ /*
+ * for each index, form the index values and check for a conflict
+ */
+ for (i = 0; i < numIndices; i++)
+ {
+ Relation indexRelation = relationDescs[i];
+ IndexInfo *indexInfo;
+ bool satisfiesConstraint;
+
+ if (indexRelation == NULL)
+ continue;
+
+ indexInfo = indexInfoArray[i];
+
+ if (!indexInfo->ii_Unique && !indexInfo->ii_ExclusionOps)
+ continue;
+
+ /* If the index is marked as read-only, ignore it */
+ if (!indexInfo->ii_ReadyForInserts)
+ continue;
+
+ /* When specific arbiter index requested, only examine it */
+ if (arbiterIdx != InvalidOid &&
+ arbiterIdx != indexRelation->rd_index->indexrelid)
+ continue;
+
+ checkedIndex = true;
+
+ /* Check for partial index */
+ if (indexInfo->ii_Predicate != NIL)
+ {
+ List *predicate;
+
+ /*
+ * If predicate state not set up yet, create it (in the estate's
+ * per-query context)
+ */
+ predicate = indexInfo->ii_PredicateState;
+ if (predicate == NIL)
+ {
+ predicate = (List *)
+ ExecPrepareExpr((Expr *) indexInfo->ii_Predicate,
+ estate);
+ indexInfo->ii_PredicateState = predicate;
+ }
+
+ /* Skip this index check if the predicate isn't satisfied */
+ if (!ExecQual(predicate, econtext, false))
+ continue;
+ }
+
+
+ /*
+ * FormIndexDatum fills in its values and isnull parameters with the
+ * appropriate values for the column(s) of the index.
+ */
+ FormIndexDatum(indexInfo,
+ slot,
+ estate,
+ values,
+ isnull);
+
+ satisfiesConstraint =
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &invalidItemPtr,
+ values, isnull, estate, false,
+ true, true, conflictTid);
+ if (!satisfiesConstraint)
+ return false;
+ }
+
+ if (arbiterIdx != InvalidOid && !checkedIndex)
+ elog(ERROR, "unexpected failure to find arbiter unique index");
+
+ return true;
+}
+
/*
- * Check for violation of an exclusion constraint
+ * Check for violation of an exclusion or unique constraint
*
* heap: the table containing the new tuple
- * index: the index supporting the exclusion constraint
+ * index: the index supporting the exclusion or unique constraint
* indexInfo: info about the index, including the exclusion properties
- * tupleid: heap TID of the new tuple we have just inserted
+ * tupleid: heap TID of the new tuple we have just inserted (invalid if we
+ * haven't inserted a new tuple yet)
* values, isnull: the *index* column values computed for the new tuple
* estate: an EState we can do evaluation in
* newIndex: if true, we are trying to build a new index (this affects
* only the wording of error messages)
- * errorOK: if true, don't throw error for violation
+ * violationOK: if true, don't throw error for violation
+ * wait: if true, wait for conflicting transaction to finish, even if violationOK
+ * conflictTid: if not-NULL, the TID of conflicting tuple is returned here.
*
* Returns true if OK, false if actual or potential violation
*
@@ -1167,16 +1325,24 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* is convenient for deferred exclusion checks; we need not bother queuing
* a deferred event if there is definitely no conflict at insertion time.
*
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
* is impossible.
+ *
+ * Note: The indexam is normally responsible for checking unique constraints,
+ * so this normally only needs to be used for exclusion constraints. But this
+ * function is also called when doing a "pre-check" for conflicts with "INSERT
+ * ... ON CONFLICT UPDATE", before inserting the actual tuple.
*/
bool
-check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
- ItemPointer tupleid, Datum *values, bool *isnull,
- EState *estate, bool newIndex, bool errorOK)
+check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo, ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate, bool newIndex,
+ bool violationOK, bool wait,
+ ItemPointer conflictTid)
{
- Oid *constr_procs = indexInfo->ii_ExclusionProcs;
- uint16 *constr_strats = indexInfo->ii_ExclusionStrats;
+ Oid *constr_procs;
+ uint16 *constr_strats;
Oid *index_collations = index->rd_indcollation;
int index_natts = index->rd_index->indnatts;
IndexScanDesc index_scan;
@@ -1190,6 +1356,17 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
TupleTableSlot *existing_slot;
TupleTableSlot *save_scantuple;
+ if (indexInfo->ii_ExclusionOps)
+ {
+ constr_procs = indexInfo->ii_ExclusionProcs;
+ constr_strats = indexInfo->ii_ExclusionStrats;
+ }
+ else
+ {
+ constr_procs = indexInfo->ii_UniqueProcs;
+ constr_strats = indexInfo->ii_UniqueStrats;
+ }
+
/*
* If any of the input values are NULL, the constraint check is assumed to
* pass (i.e., we assume the operators are strict).
@@ -1253,7 +1430,8 @@ retry:
/*
* Ignore the entry for the tuple we're trying to check.
*/
- if (ItemPointerEquals(tupleid, &tup->t_self))
+ if (ItemPointerIsValid(tupleid) &&
+ ItemPointerEquals(tupleid, &tup->t_self))
{
if (found_self) /* should not happen */
elog(ERROR, "found self tuple multiple times in index \"%s\"",
@@ -1287,9 +1465,11 @@ retry:
* we're not supposed to raise error, just return the fact of the
* potential conflict without waiting to see if it's real.
*/
- if (errorOK)
+ if (violationOK && !wait)
{
conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
break;
}
@@ -1307,14 +1487,29 @@ retry:
if (TransactionIdIsValid(xwait))
{
index_endscan(index_scan);
- XactLockTableWait(xwait, heap, &tup->t_data->t_ctid,
- XLTW_RecheckExclusionConstr);
+ if (DirtySnapshot.speculativeToken)
+ SpeculativeInsertionWait(DirtySnapshot.xmin,
+ DirtySnapshot.speculativeToken);
+ else if (violationOK)
+ XactLockTableWait(xwait, heap, &tup->t_self,
+ XLTW_RecheckExclusionConstr);
+ else
+ XactLockTableWait(xwait, heap, &tup->t_data->t_ctid,
+ XLTW_RecheckExclusionConstr);
goto retry;
}
/*
- * We have a definite conflict. Report it.
+ * We have a definite conflict. Return it to caller, or report it.
*/
+ if (violationOK)
+ {
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+ break;
+ }
+
error_new = BuildIndexValueDescription(index, values, isnull);
error_existing = BuildIndexValueDescription(index, existing_values,
existing_isnull);
@@ -1346,6 +1541,9 @@ retry:
* However, it is possible to define exclusion constraints for which that
* wouldn't be true --- for instance, if the operator is <>. So we no
* longer complain if found_self is still false.
+ *
+ * It would also not be true in the pre-check mode, when we haven't
+ * inserted a tuple yet.
*/
econtext->ecxt_scantuple = save_scantuple;
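The checkUnique selection added to ExecInsertIndexTuples above can be read as a small decision function. What follows is a standalone model for review purposes, not code from the patch: choose_unique_check and the cut-down Oid/UniqueCheck definitions are hypothetical stand-ins, and the final branch (a deferrable unique index gets a partial check) is taken from the existing comment in that function rather than from the visible hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the PostgreSQL definitions (illustration only). */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef enum
{
	UNIQUE_CHECK_NO,			/* not a unique index: nothing to check */
	UNIQUE_CHECK_YES,			/* enforce uniqueness at insertion time */
	UNIQUE_CHECK_PARTIAL		/* only detect a possible conflict */
} UniqueCheck;

/*
 * Hypothetical helper mirroring the if/else chain in the hunk.
 *
 * noDupErr is true for speculative insertion (ON CONFLICT UPDATE/IGNORE).
 * arbiterIdx is InvalidOid when any unique index may arbitrate.
 */
static UniqueCheck
choose_unique_check(bool indisunique, bool indimmediate, Oid indexrelid,
					bool noDupErr, Oid arbiterIdx)
{
	if (!indisunique)
		return UNIQUE_CHECK_NO;

	/* Speculative insertion: report the conflict instead of raising it. */
	if (noDupErr &&
		(arbiterIdx == InvalidOid || arbiterIdx == indexrelid))
		return UNIQUE_CHECK_PARTIAL;

	/* Non-arbiter indexes keep their ordinary behavior. */
	if (indimmediate)
		return UNIQUE_CHECK_YES;

	/* Deferrable unique index: detect now, recheck later. */
	assert(indisunique && !indimmediate);
	return UNIQUE_CHECK_PARTIAL;
}
```

Note the consequence for a speculative inserter that names an arbiter: a conflict on any other immediate unique index still raises the usual error, since only the arbiter is downgraded to a partial check.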
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 48107d9..4699060 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -151,10 +151,11 @@ lnext:
* case, so as to avoid the "Halloween problem" of repeated
* update attempts. In the latter case it might be sensible
* to fetch the updated tuple instead, but doing so would
- * require changing heap_lock_tuple as well as heap_update and
- * heap_delete to not complain about updating "invisible"
- * tuples, which seems pretty scary. So for now, treat the
- * tuple as deleted and do not process.
+ * require changing heap_update and heap_delete to not complain
+ * about updating "invisible" tuples, which seems pretty scary
+ * (heap_lock_tuple will not complain, but few callers expect
+ * HeapTupleInvisible, and we're not one of them). So for now,
+ * treat the tuple as deleted and do not process.
*/
goto lnext;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index f96fb24..d03604c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -46,12 +46,20 @@
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tqual.h"
+static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ EState *estate);
+
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
* target relation's rowtype
@@ -151,6 +159,37 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
return ExecProject(projectReturning, NULL);
}
+/*
+ * ExecCheckHeapTupleVisible -- verify heap tuple is visible
+ *
+ * It is not acceptable to proceed with avoiding insertion (taking
+ * speculative insertion's alternative IGNORE/UPDATE path) on the
+ * basis of another tuple that is not visible, iff xact uses higher
+ * isolation levels.
+ */
+static void
+ExecCheckHeapTupleVisible(EState *estate,
+ ResultRelInfo *relinfo,
+ ItemPointer tid)
+{
+
+ Relation rel = relinfo->ri_RelationDesc;
+ Buffer buffer;
+ HeapTupleData tuple;
+
+ if (!IsolationUsesXactSnapshot())
+ return;
+
+ tuple.t_self = *tid;
+ if (!heap_fetch(rel, estate->es_snapshot, &tuple, &buffer, false, NULL))
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent insert or update dictating alternative ON CONFLICT path"),
+ errhint("Even ON CONFLICT IGNORE must consider effects of concurrent transactions.")));
+
+ ReleaseBuffer(buffer);
+}
+
/* ----------------------------------------------------------------
* ExecInsert
*
@@ -163,6 +202,9 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
static TupleTableSlot *
ExecInsert(TupleTableSlot *slot,
TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ Oid arbiterIndex,
+ SpecType spec,
EState *estate,
bool canSetTag)
{
@@ -246,6 +288,9 @@ ExecInsert(TupleTableSlot *slot,
}
else
{
+ bool conflict;
+ ItemPointerData conflictTid;
+
/*
* Constraints might reference the tableoid column, so initialize
* t_tableOid before evaluating them.
@@ -259,20 +304,130 @@ ExecInsert(TupleTableSlot *slot,
ExecConstraints(resultRelInfo, slot, estate);
/*
- * insert the tuple
- *
- * Note: heap_insert returns the tid (location) of the new tuple in
- * the t_self field.
+ * If we are expecting duplicates, do a non-conclusive first check. We
+ * might still fail later, after inserting the heap tuple, if a
+ * conflicting row was inserted concurrently. We'll handle that by
+ * deleting the already-inserted tuple and retrying, but that's fairly
+ * expensive, so we try to avoid it.
*/
- newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, 0, NULL);
+vlock:
+ conflict = false;
+ ItemPointerSetInvalid(&conflictTid);
/*
- * insert index entries for tuple
+ * XXX If we know or assume that there are few duplicates, it would be
+ * better to skip this, and just optimistically proceed with the
+ * insertion below. You would then leave behind some garbage when a
+ * conflict happens, but if it's rare, it doesn't matter much. Some
+ * kind of heuristic might be in order here, like stop doing these
+ * pre-checks if the last 100 insertions have not been duplicates.
*/
- if (resultRelInfo->ri_NumIndices > 0)
- recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ if (spec != SPEC_NONE && resultRelInfo->ri_NumIndices > 0)
+ {
+ /*
+ * Check if it's required to proceed with the second phase
+ * ("insertion proper") of speculative insertion in respect of the
+ * slot. If insertion ultimately does not proceed, no firing of
+ * AFTER ROW INSERT triggers occurs.
+ *
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.
+ */
+ if (!ExecCheckIndexConstraints(slot, estate, &conflictTid,
+ arbiterIndex))
+ conflict = true;
+ }
+
+ if (!conflict)
+ {
+ /*
+ * Before we start the insertion, acquire our "promise tuple
+ * insertion lock". Others can use that (rather than an XID lock,
+ * which is appropriate only for non-promise tuples) to wait for us
+ * to decide if we're going to go ahead with the insertion.
+ */
+ if (spec != SPEC_NONE)
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+
+ /*
+ * insert the tuple
+ *
+ * Note: heap_insert returns the tid (location) of the new tuple in
+ * the t_self field.
+ */
+ newId = heap_insert(resultRelationDesc, tuple,
+ estate->es_output_cid,
+ spec != SPEC_NONE ? HEAP_INSERT_SPECULATIVE : 0,
+ NULL);
+
+ /*
+ * insert index entries for tuple
+ */
+ if (resultRelInfo->ri_NumIndices > 0)
+ recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+ estate,
+ spec != SPEC_NONE,
+ arbiterIndex);
+
+ if (spec != SPEC_NONE && recheckIndexes)
+ {
+ HeapUpdateFailureData hufd;
+
+ /*
+ * Race: concurrent insertion conflicts with our speculative
+ * heap tuple
+ */
+ conflict = true;
+
+ /*
+ * Must "super-delete" the heap tuple and retry from the start.
+ *
+ * This is occasionally necessary so that "unprincipled
+ * deadlocks" are avoided; now that a conflict was found,
+ * other sessions should not wait on our speculative token, and
+ * they certainly shouldn't treat our speculatively-inserted
+ * heap tuple as an ordinary tuple whose inserter's outcome they
+ * must wait on before they can UPDATE/DELETE. This makes heap
+ * tuples behave as conceptual "value locks" of short duration,
+ * distinct from ordinary tuples that other xacts must wait on
+ * xmin-xact-end of in the event of a possible unique/exclusion
+ * violation (the violation that arbitrates taking the
+ * alternative UPDATE/IGNORE path).
+ */
+ heap_delete(resultRelationDesc, &(tuple->t_self),
+ estate->es_output_cid, NULL, false, &hufd, true);
+ }
+
+ if (spec != SPEC_NONE)
+ {
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+ }
+ }
+
+ if (conflict)
+ {
+ /*
+ * Lock and consider updating in the SPEC_INSERT case. For the
+ * SPEC_IGNORE case, it's still necessary to verify that the tuple
+ * is visible to the executor's MVCC snapshot.
+ */
+ if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
+ &conflictTid,
+ planSlot,
+ onConflict,
+ estate))
+ goto vlock;
+ else if (spec == SPEC_IGNORE)
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
+
+ return NULL;
+ }
}
if (canSetTag)
@@ -399,7 +554,8 @@ ldelete:;
estate->es_output_cid,
estate->es_crosscheck_snapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd,
+ false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -768,7 +924,7 @@ lreplace:;
*/
if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false, InvalidOid);
}
if (canSetTag)
@@ -793,6 +949,218 @@ lreplace:;
}
+/* ----------------------------------------------------------------
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
+ *
+ * Returns value indicating if we're done (with or without an
+ * update), or if the executor must start from scratch.
+ * ----------------------------------------------------------------
+ */
+static bool
+ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ EState *estate)
+{
+ Relation relation = resultRelInfo->ri_RelationDesc;
+ HeapTupleData tuple;
+ HeapTuple copyTuple = NULL;
+ HeapUpdateFailureData hufd;
+ HTSU_Result test;
+ Buffer buffer;
+ TupleTableSlot *slot;
+
+ /*
+ * XXX We don't have the TID of the conflicting tuple if the index
+ * insertion failed and we had to kill the already inserted tuple. We'd
+ * need to modify the index AM to pass through the TID back here. So for
+ * now, we just retry, and hopefully the new pre-check will fail on the
+ * same tuple (or it's finished by now), and we'll get its TID that way.
+ */
+ if (!ItemPointerIsValid(conflictTid))
+ {
+ elog(DEBUG1, "insertion conflicted after pre-check");
+ return false;
+ }
+
+ /*
+ * Lock tuple for update.
+ *
+ * Like EvalPlanQualFetch(), don't follow updates. There is no actual
+ * benefit to doing so, since as discussed below, a conflict invalidates
+ * our previous conclusion that the tuple is the conclusively committed
+ * conflicting tuple.
+ */
+ tuple.t_self = *conflictTid;
+ test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
+ LockTupleExclusive, LockWaitBlock, false, &buffer,
+ &hufd);
+
+ if (test == HeapTupleMayBeUpdated)
+ copyTuple = heap_copytuple(&tuple);
+
+ switch (test)
+ {
+ case HeapTupleInvisible:
+ /*
+ * This may occur when an instantaneously invisible tuple is blamed
+ * as a conflict because multiple rows are inserted with the same
+ * constrained values.
+ *
+ * We cannot proceed, because to do so would leave users open to
+ * the risk that the same row will be updated a second time in the
+ * same command; allowing a second update of a single row within
+ * the same command would leave the update order undefined. It is
+ * the user's responsibility to resolve these self-duplicates
+ * before proposing a set of tuples for insertion, so raise an
+ * error to tell them. These problems are why SQL-2003
+ * similarly specifies that for SQL MERGE, an exception must be
+ * raised in the event of an attempt to update the same row twice.
+ *
+ * XXX It might be preferable to do something similar when a row is
+ * locked twice (and not updated twice) by the same speculative
+ * insertion, as if to take each lock acquisition as an indication
+ * of a discrete, unfulfilled intent to update (perhaps in some
+ * later command of the same xact). This does not seem feasible,
+ * though.
+ */
+ if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data)))
+ ereport(ERROR,
+ (errcode(ERRCODE_CARDINALITY_VIOLATION),
+ errmsg("ON CONFLICT UPDATE command could not lock/update self-inserted tuple"),
+ errhint("Ensure that no rows proposed for insertion within the same command have duplicate constrained values.")));
+
+ /* This shouldn't happen */
+ elog(ERROR, "attempted to lock invisible tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleSelfUpdated:
+ /*
+ * XXX In practice this is dead code, since BEFORE triggers fire
+ * prior to speculative insertion. Since a dirty snapshot is used
+ * to find possible conflict tuples, speculative insertion could
+ * not have seen the old/MVCC-current row version at all (even if
+ * it was only rendered old by this same command).
+ */
+ elog(ERROR, "unexpected self-updated tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleMayBeUpdated:
+ /*
+ * Success -- we're done, as tuple is locked. Verify that the
+ * tuple is known to be visible to our snapshot under conventional
+ * MVCC rules if the current isolation level mandates that. In
+ * READ COMMITTED mode, we can lock and update a tuple still in
+ * progress according to our snapshot, but higher isolation levels
+ * cannot avail of that, and must actively defend against doing so.
+ * We might get a serialization failure within ExecUpdate() anyway
+ * if this step was skipped, but this cannot be relied on, for
+ * example because the auxiliary WHERE clause happened to not be
+ * satisfied.
+ */
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &tuple.t_data->t_ctid);
+
+ /*
+ * This loosening of snapshot isolation for the benefit of READ
+ * COMMITTED speculative insertions is used consistently:
+ * speculative quals are only tested against already locked tuples.
+ * It would be rather inconsistent to UPDATE when no tuple version
+ * is MVCC-visible (which seems inevitable since we must *do
+ * something* there, and "READ COMMITTED serialization failures"
+ * are unappealing), while also avoiding updating here entirely on
+ * the basis of a non-conclusive tuple version (the version that
+ * happens to be visible to this command's MVCC snapshot, or a
+ * subsequent non-conclusive version).
+ *
+ * In other words: Only the final, conclusively locked tuple
+ * (which must have the same value in the relevant constrained
+ * attribute(s) as the value previously "value locked") matters.
+ */
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStartNode(onConflict->ps.instrument);
+
+ /*
+ * Conceptually, the parent ModifyTable is like a relation scan
+ * node that uses a dirty snapshot, returning rows which the
+ * auxiliary plan must operate on (if only to lock all such rows).
+ * EvalPlanQual() is involved in the evaluation of their UPDATE,
+ * regardless of whether or not the tuple is visible to the
+ * command's MVCC Snapshot.
+ */
+ EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
+
+ /*
+ * UPDATE affects the same ResultRelation as INSERT in the context
+ * of ON CONFLICT UPDATE, so parent's target rti is used
+ */
+ EvalPlanQualSetTuple(&onConflict->mt_epqstate,
+ resultRelInfo->ri_RangeTableIndex, copyTuple);
+
+ slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+
+ if (!TupIsNull(slot))
+ ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot,
+ &onConflict->mt_epqstate, onConflict->ps.state,
+ false);
+
+ ReleaseBuffer(buffer);
+
+ /*
+ * As when executing an UPDATE's ModifyTable node in the
+ * conventional manner, reset the per-output-tuple ExprContext
+ */
+ ResetPerTupleExprContext(onConflict->ps.state);
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStopNode(onConflict->ps.instrument, 0);
+
+ return true;
+ case HeapTupleUpdated:
+ if (IsolationUsesXactSnapshot())
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent update")));
+
+ /*
+ * Tell caller to try again from the very start. We don't use the
+ * usual EvalPlanQual() looping pattern here, fundamentally because
+ * we don't have a useful qual to verify the next tuple with. Our
+ * "qual" is really any user-supplied qual AND the unique
+ * constraint "col OP value" implied by a speculative insertion
+ * conflict. However, because of the selective evaluation of the
+ * former "qual" (the interactions with MVCC and row locking), this
+ * is an over-simplification.
+ *
+ * We might devise a means of verifying, by way of binary equality
+ * in a similar manner to HOT codepaths, if any unique indexed
+ * columns changed, but this would only serve to ameliorate the
+ * fundamental problem. It might well not be good enough, because
+ * those columns could change too. It seems unlikely that working
+ * harder here is worthwhile.
+ *
+ * At this point, all bets are off -- it might actually turn out to
+ * be okay to proceed with insertion instead of locking now (the
+ * tuple we attempted to lock could have been deleted, for
+ * example). On the other hand, it might not be okay, but for an
+ * entirely different reason, with an entirely separate TID to
+ * blame and lock. This TID may not even be part of the same
+ * update chain.
+ */
+ ReleaseBuffer(buffer);
+ return false;
+ default:
+ elog(ERROR, "unrecognized heap_lock_tuple status: %u", test);
+ }
+
+ return false;
+}
+
+
/*
* Process BEFORE EACH STATEMENT triggers
*/
@@ -803,6 +1171,9 @@ fireBSTriggers(ModifyTableState *node)
{
case CMD_INSERT:
ExecBSInsertTriggers(node->ps.state, node->resultRelInfo);
+ if (node->spec == SPEC_INSERT)
+ ExecBSUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
break;
case CMD_UPDATE:
ExecBSUpdateTriggers(node->ps.state, node->resultRelInfo);
@@ -825,6 +1196,9 @@ fireASTriggers(ModifyTableState *node)
switch (node->operation)
{
case CMD_INSERT:
+ if (node->spec == SPEC_INSERT)
+ ExecASUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
ExecASInsertTriggers(node->ps.state, node->resultRelInfo);
break;
case CMD_UPDATE:
@@ -852,6 +1226,8 @@ ExecModifyTable(ModifyTableState *node)
{
EState *estate = node->ps.state;
CmdType operation = node->operation;
+ ModifyTableState *onConflict = (ModifyTableState *) node->onConflict;
+ SpecType spec = node->spec;
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
PlanState *subplanstate;
@@ -1022,7 +1398,9 @@ ExecModifyTable(ModifyTableState *node)
switch (operation)
{
case CMD_INSERT:
- slot = ExecInsert(slot, planSlot, estate, node->canSetTag);
+ slot = ExecInsert(slot, planSlot, onConflict,
+ node->arbiterIndex, spec, estate,
+ node->canSetTag);
break;
case CMD_UPDATE:
slot = ExecUpdate(tupleid, oldtuple, slot, planSlot,
@@ -1070,6 +1448,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ModifyTableState *mtstate;
CmdType operation = node->operation;
+ Plan *onConflictPlan = node->onConflictPlan;
int nplans = list_length(node->plans);
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
@@ -1097,6 +1476,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->resultRelInfo = estate->es_result_relations + node->resultRelIndex;
mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
mtstate->mt_nplans = nplans;
+ mtstate->spec = node->spec;
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -1137,6 +1517,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
resultRelInfo->ri_IndexRelationDescs == NULL)
ExecOpenIndices(resultRelInfo);
+ /*
+ * ON CONFLICT UPDATE variant must have unique index to arbitrate on
+ * taking alternative path
+ */
+ Assert(node->spec != SPEC_INSERT || node->arbiterIndex != InvalidOid);
+
+ mtstate->arbiterIndex = node->arbiterIndex;
+
/* Now init the plan for this result rel */
estate->es_result_relation_info = resultRelInfo;
mtstate->mt_plans[i] = ExecInitNode(subplan, estate, eflags);
@@ -1308,7 +1696,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
break;
case CMD_UPDATE:
case CMD_DELETE:
- junk_filter_needed = true;
+ junk_filter_needed = (node->spec == SPEC_NONE);
break;
default:
elog(ERROR, "unknown operation");
@@ -1373,6 +1761,30 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
}
/*
+ * Initialize auxiliary ModifyTable node for INSERT...ON CONFLICT UPDATE.
+ *
+ * The UPDATE portion of the query is essentially represented as auxiliary
+ * to INSERT state at all stages of query processing, with a representation
+ * at each stage that is analogous to a regular UPDATE.
+ */
+ if (onConflictPlan)
+ {
+ PlanState *pstate;
+
+ Assert(mtstate->spec == SPEC_INSERT);
+
+ /*
+ * Initialize auxiliary child plan.
+ *
+ * ExecModifyTable() is never called for auxiliary update
+ * ModifyTableState. Execution of the auxiliary plan is driven by its
+ * parent in an ad-hoc fashion.
+ */
+ pstate = ExecInitNode(onConflictPlan, estate, eflags);
+ mtstate->onConflict = pstate;
+ }
+
+ /*
* Set up a tuple table slot for use for trigger output tuples. In a plan
* containing multiple ModifyTable nodes, all can share one such slot, so
* we keep it in the estate.
@@ -1387,11 +1799,18 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* ModifyTable node too, but there's no need.) Note the use of lcons not
* lappend: we need later-initialized ModifyTable nodes to be shut down
* before earlier ones. This ensures that we don't throw away RETURNING
- * rows that need to be seen by a later CTE subplan.
+ * rows that need to be seen by a later CTE subplan. We do not want to
+ * append an auxiliary ON CONFLICT UPDATE node either, since it always has
+ * a parent SPEC_INSERT ModifyTable node, and that parent directly
+ * drives execution of what is logically a single unified
+ * statement (*that* plan will be appended here, though). If it must
+ * project updated rows, that will only ever be done through the parent.
*/
- if (!mtstate->canSetTag)
+ if (!mtstate->canSetTag && mtstate->spec != SPEC_UPDATE)
+ {
estate->es_auxmodifytables = lcons(mtstate,
estate->es_auxmodifytables);
+ }
return mtstate;
}
@@ -1442,6 +1861,8 @@ ExecEndModifyTable(ModifyTableState *node)
*/
for (i = 0; i < node->mt_nplans; i++)
ExecEndNode(node->mt_plans[i]);
+
+ ExecEndNode(node->onConflict);
}
void
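The control flow that ExecInsert gains in the nodeModifyTable.c changes above (pre-check, speculative insert, recheck, "super-delete", retry from vlock) may be easier to review as a toy model. The sketch below is a single-threaded simulation under loud assumptions: ToyTable, toy_key_exists and toy_speculative_insert are hypothetical names, the "concurrent session" is faked with a flag, and nothing here touches heap, index or lock manager code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ROWS 16

typedef enum { TOY_INSERTED, TOY_TOOK_CONFLICT_PATH } ToyOutcome;

typedef struct
{
	int		keys[TOY_MAX_ROWS];	/* one "unique index" over int keys */
	size_t	nrows;
	bool	race_pending;		/* a fake competitor inserts race_key once */
	int		race_key;
} ToyTable;

static bool
toy_key_exists(const ToyTable *t, int key)
{
	for (size_t i = 0; i < t->nrows; i++)
		if (t->keys[i] == key)
			return true;
	return false;
}

/* Returns the path taken; *retries counts restarts from the top. */
static ToyOutcome
toy_speculative_insert(ToyTable *t, int key, int *retries)
{
	*retries = 0;
	for (;;)					/* plays the role of the vlock label */
	{
		bool	lost_race = false;

		/* Phase 1: non-conclusive pre-check, nothing locked. */
		if (toy_key_exists(t, key))
			return TOY_TOOK_CONFLICT_PATH;

		/* Simulated race: a competitor's row lands after the pre-check. */
		if (t->race_pending && t->race_key == key)
		{
			assert(t->nrows < TOY_MAX_ROWS);
			t->keys[t->nrows++] = key;
			t->race_pending = false;
			lost_race = true;
		}

		/* Phase 2: speculative insertion of our own row. */
		assert(t->nrows < TOY_MAX_ROWS);
		t->keys[t->nrows++] = key;

		if (!lost_race)
			return TOY_INSERTED;

		/* Recheck found a conflict: remove our own row and start over. */
		t->nrows--;
		(*retries)++;
	}
}
```

In this model a lost race costs exactly one retry, after which the pre-check sees the competitor's row and the alternative path is taken; the real code additionally has to wait on the speculative token or the competitor's xact before the outcome is conclusive.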
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 00ffe4a..6c1a7f1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -178,6 +178,9 @@ _copyModifyTable(const ModifyTable *from)
COPY_NODE_FIELD(resultRelations);
COPY_SCALAR_FIELD(resultRelIndex);
COPY_NODE_FIELD(plans);
+ COPY_SCALAR_FIELD(spec);
+ COPY_SCALAR_FIELD(arbiterIndex);
+ COPY_NODE_FIELD(onConflictPlan);
COPY_NODE_FIELD(withCheckOptionLists);
COPY_NODE_FIELD(returningLists);
COPY_NODE_FIELD(fdwPrivLists);
@@ -2120,6 +2123,31 @@ _copyWithClause(const WithClause *from)
return newnode;
}
+static InferClause *
+_copyInferClause(const InferClause *from)
+{
+ InferClause *newnode = makeNode(InferClause);
+
+ COPY_NODE_FIELD(indexElems);
+ COPY_NODE_FIELD(whereClause);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
+static ConflictClause *
+_copyConflictClause(const ConflictClause *from)
+{
+ ConflictClause *newnode = makeNode(ConflictClause);
+
+ COPY_SCALAR_FIELD(specclause);
+ COPY_NODE_FIELD(infer);
+ COPY_NODE_FIELD(updatequery);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
static CommonTableExpr *
_copyCommonTableExpr(const CommonTableExpr *from)
{
@@ -2525,6 +2553,10 @@ _copyQuery(const Query *from)
COPY_NODE_FIELD(jointree);
COPY_NODE_FIELD(targetList);
COPY_NODE_FIELD(withCheckOptions);
+ COPY_SCALAR_FIELD(specClause);
+ COPY_NODE_FIELD(arbiterExpr);
+ COPY_NODE_FIELD(arbiterWhere);
+ COPY_NODE_FIELD(onConflict);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(groupClause);
COPY_NODE_FIELD(havingQual);
@@ -2548,6 +2580,7 @@ _copyInsertStmt(const InsertStmt *from)
COPY_NODE_FIELD(relation);
COPY_NODE_FIELD(cols);
COPY_NODE_FIELD(selectStmt);
+ COPY_NODE_FIELD(confClause);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(withClause);
@@ -4721,6 +4754,12 @@ copyObject(const void *from)
case T_WithClause:
retval = _copyWithClause(from);
break;
+ case T_InferClause:
+ retval = _copyInferClause(from);
+ break;
+ case T_ConflictClause:
+ retval = _copyConflictClause(from);
+ break;
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 79035b2..4127269 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -863,6 +863,10 @@ _equalQuery(const Query *a, const Query *b)
COMPARE_NODE_FIELD(jointree);
COMPARE_NODE_FIELD(targetList);
COMPARE_NODE_FIELD(withCheckOptions);
+ COMPARE_SCALAR_FIELD(specClause);
+ COMPARE_NODE_FIELD(arbiterExpr);
+ COMPARE_NODE_FIELD(arbiterWhere);
+ COMPARE_NODE_FIELD(onConflict);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(groupClause);
COMPARE_NODE_FIELD(havingQual);
@@ -884,6 +888,7 @@ _equalInsertStmt(const InsertStmt *a, const InsertStmt *b)
COMPARE_NODE_FIELD(relation);
COMPARE_NODE_FIELD(cols);
COMPARE_NODE_FIELD(selectStmt);
+ COMPARE_NODE_FIELD(confClause);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(withClause);
@@ -2426,6 +2431,27 @@ _equalWithClause(const WithClause *a, const WithClause *b)
}
static bool
+_equalInferClause(const InferClause *a, const InferClause *b)
+{
+ COMPARE_NODE_FIELD(indexElems);
+ COMPARE_NODE_FIELD(whereClause);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
+_equalConflictClause(const ConflictClause *a, const ConflictClause *b)
+{
+ COMPARE_SCALAR_FIELD(specclause);
+ COMPARE_NODE_FIELD(infer);
+ COMPARE_NODE_FIELD(updatequery);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
_equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
{
COMPARE_STRING_FIELD(ctename);
@@ -3148,6 +3174,12 @@ equal(const void *a, const void *b)
case T_WithClause:
retval = _equalWithClause(a, b);
break;
+ case T_InferClause:
+ retval = _equalInferClause(a, b);
+ break;
+ case T_ConflictClause:
+ retval = _equalConflictClause(a, b);
+ break;
case T_CommonTableExpr:
retval = _equalCommonTableExpr(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 21dfda7..4107cc9 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1474,6 +1474,12 @@ exprLocation(const Node *expr)
case T_WithClause:
loc = ((const WithClause *) expr)->location;
break;
+ case T_InferClause:
+ loc = ((const InferClause *) expr)->location;
+ break;
+ case T_ConflictClause:
+ loc = ((const ConflictClause *) expr)->location;
+ break;
case T_CommonTableExpr:
loc = ((const CommonTableExpr *) expr)->location;
break;
@@ -1958,6 +1964,12 @@ query_tree_walker(Query *query,
return true;
if (walker((Node *) query->withCheckOptions, context))
return true;
+ if (walker((Node *) query->arbiterExpr, context))
+ return true;
+ if (walker(query->arbiterWhere, context))
+ return true;
+ if (walker(query->onConflict, context))
+ return true;
if (walker((Node *) query->returningList, context))
return true;
if (walker((Node *) query->jointree, context))
@@ -2699,6 +2711,9 @@ query_tree_mutator(Query *query,
MUTATE(query->targetList, query->targetList, List *);
MUTATE(query->withCheckOptions, query->withCheckOptions, List *);
+ MUTATE(query->arbiterExpr, query->arbiterExpr, List *);
+ MUTATE(query->arbiterWhere, query->arbiterWhere, Node *);
+ MUTATE(query->onConflict, query->onConflict, Node *);
MUTATE(query->returningList, query->returningList, List *);
MUTATE(query->jointree, query->jointree, FromExpr *);
MUTATE(query->setOperations, query->setOperations, Node *);
@@ -2968,6 +2983,8 @@ raw_expression_tree_walker(Node *node,
return true;
if (walker(stmt->selectStmt, context))
return true;
+ if (walker(stmt->confClause, context))
+ return true;
if (walker(stmt->returningList, context))
return true;
if (walker(stmt->withClause, context))
@@ -3207,6 +3224,27 @@ raw_expression_tree_walker(Node *node,
break;
case T_WithClause:
return walker(((WithClause *) node)->ctes, context);
+
+ case T_InferClause:
+ {
+ InferClause *stmt = (InferClause *) node;
+
+ if (walker(stmt->indexElems, context))
+ return true;
+ if (walker(stmt->whereClause, context))
+ return true;
+ }
+ break;
+ case T_ConflictClause:
+ {
+ ConflictClause *stmt = (ConflictClause *) node;
+
+ if (walker(stmt->infer, context))
+ return true;
+ if (walker(stmt->updatequery, context))
+ return true;
+ }
+ break;
case T_CommonTableExpr:
return walker(((CommonTableExpr *) node)->ctequery, context);
default:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b4a2667..a32fbaa 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -330,6 +330,9 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_INT_FIELD(resultRelIndex);
WRITE_NODE_FIELD(plans);
+ WRITE_ENUM_FIELD(spec, SpecType);
+ WRITE_OID_FIELD(arbiterIndex);
+ WRITE_NODE_FIELD(onConflictPlan);
WRITE_NODE_FIELD(withCheckOptionLists);
WRITE_NODE_FIELD(returningLists);
WRITE_NODE_FIELD(fdwPrivLists);
@@ -2301,6 +2304,10 @@ _outQuery(StringInfo str, const Query *node)
WRITE_NODE_FIELD(jointree);
WRITE_NODE_FIELD(targetList);
WRITE_NODE_FIELD(withCheckOptions);
+ WRITE_ENUM_FIELD(specClause, SpecType);
+ WRITE_NODE_FIELD(arbiterExpr);
+ WRITE_NODE_FIELD(arbiterWhere);
+ WRITE_NODE_FIELD(onConflict);
WRITE_NODE_FIELD(returningList);
WRITE_NODE_FIELD(groupClause);
WRITE_NODE_FIELD(havingQual);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index dbc162a..9f6570f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -214,6 +214,10 @@ _readQuery(void)
READ_NODE_FIELD(jointree);
READ_NODE_FIELD(targetList);
READ_NODE_FIELD(withCheckOptions);
+ READ_ENUM_FIELD(specClause, SpecType);
+ READ_NODE_FIELD(arbiterExpr);
+ READ_NODE_FIELD(arbiterWhere);
+ READ_NODE_FIELD(onConflict);
READ_NODE_FIELD(returningList);
READ_NODE_FIELD(groupClause);
READ_NODE_FIELD(havingQual);
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index b86a3cd..fc4bb08 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4013,3 +4013,59 @@ string_to_const(const char *str, Oid datatype)
return makeConst(datatype, -1, collation, constlen,
conval, false, false);
}
+
+/*
+ * plan_speculative_use_index
+ * Use the planner to decide speculative insertion arbiter index
+ *
+ * rel is the target to undergo ON CONFLICT UPDATE/IGNORE. Decide which index
+ * to use. This should be called infrequently in practice, because it's unusual
+ * for more than one index to be available that can satisfy a user-specified
+ * unique index inference specification.
+ *
+ * Note: caller had better already hold some type of lock on the table.
+ */
+Oid
+plan_speculative_use_index(PlannerInfo *root, List *indexList)
+{
+ IndexOptInfo *indexInfo;
+ RelOptInfo *rel;
+ IndexPath *cheapest;
+ IndexPath *indexScanPath;
+ ListCell *lc;
+
+ /* Set up RTE/RelOptInfo arrays if needed */
+ if (!root->simple_rel_array)
+ setup_simple_rel_arrays(root);
+
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, root->parse->resultRelation, RELOPT_BASEREL);
+
+ /* Locate cheapest IndexOptInfo for the target index */
+ cheapest = NULL;
+
+ foreach(lc, rel->indexlist)
+ {
+ indexInfo = (IndexOptInfo *) lfirst(lc);
+
+ if (!list_member_oid(indexList, indexInfo->indexoid))
+ continue;
+
+ /* Estimate the cost of index scan */
+ indexScanPath = create_index_path(root, indexInfo,
+ NIL, NIL, NIL, NIL, NIL,
+ ForwardScanDirection, false,
+ NULL, 1.0);
+
+ if (!cheapest || compare_fractional_path_costs(&cheapest->path,
+ &indexScanPath->path,
+ DEFAULT_RANGE_INEQ_SEL) > 0)
+ cheapest = indexScanPath;
+
+ }
+
+ if (!cheapest)
+ return InvalidOid;
+ else
+ return cheapest->indexinfo->indexoid;
+}
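
Reviewer's aside on plan_speculative_use_index(): the selection loop above reduces to "filter the relation's indexes by the candidate OID list, keep the cheapest". The sketch below shows that shape in isolation. It is only an illustration under simplifying assumptions: CandidateIndex, oid_in_list() and choose_cheapest_arbiter() are hypothetical names that do not appear in the patch, Oid and InvalidOid are local stand-ins for the PostgreSQL definitions, and a single total_cost field replaces the comparison the patch performs with compare_fractional_path_costs() on paths built by create_index_path().

```c
#include <stddef.h>

/* Local stand-ins for the PostgreSQL definitions (assumption). */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, simplified view of one index and its estimated cost. */
typedef struct CandidateIndex
{
    Oid     indexoid;
    double  total_cost;
} CandidateIndex;

/* Return 1 if oid appears in list[0..n-1], else 0. */
static int
oid_in_list(const Oid *list, size_t n, Oid oid)
{
    size_t  i;

    for (i = 0; i < n; i++)
    {
        if (list[i] == oid)
            return 1;
    }
    return 0;
}

/*
 * Among the relation's indexes, consider only those named in candidates
 * and return the OID of the cheapest one, or InvalidOid if none match.
 */
Oid
choose_cheapest_arbiter(const CandidateIndex *indexes, size_t nindexes,
                        const Oid *candidates, size_t ncandidates)
{
    const CandidateIndex *cheapest = NULL;
    size_t  i;

    for (i = 0; i < nindexes; i++)
    {
        /* Skip indexes that inference did not select as candidates */
        if (!oid_in_list(candidates, ncandidates, indexes[i].indexoid))
            continue;

        if (cheapest == NULL || indexes[i].total_cost < cheapest->total_cost)
            cheapest = &indexes[i];
    }

    return cheapest ? cheapest->indexoid : InvalidOid;
}
```

As in the patch, an empty candidate set yields InvalidOid rather than an error; the caller decides how to report that.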
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 1258961..263ff5f 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -255,13 +255,17 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
/*
* We don't support pushing join clauses into the quals of a tidscan, but
* it could still have required parameterization due to LATERAL refs in
- * its tlist.
+ * its tlist. To be tidy, we disallow TID scans as the unexecuted scan
+ * node of an ON CONFLICT UPDATE auxiliary query, even though there is no
+ * reason to think that would be harmful; the optimizer should always
+ * prefer a SeqScan or Result node (actually, we assert that it's one of
+ * those two in several places, so accepting TID scans would break those).
*/
required_outer = rel->lateral_relids;
tidquals = TidQualFromRestrictinfo(rel->baserestrictinfo, rel->relid);
- if (tidquals)
+ if (tidquals && root->parse->specClause != SPEC_UPDATE)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
required_outer));
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 655be81..e8eed55 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4811,7 +4811,8 @@ make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam)
+ List *rowMarks, Plan *onConflictPlan, SpecType spec,
+ int epqParam)
{
ModifyTable *node = makeNode(ModifyTable);
Plan *plan = &node->plan;
@@ -4860,6 +4861,9 @@ make_modifytable(PlannerInfo *root,
node->resultRelations = resultRelations;
node->resultRelIndex = -1; /* will be set correctly in setrefs.c */
node->plans = subplans;
+ node->spec = spec;
+ node->arbiterIndex = InvalidOid;
+ node->onConflictPlan = onConflictPlan;
node->withCheckOptionLists = withCheckOptionLists;
node->returningLists = returningLists;
node->rowMarks = rowMarks;
@@ -4912,6 +4916,16 @@ make_modifytable(PlannerInfo *root,
}
node->fdwPrivLists = fdw_private_list;
+ /*
+ * If a set of unique index inference expressions was provided (for
+ * INSERT...ON CONFLICT UPDATE/IGNORE), then infer appropriate
+ * unique index (or throw an error if none is available). It's
+ * possible that there will be a costing step in the event of
+ * having to choose between multiple alternatives.
+ */
+ if (root->parse->arbiterExpr)
+ node->arbiterIndex = infer_unique_index(root);
+
return node;
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9cbbcfb..4e154fb 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -612,7 +612,55 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
+
+ if (parse->onConflict)
+ {
+ Query *conflictQry = (Query *) parse->onConflict;
+ ModifyTable *parent = (ModifyTable *) plan;
+
+ /*
+ * An ON CONFLICT UPDATE query is a subquery of its parent
+ * INSERT ModifyTable, but isn't formally a subplan -- it's an
+ * "auxiliary" plan.
+ *
+ * During execution, the auxiliary plan state is used to
+ * execute the UPDATE query in an ad-hoc manner, driven by the
+ * parent. The executor will only ever execute the auxiliary
+ * plan through its parent. onConflictPlan is "auxiliary" to
+ * its parent in the sense that it's strictly encapsulated from
+ * other code (for example, the executor does not separately
+ * track it within estate as a plan that needs to have
+ * execution finished when it appears within a data-modifying
+ * CTE -- only the parent is specifically tracked in that
+ * manner).
+ *
+ * There is a fundamental nexus between parent and auxiliary
+ * plans that makes a fully unified representation seem
+ * compelling (a "CMD_UPSERT" ModifyTable plan and Query).
+ * That would obviate the need to specially track auxiliary
+ * state across all stages of execution just for this case; the
+ * optimizer would then not have to generate a fully-formed,
+ * independent UPDATE subquery plan (with a scanstate only
+ * useful for EvalPlanQual() re-evaluation). However, it's
+ * convenient to plan each ModifyTable separately, as doing so
+ * maximizes code reuse. The alternative must be to introduce
+ * abstractions that (for example) allow a single "CMD_UPSERT"
+ * ModifyTable to have two distinct types of targetlist (that
+ * will need to be processed differently during parsing and
+ * rewriting anyway). The "auxiliary" UPDATE plan is a good
+ * trade-off between a fully-fledged "CMD_UPSERT"
+ * representation, and the opposite extreme of tracking two
+ * separate ModifyTable nodes, joined by a contrived join type,
+ * with (for example) odd properties around tuple visibility
+ * not well encapsulated.
+ */
+ parent->onConflictPlan = subquery_planner(glob, conflictQry,
+ root, hasRecursion,
+ 0, NULL);
+ }
}
}
@@ -1056,6 +1104,8 @@ inheritance_planner(PlannerInfo *root)
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
}
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5d865b0..3368173 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -779,9 +779,28 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
* global list.
*/
splan->resultRelIndex = list_length(root->glob->resultRelations);
- root->glob->resultRelations =
- list_concat(root->glob->resultRelations,
- list_copy(splan->resultRelations));
+
+ if (!splan->onConflictPlan)
+ {
+ /*
+ * Only actually append result relation for non-auxiliary
+ * ModifyTable plans
+ */
+ root->glob->resultRelations =
+ list_concat(root->glob->resultRelations,
+ list_copy(splan->resultRelations));
+ }
+ else
+ {
+ splan->onConflictPlan = (Plan *) set_plan_refs(root,
+ (Plan *) splan->onConflictPlan,
+ rtoffset);
+ /*
+ * Set up the visible plan targetlist as being the same as
+ * the parent. Again, this is for the use of EXPLAIN only.
+ */
+ splan->onConflictPlan->targetlist = splan->plan.targetlist;
+ }
}
break;
case T_Append:
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 78fb6b1..f7a0523 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2345,6 +2345,12 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
valid_params,
scan_params));
}
+
+ /*
+ * No need to directly handle onConflictPlan here, since it
+ * cannot have params (due to parse analysis enforced
+ * restrictions prohibiting subqueries).
+ */
}
break;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 265c865..6868f25 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -31,6 +31,7 @@
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
+#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
@@ -125,10 +126,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/*
* Make list of indexes. Ignore indexes on system catalogs if told to.
- * Don't bother with indexes for an inheritance parent, either.
+ * Don't bother with indexes for an inheritance parent or speculative
+ * insertion UPDATE auxiliary queries, either.
*/
if (inhparent ||
- (IgnoreSystemIndexes && IsSystemRelation(relation)))
+ (IgnoreSystemIndexes && IsSystemRelation(relation)) ||
+ root->parse->specClause == SPEC_UPDATE)
hasindex = false;
else
hasindex = relation->rd_rel->relhasindex;
@@ -394,6 +397,221 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
}
/*
+ * infer_unique_index -
+ * Retrieves unique index to arbitrate speculative insertion.
+ *
+ * Uses user-supplied inference clause expressions and predicate to match a
+ * unique index from those defined and ready on the heap relation (target). An
+ * exact match is required on columns/expressions (although they can appear in
+ * any order). However, the predicate given by the user need only restrict
+ * insertion to a subset of some part of the table covered by some particular
+ * unique index (in particular, a partial unique index) in order to be
+ * inferred.
+ *
+ * The implementation does not consider which B-Tree operator class any
+ * particular available unique index uses. In particular, there is no system
+ * dependency on the default operator class for the purposes of inference.
+ * This should be okay, since by convention non-default opclasses only
+ * introduce alternative sort orders, not alternative notions of equality
+ * (the only known exceptions to this convention are trivial: cases where
+ * the "equals" operators of a type's opclasses do not match, which exist
+ * precisely to discourage user code from using the divergent opclass).
+ * Even if we assume that a type could usefully have multiple
+ * alternative concepts of equality, surely the definition actually implied by
+ * the operator class of actually indexed attributes is pertinent. However,
+ * this is a bit of a wart, because strictly speaking there is leeway for a
+ * query to be interpreted in deference to available unique indexes, and
+ * indexes are traditionally only an implementation detail. It hardly seems
+ * worth it to waste cycles on this corner case, though.
+ *
+ * This logic somewhat mirrors get_relation_info(). This process is not
+ * deferred to a get_relation_info() call while planning because there may not
+ * be any such call. In the ON CONFLICT UPDATE case get_relation_info() will
+ * be called, for auxiliary query planning, but even then indexes won't be
+ * examined since they're not generally interesting to that case (building
+ * index paths is explicitly avoided for auxiliary query planning, in fact).
+ */
+Oid
+infer_unique_index(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+ Relation relation;
+ Oid relationObjectId;
+ Bitmapset *plainAttrs = NULL;
+ List *candidates = NIL;
+ ListCell *l;
+ List *indexList;
+
+ Assert(parse->specClause == SPEC_INSERT ||
+ parse->specClause == SPEC_IGNORE);
+
+ /*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+ relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+ relation = heap_open(relationObjectId, NoLock);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(l, parse->arbiterExpr)
+ {
+ Expr *elem;
+ Var *var;
+ int attno;
+
+ elem = (Expr *) lfirst(l);
+
+ /*
+ * Parse analysis of inference elements performs full parse analysis of
+ * Vars, even for non-expression indexes (in contrast with utility
+ * command related use of IndexElem). However, indexes are cataloged
+ * with simple attribute numbers for non-expression indexes.
+ * Therefore, we must build a compatible bms representation here.
+ */
+ if (!IsA(elem, Var))
+ continue;
+
+ var = (Var *) elem;
+ attno = var->varattno;
+
+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("system columns may not appear in unique index inference specification")));
+ else if (attno == 0)
+ elog(ERROR, "whole row unique index inference specifications are not valid");
+
+ plainAttrs = bms_add_member(plainAttrs, attno);
+ }
+
+ indexList = RelationGetIndexList(relation);
+
+ /*
+ * Using that representation, iterate through the list of indexes on the
+ * target relation to try and find a match
+ */
+ foreach(l, indexList)
+ {
+ Oid indexoid = lfirst_oid(l);
+ Relation idxRel;
+ Form_pg_index idxForm;
+ Bitmapset *indexedPlainAttrs = NULL;
+ List *idxExprs;
+ List *predExprs;
+ List *whereExplicit;
+ AttrNumber natt;
+ ListCell *e;
+
+ /*
+ * Extract info from the relation descriptor for the index. We know
+ * that this is a target, so take the lock type that the executor will
+ * ultimately require.
+ *
+ * Let executor complain about !indimmediate case directly.
+ */
+ idxRel = index_open(indexoid, RowExclusiveLock);
+ idxForm = idxRel->rd_index;
+
+ if (!idxForm->indisunique ||
+ !IndexIsValid(idxForm))
+ goto next;
+
+ /*
+ * If the index is valid, but cannot yet be used, ignore it. See
+ * src/backend/access/heap/README.HOT for discussion.
+ */
+ if (idxForm->indcheckxmin &&
+ !TransactionIdPrecedes(HeapTupleHeaderGetXmin(idxRel->rd_indextuple->t_data),
+ TransactionXmin))
+ goto next;
+
+ /* Collect the index's plain attributes; expressions have attno 0 */
+ for (natt = 0; natt < idxForm->indnatts; natt++)
+ {
+ int attno = idxRel->rd_index->indkey.values[natt];
+
+ if (attno < 0)
+ elog(ERROR, "system column in index");
+
+ if (attno != 0)
+ indexedPlainAttrs = bms_add_member(indexedPlainAttrs, attno);
+ }
+
+ /*
+ * Since expressions were made unique during parse analysis, it's
+ * evident that we cannot proceed with this index if the number of
+ * attributes (plain or expression) does not match exactly. This
+ * precludes support for unique indexes created with redundantly
+ * referenced columns (which are not forbidden by CREATE INDEX), but
+ * this seems inconsequential.
+ */
+ if (list_length(parse->arbiterExpr) != idxForm->indnatts)
+ goto next;
+
+ idxExprs = RelationGetIndexExpressions(idxRel);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(e, parse->arbiterExpr)
+ {
+ Expr *elem = (Expr *) lfirst(e);
+
+ /* Plain Vars were already separately accounted for */
+ if (IsA(elem, Var))
+ continue;
+
+ if (!list_member(idxExprs, elem))
+ goto next;
+ }
+
+ /* Non-expression attributes (if any) must match */
+ if (!bms_equal(indexedPlainAttrs, plainAttrs))
+ goto next;
+
+ /*
+ * The cataloged index definition's predicate (if any) need only be
+ * implied by the user-supplied ON CONFLICT inference WHERE clause
+ */
+ predExprs = RelationGetIndexPredicate(idxRel);
+ whereExplicit = make_ands_implicit((Expr *) parse->arbiterWhere);
+
+ if (!predicate_implied_by(predExprs, whereExplicit))
+ goto next;
+
+ candidates = lappend_oid(candidates, idxForm->indexrelid);
+next:
+ index_close(idxRel, NoLock);
+ }
+
+ list_free(indexList);
+ heap_close(relation, NoLock);
+
+ /*
+ * In the common case where there is only a single candidate unique index,
+ * there is clearly no point in building index paths to determine which is
+ * cheapest.
+ */
+ if (list_length(candidates) == 1)
+ return linitial_oid(candidates);
+ else if (candidates == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT")));
+ else
+ /* Otherwise, deduce the least expensive unique index */
+ return plan_speculative_use_index(root, candidates);
+
+ return InvalidOid; /* keep compiler quiet */
+}
+
+/*
* estimate_rel_size - estimate # pages and # tuples in a table or index
*
* We also estimate the fraction of the pages that are marked all-visible in
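
Reviewer's aside on infer_unique_index(): for plain (non-expression) columns, the matching rule in the hunk above is "same number of attributes, and the same set of attribute numbers in any order". The sketch below shows just that rule. It is an illustration under stated assumptions, not the patch's code: attrs_to_mask() and plain_attrs_match() are hypothetical names, a 64-bit mask stands in for the Bitmapset the patch uses (so attribute numbers must be below 64 here), and expression matching and the partial-index predicate test are left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Build a bitmask from attribute numbers.  Entries <= 0 are skipped; in an
 * index key array, 0 marks an expression column rather than a plain one.
 */
static uint64_t
attrs_to_mask(const int *attnos, size_t n)
{
    uint64_t    mask = 0;
    size_t      i;

    for (i = 0; i < n; i++)
    {
        if (attnos[i] > 0 && attnos[i] < 64)
            mask |= UINT64_C(1) << attnos[i];
    }
    return mask;
}

/*
 * Return true if the inferred plain columns match the index's key columns:
 * the counts must be equal and the sets of attribute numbers identical,
 * regardless of order.
 */
bool
plain_attrs_match(const int *inferred, size_t ninferred,
                  const int *indexed, size_t nindexed)
{
    /* A differing attribute count can never be an exact match */
    if (ninferred != nindexed)
        return false;

    return attrs_to_mask(inferred, ninferred) ==
           attrs_to_mask(indexed, nindexed);
}
```

The count check is what rejects an index declared with a redundantly repeated column, which is the limitation the patch's comment calls inconsequential.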
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index df89065..caaa44c 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -387,6 +387,8 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -408,6 +410,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
{
Query *qry = makeNode(Query);
SelectStmt *selectStmt = (SelectStmt *) stmt->selectStmt;
+ SpecType spec = stmt->confClause ? stmt->confClause->specclause : SPEC_NONE;
List *exprList = NIL;
bool isGeneralSelect;
List *sub_rtable;
@@ -425,6 +428,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
qry->commandType = CMD_INSERT;
pstate->p_is_insert = true;
+ pstate->p_is_speculative = spec != SPEC_NONE;
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -478,8 +482,9 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
* mentioned in the SELECT part. Note that the target table is not added
* to the joinlist or namespace.
*/
- qry->resultRelation = setTargetTable(pstate, stmt->relation,
- false, false, ACL_INSERT);
+ qry->resultRelation = setTargetTable(pstate, stmt->relation, false, false,
+ ACL_INSERT |
+ (spec == SPEC_INSERT ? ACL_UPDATE : 0));
/* Validate stmt->cols list, or build default list if no list given */
icolumns = checkInsertTargets(pstate, stmt->cols, &attrnos);
@@ -741,12 +746,13 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
}
/*
- * If we have a RETURNING clause, we need to add the target relation to
- * the query namespace before processing it, so that Var references in
- * RETURNING will work. Also, remove any namespace entries added in a
- * sub-SELECT or VALUES list.
+ * If we have a RETURNING clause, or there are attributes used as the
+ * condition on which to take an alternative ON CONFLICT path, we need to
+ * add the target relation to the query namespace before processing it, so
+ * that Var references in RETURNING/the alternative path key will work.
+ * Also, remove any namespace entries added in a sub-SELECT or VALUES list.
*/
- if (stmt->returningList)
+ if (stmt->returningList || stmt->confClause)
{
pstate->p_namespace = NIL;
addRTEtoQuery(pstate, pstate->p_target_rangetblentry,
@@ -758,8 +764,66 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
-
+ qry->specClause = spec;
qry->hasSubLinks = pstate->p_hasSubLinks;
+ qry->onConflict = NULL;
+
+ if (stmt->confClause)
+ {
+ /*
+ * ON CONFLICT UPDATE requires special parse analysis of auxiliary
+ * update Query
+ */
+ if (stmt->confClause->updatequery)
+ {
+ UpdateStmt *pupd;
+ Query *dqry;
+ ParseState *sub_pstate = make_parsestate(pstate);
+ RangeTblEntry *subTarget;
+
+ pupd = (UpdateStmt *) stmt->confClause->updatequery;
+
+ if (!IsA(pupd, UpdateStmt))
+ elog(ERROR, "unrecognized statement in ON CONFLICT clause");
+
+ /* Assign same target relation as parent InsertStmt */
+ pupd->relation = stmt->relation;
+
+ /*
+ * The optimizer is not prepared to accept a subquery RTE for a
+ * non-CMD_SELECT Query. The CMD_UPDATE Query is tracked as
+ * special auxiliary state, while there is more or less analogous
+ * auxiliary state tracked in later stages of query execution.
+ */
+ dqry = transformStmt(sub_pstate, (Node *) pupd);
+ dqry->specClause = SPEC_UPDATE;
+ dqry->canSetTag = false;
+
+ /* Save auxiliary query */
+ qry->onConflict = (Node *) dqry;
+
+ /*
+ * Mark parent Query as requiring appropriate UPDATE/SELECT
+ * privileges
+ */
+ subTarget = sub_pstate->p_target_rangetblentry;
+
+ rte->updatedCols = bms_copy(subTarget->updatedCols);
+ rte->selectedCols = bms_union(rte->selectedCols,
+ subTarget->selectedCols);
+
+ free_parsestate(sub_pstate);
+ }
+
+ /*
+ * Transform the inference specification (columns/expressions). The
+ * planner later uses it to infer the unique index that arbitrates
+ * whether to take the alternative ON CONFLICT path (i.e. whether to
+ * INSERT or UPDATE/IGNORE in respect of each slot proposed for insertion).
+ */
+ transformConflictClause(pstate, stmt->confClause, &qry->arbiterExpr,
+ &qry->arbiterWhere);
+ }
assign_query_collations(pstate, qry);
@@ -1006,6 +1070,8 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -1906,6 +1972,10 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->commandType = CMD_UPDATE;
pstate->p_is_update = true;
+ pstate->p_is_speculative = (pstate->parentParseState &&
+ (!pstate->p_parent_cte &&
+ pstate->parentParseState->p_is_insert &&
+ pstate->parentParseState->p_is_speculative));
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -1915,6 +1985,18 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
}
+ /*
+ * Having established that this is a speculative insertion's auxiliary
+ * update, do not allow the query to access parent parse state. This is a
+ * bit of a kludge, but is the most direct way of making parent RTEs
+ * invisible. If we failed to take this measure, the parent's spuriously
+ * visible target could be illegally referenced within the auxiliary query
+ * were it to use the original target table name (rather than the standard
+ * TARGET.* alias).
+ */
+ if (pstate->p_is_speculative)
+ pstate->parentParseState = NULL;
+
qry->resultRelation = setTargetTable(pstate, stmt->relation,
interpretInhOption(stmt->relation->inhOpt),
true,
@@ -1947,6 +2029,8 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 679e1bb..10b7199 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -215,6 +215,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RangeVar *range;
IntoClause *into;
WithClause *with;
+ InferClause *infer;
+ ConflictClause *conf;
A_Indices *aind;
ResTarget *target;
struct PrivTarget *privtarget;
@@ -415,6 +417,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <defelt> SeqOptElem
%type <istmt> insert_rest
+%type <infer> opt_conf_expr
+%type <conf> opt_on_conflict
%type <vsetstmt> generic_set set_rest set_rest_more generic_reset reset_rest
SetResetClause FunctionSetResetClause
@@ -513,6 +517,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> cte_list
%type <list> within_group_clause
+%type <node> UpdateInsertStmt
%type <node> filter_clause
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
@@ -551,8 +556,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
CACHE CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLLATION COLUMN COMMENT COMMENTS COMMIT
- COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
- CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
+ COMMITTED CONCURRENTLY CONFIGURATION CONFLICT CONNECTION CONSTRAINT
+ CONSTRAINTS CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
CROSS CSV CURRENT_P
CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
@@ -572,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+ IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -652,6 +657,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
+%nonassoc DISTINCT
+%nonassoc ON
%left POSTFIXOP /* dummy for postfix Op rules */
/*
* To support target_el without AS, we must give IDENT an explicit priority
@@ -9399,10 +9406,12 @@ DeallocateStmt: DEALLOCATE name
*****************************************************************************/
InsertStmt:
- opt_with_clause INSERT INTO qualified_name insert_rest returning_clause
+ opt_with_clause INSERT INTO qualified_name insert_rest
+ opt_on_conflict returning_clause
{
$5->relation = $4;
- $5->returningList = $6;
+ $5->confClause = $6;
+ $5->returningList = $7;
$5->withClause = $1;
$$ = (Node *) $5;
}
@@ -9447,6 +9456,44 @@ insert_column_item:
}
;
+opt_on_conflict:
+ ON CONFLICT opt_conf_expr UpdateInsertStmt
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_INSERT;
+ $$->infer = $3;
+ $$->updatequery = $4;
+ $$->location = @1;
+ }
+ |
+ ON CONFLICT opt_conf_expr IGNORE_P
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_IGNORE;
+ $$->infer = $3;
+ $$->updatequery = NULL;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
+opt_conf_expr:
+ '(' index_params where_clause ')'
+ {
+ $$ = makeNode(InferClause);
+ $$->indexElems = $2;
+ $$->whereClause = $3;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
returning_clause:
RETURNING target_list { $$ = $2; }
| /* EMPTY */ { $$ = NIL; }
@@ -9546,6 +9593,21 @@ UpdateStmt: opt_with_clause UPDATE relation_expr_opt_alias
}
;
+UpdateInsertStmt: UPDATE
+ SET set_clause_list
+ where_clause
+ {
+ UpdateStmt *n = makeNode(UpdateStmt);
+ n->relation = NULL;
+ n->targetList = $3;
+ n->fromClause = NULL;
+ n->whereClause = $4;
+ n->returningList = NULL;
+ n->withClause = NULL;
+ $$ = (Node *) n;
+ }
+ ;
+
set_clause_list:
set_clause { $$ = $1; }
| set_clause_list ',' set_clause { $$ = list_concat($1,$3); }
@@ -13188,6 +13250,7 @@ unreserved_keyword:
| COMMIT
| COMMITTED
| CONFIGURATION
+ | CONFLICT
| CONNECTION
| CONSTRAINTS
| CONTENT_P
@@ -13247,6 +13310,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE_P
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7b0e668..82ac526 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -342,6 +342,10 @@ transformAggregateCall(ParseState *pstate, Aggref *agg,
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("aggregate functions are not allowed in ON CONFLICT UPDATE");
+
if (err)
ereport(ERROR,
(errcode(ERRCODE_GROUPING_ERROR),
@@ -671,6 +675,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
* which is sane anyway.
*/
}
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("window functions are not allowed in ON CONFLICT UPDATE");
+
if (err)
ereport(ERROR,
(errcode(ERRCODE_WINDOWING_ERROR),
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 654dce6..6487559 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -75,6 +75,8 @@ static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node,
List **tlist, ParseExprKind exprKind);
static int get_matching_location(int sortgroupref,
List *sortgrouprefs, List *exprs);
+static List *resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel);
static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle,
List *grouplist, List *targetlist, int location,
bool resolveUnknown);
@@ -2166,6 +2168,167 @@ get_matching_location(int sortgroupref, List *sortgrouprefs, List *exprs)
}
/*
+ * resolve_unique_index_expr
+ * Infer a unique index from a list of indexElems, for ON
+ * CONFLICT UPDATE/IGNORE
+ *
+ * Perform parse analysis of expressions and columns appearing within ON
+ * CONFLICT clause. During planning, the returned list of expressions is used
+ * to infer which unique index to use.
+ */
+static List *
+resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel)
+{
+ List *clauseexprs = NIL;
+ ListCell *l;
+
+ if (heapRel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" is not an ordinary table",
+ RelationGetRelationName(heapRel)),
+ errhint("Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ if (heapRel->rd_rel->relhassubclass)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" has inheritance children",
+ RelationGetRelationName(heapRel)),
+ errhint("Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ foreach(l, infer->indexElems)
+ {
+ IndexElem *ielem = (IndexElem *) lfirst(l);
+ Node *trans;
+
+ /*
+ * Raw grammar re-uses CREATE INDEX infrastructure for unique index
+ * inference clause, and so will accept opclasses by name and so on.
+ * Reject these here explicitly.
+ */
+ if (ielem->ordering != SORTBY_DEFAULT ||
+ ielem->nulls_ordering != SORTBY_NULLS_DEFAULT)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT does not accept ordering or NULLS FIRST/LAST specifications"),
+ errhint("These factors do not affect uniqueness of indexed datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->collation != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT collation specification is unnecessary"),
+ errhint("Collations do not affect uniqueness of collatable datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->opclass != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ON CONFLICT cannot accept non-default operator class specifications"),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (!ielem->expr)
+ {
+ /* Simple index attribute */
+ ColumnRef *n;
+
+ /*
+ * Grammar won't have built raw expression for us in event of plain
+ * column reference. Create one directly, and perform expression
+ * transformation, which seems better principled than simply
+ * propagating catalog-style simple attribute numbers. For
+ * example, it means the Var is marked for SELECT privileges, which
+ * speculative insertion requires. Planner expects this, and
+ * performs its own normalization for the purposes of matching
+ * against pg_index.
+ */
+ n = makeNode(ColumnRef);
+ n->fields = list_make1(makeString(ielem->name));
+ /* Location is approximately that of inference specification */
+ n->location = infer->location;
+ trans = (Node *) n;
+ }
+ else
+ {
+ /* Do parse transformation of the raw expression */
+ trans = (Node *) ielem->expr;
+ }
+
+ /*
+ * transformExpr() should have already rejected subqueries,
+ * aggregates, and window functions, based on the EXPR_KIND_ for an
+ * index expression. Expressions returning sets won't have been
+ * rejected, but don't bother doing so here; there should be no
+ * available expression unique index to match any such expression
+ * against anyway.
+ */
+ trans = transformExpr(pstate, trans, EXPR_KIND_INDEX_EXPRESSION);
+ /* Save in list of transformed expressions */
+ clauseexprs = list_append_unique(clauseexprs, trans);
+ }
+
+ return clauseexprs;
+}
+
+/*
+ * transformConflictClause -
+ * transform expressions of ON CONFLICT UPDATE/IGNORE.
+ *
+ * The transformed expressions are used to infer one unique index relation to
+ * serve as an ON CONFLICT arbiter. Partial unique indexes may be inferred
+ * using the WHERE clause from the inference specification clause.
+ */
+void
+transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere)
+{
+ InferClause *infer = confClause->infer;
+
+ if (confClause->specclause == SPEC_INSERT && !infer)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from"),
+ parser_errposition(pstate,
+ exprLocation((Node *) confClause))));
+
+ /* Raw grammar must ensure this invariant holds */
+ Assert(confClause->specclause != SPEC_INSERT ||
+ confClause->updatequery != NULL);
+
+ /*
+ * If there is no inference clause, the target might be an updatable view;
+ * these are supported by ON CONFLICT IGNORE (without columns/expressions
+ * specified to infer a unique index from -- specifying them is mandatory
+ * for the UPDATE variant). It might also be a relation with inheritance
+ * children, which would also make inference fail.
+ */
+ if (infer)
+ {
+ *arbiterExpr = resolve_unique_index_expr(pstate, infer,
+ pstate->p_target_relation);
+
+ /* Handle inference WHERE clause (for partial unique index inference) */
+ if (infer->whereClause)
+ *arbiterWhere = transformExpr(pstate, infer->whereClause,
+ EXPR_KIND_INDEX_PREDICATE);
+ }
+
+ /*
+ * It's convenient to form a list of expressions based on the
+ * representation used by CREATE INDEX, since the same restrictions are
+ * appropriate (on subqueries and so on). However, from here on, the
+ * handling of those expressions is identical to ordinary optimizable
+ * statements. In particular, assign_query_collations() can be trusted to
+ * do the right thing with the post parse analysis query tree inference
+ * clause representation.
+ */
+}
+
+/*
* addTargetToSortList
* If the given targetlist entry isn't already in the SortGroupClause
* list, add it to the end of the list, using the given sort ordering
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f0f0488..70bf80f 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1564,6 +1564,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("cannot use subquery in ON CONFLICT UPDATE");
if (err)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index fab2948..5ab0cba 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -2961,6 +2961,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
CmdType event = parsetree->commandType;
bool instead = false;
bool returning = false;
+ bool updatableview = false;
Query *qual_product = NULL;
List *rewritten = NIL;
ListCell *lc1;
@@ -3094,6 +3095,19 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
/* Process just the main targetlist */
rewriteTargetListIU(parsetree, rt_entry_relation, NULL);
}
+
+ if (parsetree->specClause == SPEC_INSERT)
+ {
+ Query *qry;
+
+ /*
+ * While user-defined rules will never be applied in the
+ * auxiliary update query, normalization of tlist is still
+ * required
+ */
+ qry = (Query *) parsetree->onConflict;
+ rewriteTargetListIU(qry, rt_entry_relation, NULL);
+ }
}
else if (event == CMD_UPDATE)
{
@@ -3160,6 +3174,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
instead = true;
returning = true;
+ updatableview = true;
}
/*
@@ -3240,6 +3255,17 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
}
}
+ /*
+ * Updatable views are supported on a limited basis by ON CONFLICT
+ * IGNORE (if there is no unique index inference required, speculative
+ * insertion proceeds).
+ */
+ if (parsetree->specClause != SPEC_NONE && product_queries &&
+ !updatableview)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ON CONFLICT is not supported with rules")));
+
heap_close(rt_entry_relation, NoLock);
}
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index ad1cd4b..0c86f8e 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -421,6 +421,13 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
latestXid))
ShmemVariableCache->latestCompletedXid = latestXid;
+ /* Also clear any speculative insertion information */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+
LWLockRelease(ProcArrayLock);
}
else
@@ -438,6 +445,11 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
pgxact->delayChkpt = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
Assert(pgxact->nxids == 0);
Assert(pgxact->overflowed == false);
@@ -476,6 +488,13 @@ ProcArrayClearTransaction(PGPROC *proc)
/* Clear the subtransaction-XID cache too */
pgxact->nxids = 0;
pgxact->overflowed = false;
+
+ /* these should be clear, but just in case.. */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
}
/*
@@ -1108,6 +1127,83 @@ TransactionIdIsActive(TransactionId xid)
return result;
}
+void
+SetSpeculativeInsertionToken(uint32 token)
+{
+ MyProc->specInsertToken = token;
+}
+
+void
+SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel = relnode;
+ ItemPointerCopy(tid, &MyProc->specInsertTid);
+ LWLockRelease(ProcArrayLock);
+}
+
+void
+ClearSpeculativeInsertionState(void)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * Returns the speculative insertion token to wait on until the insertion
+ * finishes, or 0 if the given transaction is not speculatively inserting
+ * the given tuple.
+ */
+uint32
+SpeculativeInsertionIsInProgress(TransactionId xid, RelFileNode rel,
+ ItemPointer tid)
+{
+ uint32 result = 0;
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+
+ if (TransactionIdPrecedes(xid, RecentXmin))
+ return 0;
+
+ /*
+ * Get the top transaction id.
+ *
+ * XXX We could search the proc array first, like
+ * TransactionIdIsInProgress() does, but this isn't performance-critical.
+ */
+ xid = SubTransGetTopmostTransaction(xid);
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+
+ if (pgxact->xid == xid)
+ {
+ /*
+ * Found the backend. Is it doing a speculative insertion of the
+ * given tuple?
+ */
+ if (RelFileNodeEquals(proc->specInsertRel, rel) &&
+ ItemPointerEquals(tid, &proc->specInsertTid))
+ result = proc->specInsertToken;
+
+ break;
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+
+ return result;
+}
+
/*
* GetOldestXmin -- returns oldest transaction that was running
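The lookup that SpeculativeInsertionIsInProgress() performs can be modelled outside the backend roughly as follows. This is only a sketch under stated assumptions, not the patch's code: `ModelProc`, `ModelRel`, `ModelTid` and `model_spec_token()` are hypothetical names, a flat array stands in for the proc array, and both ProcArrayLock and the subtransaction-to-top-level XID mapping are left out. What it illustrates is the contract: the result is a token, and 0 means "this transaction is not speculatively inserting that tuple".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for RelFileNode and ItemPointerData. */
typedef struct { uint32_t spcNode, dbNode, relNode; } ModelRel;
typedef struct { uint32_t block; uint16_t offset; } ModelTid;

/* Hypothetical per-backend slot holding the state the patch adds to PGPROC. */
typedef struct
{
    uint32_t xid;             /* top-level XID of the backend */
    ModelRel specInsertRel;   /* relation being speculatively inserted into */
    ModelTid specInsertTid;   /* TID of the speculatively inserted tuple */
    uint32_t specInsertToken; /* token to wait on; 0 means none */
} ModelProc;

static int
rel_equals(ModelRel a, ModelRel b)
{
    return a.spcNode == b.spcNode && a.dbNode == b.dbNode &&
           a.relNode == b.relNode;
}

static int
tid_equals(ModelTid a, ModelTid b)
{
    return a.block == b.block && a.offset == b.offset;
}

/* Return the token of the matching speculative insertion, or 0 if none. */
uint32_t
model_spec_token(const ModelProc *procs, size_t nprocs,
                 uint32_t xid, ModelRel rel, ModelTid tid)
{
    size_t i;

    assert(procs != NULL || nprocs == 0);

    for (i = 0; i < nprocs; i++)
    {
        if (procs[i].xid != xid)
            continue;

        /* Found the backend; report a token only for this exact tuple. */
        if (rel_equals(procs[i].specInsertRel, rel) &&
            tid_equals(procs[i].specInsertTid, tid))
            return procs[i].specInsertToken;

        /* At most one slot per XID, so stop looking. */
        return 0;
    }

    return 0;
}
```

A caller in this model treats a nonzero result as "wait on the (xid, token) lock" and a zero result as "fall back to waiting on the transaction itself".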
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index d13a167..7a1df22 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -575,6 +575,69 @@ ConditionalXactLockTableWait(TransactionId xid)
return true;
}
+static uint32 speculativeInsertionToken = 0;
+
+/*
+ * SpeculativeInsertionLockAcquire
+ *
+ * Insert a lock showing that the given transaction ID is inserting a tuple,
+ * but hasn't yet decided whether it's going to keep it. The lock can then be
+ * used to wait for the decision to go ahead with the insertion, or to abort
+ * it.
+ *
+ * The token is used to distinguish multiple insertions by the same
+ * transaction. A counter will do, for example.
+ */
+void
+SpeculativeInsertionLockAcquire(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ speculativeInsertionToken++;
+ SetSpeculativeInsertionToken(speculativeInsertionToken);
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ (void) LockAcquire(&tag, ExclusiveLock, false, false);
+}
+
+/*
+ * SpeculativeInsertionLockRelease
+ *
+ * Delete the lock showing that the given transaction is speculatively
+ * inserting a tuple.
+ */
+void
+SpeculativeInsertionLockRelease(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ LockRelease(&tag, ExclusiveLock, false);
+}
+
+/*
+ * SpeculativeInsertionWait
+ *
+ * Wait for the specified transaction to finish or abort the insertion of a
+ * tuple.
+ */
+void
+SpeculativeInsertionWait(TransactionId xid, uint32 token)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, token);
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(token != 0);
+
+ (void) LockAcquire(&tag, ShareLock, false, false);
+ LockRelease(&tag, ShareLock, false);
+}
+
+
/*
* XactLockTableWaitErrorContextCb
* Error context callback for transaction lock waits.
@@ -873,6 +936,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
tag->locktag_field1,
tag->locktag_field2);
break;
+ case LOCKTAG_PROMISE_TUPLE_INSERTION:
+ appendStringInfo(buf,
+ _("tuple insertion by transaction %u"),
+ tag->locktag_field1);
+ break;
case LOCKTAG_OBJECT:
appendStringInfo(buf,
_("object %u of class %u of database %u"),
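The way the acquire and release functions above pair up through the backend-local counter can be sketched in isolation. The names `ModelSpecTag`, `model_spec_acquire()` and `model_spec_release_tag()` are hypothetical, and the real lock manager calls are replaced by simply returning the tag that would be locked or unlocked; this is a simplified model, not the patch's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a LOCKTAG built from (xid, token). */
typedef struct
{
    uint32_t xid;
    uint32_t token;
} ModelSpecTag;

/* Backend-local counter, playing the role of speculativeInsertionToken. */
static uint32_t model_token = 0;

/* Start a speculative insertion: bump the counter, build the tag to lock. */
ModelSpecTag
model_spec_acquire(uint32_t xid)
{
    ModelSpecTag tag;

    model_token++;
    tag.xid = xid;
    tag.token = model_token;
    return tag;
}

/* Rebuild the tag to unlock from the current counter value. */
ModelSpecTag
model_spec_release_tag(uint32_t xid)
{
    ModelSpecTag tag;

    /* Only meaningful after at least one acquire. */
    assert(model_token != 0);
    tag.xid = xid;
    tag.token = model_token;
    return tag;
}
```

In this model the release tag matches the acquire tag only as long as no newer speculative insertion has been started in between, and two successive insertions by the same transaction get distinct tags, so a waiter holding an old token is not held up by a later insertion.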
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index a1967b69..95d62cb 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -28,6 +28,7 @@ static const char *const LockTagTypeNames[] = {
"tuple",
"transactionid",
"virtualxid",
+ "inserter transactionid",
"object",
"userlock",
"advisory"
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 777f55c..f16e6af 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -726,6 +726,17 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Assert(htup->t_tableOid != InvalidOid);
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
+ snapshot->speculativeToken = 0;
+
+ /*
+ * Never return "super-deleted" tuples
+ *
+ * XXX: Comment this code out and you'll get conflicts within
+ * ExecLockUpdateTuple(), which result in an infinite loop.
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -807,6 +818,26 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
{
+ RelFileNode rnode;
+ ForkNumber forkno;
+ BlockNumber blockno;
+
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+
+ /* tuples can only be in the main fork */
+ Assert(forkno == MAIN_FORKNUM);
+ Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
+
+ /*
+ * Set speculative token. Caller can worry about xmax, since it
+ * requires a conclusively locked row version, and a concurrent
+ * update to this tuple is a conflict for its purposes.
+ */
+ snapshot->speculativeToken =
+ SpeculativeInsertionIsInProgress(HeapTupleHeaderGetRawXmin(tuple),
+ rnode,
+ &htup->t_self);
+
snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
/* XXX shouldn't we fall through to look at xmax? */
return true; /* in insertion by other */
@@ -922,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1126,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
+ * Immediately VACUUM "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return HEAPTUPLE_DEAD;
+
+ /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 939d93d..62e760a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -28,6 +28,7 @@
#define HEAP_INSERT_SKIP_WAL 0x0001
#define HEAP_INSERT_SKIP_FSM 0x0002
#define HEAP_INSERT_FROZEN 0x0004
+#define HEAP_INSERT_SPECULATIVE 0x0008
typedef struct BulkInsertStateData *BulkInsertState;
@@ -141,7 +142,7 @@ extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
CommandId cid, int options, BulkInsertState bistate);
extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd);
+ HeapUpdateFailureData *hufd, bool killspeculative);
extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index a2ed2a0..ae21789 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -73,6 +73,8 @@
#define XLOG_HEAP_SUFFIX_FROM_OLD (1<<6)
/* last xl_heap_multi_insert record for one heap_multi_insert() call */
#define XLOG_HEAP_LAST_MULTI_INSERT (1<<7)
+/* XXX: Make sure that re-use of bits is safe here */
+#define XLOG_HEAP_KILLED_SPECULATIVE_TUPLE (XLOG_HEAP_LAST_MULTI_INSERT | XLOG_HEAP_PREFIX_FROM_OLD)
/* convenience macro for checking whether any form of old tuple was logged */
#define XLOG_HEAP_CONTAINS_OLD \
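The XXX comment on XLOG_HEAP_KILLED_SPECULATIVE_TUPLE is about re-using two existing bits as a combined flag. A small sketch of why that matters for any code that tests the flag: the `MODEL_*` names and both helper functions are hypothetical, and the value `(1 << 5)` for the prefix bit is an assumption, since that definition is not visible in this hunk. Whether the two constituent bits can legitimately appear together in a real record is exactly the question the patch leaves open.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PREFIX_FROM_OLD    (1 << 5)  /* assumed value, not in this hunk */
#define MODEL_SUFFIX_FROM_OLD    (1 << 6)
#define MODEL_LAST_MULTI_INSERT  (1 << 7)
#define MODEL_KILLED_SPECULATIVE \
    (MODEL_LAST_MULTI_INSERT | MODEL_PREFIX_FROM_OLD)

/* Naive test: true as soon as either constituent bit is set. */
int
model_killed_naive(uint8_t flags)
{
    return (flags & MODEL_KILLED_SPECULATIVE) != 0;
}

/* Strict test: true only when both constituent bits are set. */
int
model_killed_strict(uint8_t flags)
{
    /* Guard against the combined flag ever collapsing to a single bit. */
    assert(MODEL_KILLED_SPECULATIVE != MODEL_LAST_MULTI_INSERT);
    return (flags & MODEL_KILLED_SPECULATIVE) == MODEL_KILLED_SPECULATIVE;
}
```

With a combined-bit flag, only the strict form distinguishes "killed speculative tuple" from an ordinary record that happens to carry one of the two bits; the naive form misfires on a plain last-multi-insert record.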
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..9400801 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -354,14 +354,19 @@ extern void ExecCloseScanRelation(Relation scanrel);
extern void ExecOpenIndices(ResultRelInfo *resultRelInfo);
extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
+extern List *ExecLockIndexValues(TupleTableSlot *slot, EState *estate,
+ SpecType specReason);
extern List *ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
- EState *estate);
-extern bool check_exclusion_constraint(Relation heap, Relation index,
- IndexInfo *indexInfo,
- ItemPointer tupleid,
- Datum *values, bool *isnull,
- EState *estate,
- bool newIndex, bool errorOK);
+ EState *estate, bool noDupErr, Oid arbiterIdx);
+extern bool ExecCheckIndexConstraints(TupleTableSlot *slot, EState *estate,
+ ItemPointer conflictTid, Oid arbiterIdx);
+extern bool check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo,
+ ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate,
+ bool newIndex, bool errorOK,
+ bool wait, ItemPointer conflictTid);
extern void RegisterExprContextCallback(ExprContext *econtext,
ExprContextCallbackFunction function,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41288ed..19b5e29 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -41,6 +41,9 @@
* ExclusionOps Per-column exclusion operators, or NULL if none
* ExclusionProcs Underlying function OIDs for ExclusionOps
* ExclusionStrats Opclass strategy numbers for ExclusionOps
+ * UniqueOps These are like Exclusion*, but for unique indexes
+ * UniqueProcs
+ * UniqueStrats
* Unique is it a unique index?
* ReadyForInserts is it valid for inserts?
* Concurrent are we doing a concurrent index build?
@@ -62,6 +65,9 @@ typedef struct IndexInfo
Oid *ii_ExclusionOps; /* array with one entry per column */
Oid *ii_ExclusionProcs; /* array with one entry per column */
uint16 *ii_ExclusionStrats; /* array with one entry per column */
+ Oid *ii_UniqueOps; /* array with one entry per column */
+ Oid *ii_UniqueProcs; /* array with one entry per column */
+ uint16 *ii_UniqueStrats; /* array with one entry per column */
bool ii_Unique;
bool ii_ReadyForInserts;
bool ii_Concurrent;
@@ -1088,6 +1094,9 @@ typedef struct ModifyTableState
int mt_whichplan; /* which one is being executed (0..n-1) */
ResultRelInfo *resultRelInfo; /* per-subplan target relations */
List **mt_arowmarks; /* per-subplan ExecAuxRowMark lists */
+ SpecType spec; /* reason for speculative insertion */
+ Oid arbiterIndex; /* unique index to arbitrate taking alt path */
+ PlanState *onConflict; /* associated OnConflict state */
EPQState mt_epqstate; /* for evaluating EvalPlanQual rechecks */
bool fireBSTriggers; /* do we need to fire stmt triggers? */
} ModifyTableState;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 97ef0fc..cac6b15 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -412,6 +412,8 @@ typedef enum NodeTag
T_RowMarkClause,
T_XmlSerialize,
T_WithClause,
+ T_InferClause,
+ T_ConflictClause,
T_CommonTableExpr,
/*
@@ -624,4 +626,16 @@ typedef enum JoinType
(1 << JOIN_RIGHT) | \
(1 << JOIN_ANTI))) != 0)
+/* SpecType - "Speculative insertion" clause
+ *
+ * This also appears across various subsystems
+ */
+typedef enum
+{
+ SPEC_NONE, /* Not involved in speculative insertion */
+ SPEC_IGNORE, /* INSERT of "ON CONFLICT IGNORE" */
+ SPEC_INSERT, /* INSERT of "ON CONFLICT UPDATE" */
+ SPEC_UPDATE /* UPDATE of "ON CONFLICT UPDATE" */
+} SpecType;
+
#endif /* NODES_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 86d1c07..9ae3bb5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -132,6 +132,11 @@ typedef struct Query
List *withCheckOptions; /* a list of WithCheckOption's */
+ SpecType specClause; /* speculative insertion clause */
+ List *arbiterExpr; /* Unique index arbiter exprs */
+ Node *arbiterWhere; /* Unique index arbiter WHERE clause */
+ Node *onConflict; /* ON CONFLICT Query */
+
List *returningList; /* return-values list (of TargetEntry) */
List *groupClause; /* a list of SortGroupClause's */
@@ -564,7 +569,7 @@ typedef enum TableLikeOption
} TableLikeOption;
/*
- * IndexElem - index parameters (used in CREATE INDEX)
+ * IndexElem - index parameters (used in CREATE INDEX, and in ON CONFLICT)
*
* For a plain index attribute, 'name' is the name of the table column to
* index, and 'expr' is NULL. For an index expression, 'name' is NULL and
@@ -999,6 +1004,36 @@ typedef struct WithClause
} WithClause;
/*
+ * InferClause -
+ * ON CONFLICT unique index inference clause
+ *
+ * Note: InferClause does not propagate into the Query representation.
+ */
+typedef struct InferClause
+{
+ NodeTag type;
+ List *indexElems; /* IndexElems to infer unique index */
+ Node *whereClause; /* qualification (partial-index predicate) */
+ int location; /* token location, or -1 if unknown */
+} InferClause;
+
+/*
+ * ConflictClause -
+ * representation of ON CONFLICT clause
+ *
+ * Note: ConflictClause does not propagate into the Query representation.
+ * However, Query may contain onConflict child Query.
+ */
+typedef struct ConflictClause
+{
+ NodeTag type;
+ SpecType specclause; /* Variant specified */
+ InferClause *infer; /* Optional index inference clause */
+ Node *updatequery; /* Update parse stmt */
+ int location; /* token location, or -1 if unknown */
+} ConflictClause;
+
+/*
* CommonTableExpr -
* representation of WITH list element
*
@@ -1048,6 +1083,7 @@ typedef struct InsertStmt
RangeVar *relation; /* relation to insert into */
List *cols; /* optional: names of the target columns */
Node *selectStmt; /* the source SELECT/VALUES, or NULL */
+ ConflictClause *confClause; /* ON CONFLICT clause */
List *returningList; /* list of expressions to return */
WithClause *withClause; /* WITH clause */
} InsertStmt;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 316c9ce..c2269bb 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -177,6 +177,9 @@ typedef struct ModifyTable
List *resultRelations; /* integer list of RT indexes */
int resultRelIndex; /* index of first resultRel in plan's list */
List *plans; /* plan(s) producing source data */
+ SpecType spec; /* speculative insertion specification */
+ Oid arbiterIndex; /* Oid of ON CONFLICT arbiter index */
+ Plan *onConflictPlan; /* Plan for ON CONFLICT UPDATE auxiliary query */
List *withCheckOptionLists; /* per-target-table WCO lists */
List *returningLists; /* per-target-table RETURNING tlists */
List *fdwPrivLists; /* per-target-table FDW private data lists */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..801effe 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -64,6 +64,7 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
int indexcol,
List **indexcolnos,
bool *var_on_left_p);
+extern Oid plan_speculative_use_index(PlannerInfo *root, List *indexList);
/*
* tidpath.h
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index 8eb2e57..878adfe 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -28,6 +28,8 @@ extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
extern void get_relation_info(PlannerInfo *root, Oid relationObjectId,
bool inhparent, RelOptInfo *rel);
+extern Oid infer_unique_index(PlannerInfo *root);
+
extern void estimate_rel_size(Relation rel, int32 *attr_widths,
BlockNumber *pages, double *tuples, double *allvisfrac);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 082f7d7..a5f3b5a 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -84,7 +84,8 @@ extern ModifyTable *make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam);
+ List *rowMarks, Plan *onConflictPlan, SpecType spec,
+ int epqParam);
extern bool is_projection_capable_plan(Plan *plan);
/*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c243ec..cf501e6 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -87,6 +87,7 @@ PG_KEYWORD("commit", COMMIT, UNRESERVED_KEYWORD)
PG_KEYWORD("committed", COMMITTED, UNRESERVED_KEYWORD)
PG_KEYWORD("concurrently", CONCURRENTLY, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("configuration", CONFIGURATION, UNRESERVED_KEYWORD)
+PG_KEYWORD("conflict", CONFLICT, UNRESERVED_KEYWORD)
PG_KEYWORD("connection", CONNECTION, UNRESERVED_KEYWORD)
PG_KEYWORD("constraint", CONSTRAINT, RESERVED_KEYWORD)
PG_KEYWORD("constraints", CONSTRAINTS, UNRESERVED_KEYWORD)
@@ -180,6 +181,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 6a4438f..d1d0d12 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -41,6 +41,8 @@ extern List *transformDistinctClause(ParseState *pstate,
List **targetlist, List *sortClause, bool is_agg);
extern List *transformDistinctOnClause(ParseState *pstate, List *distinctlist,
List **targetlist, List *sortClause);
+extern void transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere);
extern List *addTargetToSortList(ParseState *pstate, TargetEntry *tle,
List *sortlist, List *targetlist, SortBy *sortby,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 3103b71..2b5804e 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -153,6 +153,7 @@ struct ParseState
bool p_hasModifyingCTE;
bool p_is_insert;
bool p_is_update;
+ bool p_is_speculative;
bool p_locked_from_parent;
Relation p_target_relation;
RangeTblEntry *p_target_rangetblentry;
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index f5d70e5..6bb95fc 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -76,6 +76,11 @@ extern bool ConditionalXactLockTableWait(TransactionId xid);
extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
+/* Lock an XID for tuple insertion (used to wait for an insertion to finish) */
+extern void SpeculativeInsertionLockAcquire(TransactionId xid);
+extern void SpeculativeInsertionLockRelease(TransactionId xid);
+extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
+
/* Lock a general object (other than a relation) of the current database */
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1100923..9c21810 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -176,6 +176,8 @@ typedef enum LockTagType
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
+ LOCKTAG_PROMISE_TUPLE_INSERTION, /* tuple insertion, keyed by Xid */
+ /* ID info for a transaction is its TransactionId */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
@@ -261,6 +263,14 @@ typedef struct LOCKTAG
(locktag).locktag_type = LOCKTAG_VIRTUALTRANSACTION, \
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
+ ((locktag).locktag_field1 = (xid), \
+ (locktag).locktag_field2 = (token), \
+ (locktag).locktag_field3 = 0, \
+ (locktag).locktag_field4 = 0, \
+ (locktag).locktag_type = LOCKTAG_PROMISE_TUPLE_INSERTION, \
+ (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
#define SET_LOCKTAG_OBJECT(locktag,dboid,classoid,objoid,objsubid) \
((locktag).locktag_field1 = (dboid), \
(locktag).locktag_field2 = (classoid), \
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index d194f38..47e791d 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,9 +16,11 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "storage/itemptr.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
+#include "storage/relfilenode.h"
/*
* Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
@@ -132,6 +134,14 @@ struct PGPROC
*/
SHM_QUEUE myProcLocks[NUM_LOCK_PARTITIONS];
+ /*
+ * If we're inserting a tuple, but we might still decide to kill it,
+ * pointer to that tuple.
+ */
+ RelFileNode specInsertRel;
+ ItemPointerData specInsertTid;
+ uint32 specInsertToken;
+
struct XidCache subxids; /* cache for subtransaction XIDs */
/* Per-backend LWLock. Protects fields below. */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 97c6e93..ea2bba9 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -55,6 +55,13 @@ extern TransactionId GetOldestXmin(Relation rel, bool ignoreVacuum);
extern TransactionId GetOldestActiveTransactionId(void);
extern TransactionId GetOldestSafeDecodingTransactionId(void);
+extern void SetSpeculativeInsertionToken(uint32 token);
+extern void SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid);
+extern void ClearSpeculativeInsertionState(void);
+extern uint32 SpeculativeInsertionIsInProgress(TransactionId xid,
+ RelFileNode rel,
+ ItemPointer tid);
+
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 591f0ef..7b72d18 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -86,6 +86,17 @@ typedef struct SnapshotData
bool copied; /* false if it's a static snapshot */
/*
+ * Snapshot's speculative token is a value set by HeapTupleSatisfiesDirty,
+ * indicating that the tuple is being inserted speculatively, and may yet
+ * be "super-deleted" before EOX. The caller may use the value with
+ * SpeculativeInsertionWait to wait for the inserter to decide. It is only
+ * set when a valid 'xmin' is set, too. By convention, when
+ * speculativeToken is zero, the caller must assume that it should wait on
+ * a non-speculative tuple (i.e. wait for xmin/xmax to commit).
+ */
+ uint32 speculativeToken;
+
+ /*
* note: all ids in subxip[] are >= xmin, but we don't bother filtering
* out any that are >= xmax
*/
--
1.9.1
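The speculativeToken convention described in the snapshot.h comment of the patch above can be sketched roughly as follows. This is only a toy model under stated assumptions: ToyTransactionId, ToyWaitKind, ToyLockTag and toy_choose_wait() are hypothetical names invented for illustration, and none of this is the patch's actual code. It shows only the decision a waiter makes: a zero token means "wait on the inserting transaction", a nonzero token means "wait on the speculative insertion identified by (xid, token)".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for PostgreSQL types; not the patch's actual code. */
typedef uint32_t ToyTransactionId;

typedef enum ToyWaitKind
{
    TOY_WAIT_FOR_XACT,          /* ordinary tuple: wait for the xact to end */
    TOY_WAIT_FOR_SPECULATIVE    /* speculative tuple: wait for the inserter */
} ToyWaitKind;

typedef struct ToyLockTag
{
    uint32_t    field1;         /* inserting transaction's XID */
    uint32_t    field2;         /* speculative insertion token, or 0 */
    ToyWaitKind kind;           /* what the waiter should block on */
} ToyLockTag;

/*
 * Decide what a waiter should block on, given the xmin and speculativeToken
 * reported for a conflicting tuple by a dirty-snapshot visibility check.
 */
static ToyLockTag
toy_choose_wait(ToyTransactionId xmin, uint32_t speculativeToken)
{
    ToyLockTag  tag;

    assert(xmin != 0);          /* token is only meaningful with a valid xmin */

    tag.field1 = xmin;
    tag.field2 = speculativeToken;
    tag.kind = (speculativeToken == 0) ? TOY_WAIT_FOR_XACT
                                       : TOY_WAIT_FOR_SPECULATIVE;
    return tag;
}
```

Keying the tag on both the XID and the token, as SET_LOCKTAG_SPECULATIVE_INSERTION does, presumably lets a waiter distinguish one speculative insertion from a later one by the same transaction.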
Attachment: 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch (text/x-patch; charset=US-ASCII)
From fcecd1586769f7363504e7e47541516a841b2951 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 26 Aug 2014 21:28:40 -0700
Subject: [PATCH 1/8] Make UPDATE privileges distinct from INSERT privileges in
RTEs
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held. This worked because
ExecCheckRTEPerms() was called with an ACL_INSERT or ACL_UPDATE
requiredPerms, and based on that it was evident which type of
optimizable statement was under consideration. Since historically no
type of optimizable statement could directly INSERT and UPDATE at the
same time, there was no ambiguity as to which privileges were required.
This largely mechanical commit is required infrastructure for the
INSERT...ON CONFLICT UPDATE feature, which introduces an optimizable
statement that may be subject to both INSERT and UPDATE permissions
enforcement. Tests follow in a later commit.
sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken.
---
contrib/sepgsql/dml.c | 31 ++++++++++-------
src/backend/commands/copy.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/trigger.c | 22 ++++++-------
src/backend/executor/execMain.c | 55 +++++++++++++++++++++++++------
src/backend/nodes/copyfuncs.c | 3 +-
src/backend/nodes/equalfuncs.c | 3 +-
src/backend/nodes/outfuncs.c | 3 +-
src/backend/nodes/readfuncs.c | 3 +-
src/backend/optimizer/plan/setrefs.c | 6 ++--
src/backend/optimizer/prep/prepsecurity.c | 6 ++--
src/backend/optimizer/prep/prepunion.c | 8 +++--
src/backend/parser/analyze.c | 4 +--
src/backend/parser/parse_relation.c | 21 ++++++++----
src/backend/rewrite/rewriteHandler.c | 52 ++++++++++++++++-------------
src/include/nodes/parsenodes.h | 14 ++++----
16 files changed, 152 insertions(+), 83 deletions(-)
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 36c6a37..4a71753 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -145,7 +145,8 @@ fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
static bool
check_relation_privileges(Oid relOid,
Bitmapset *selected,
- Bitmapset *modified,
+ Bitmapset *inserted,
+ Bitmapset *updated,
uint32 required,
bool abort_on_violation)
{
@@ -231,8 +232,9 @@ check_relation_privileges(Oid relOid,
* Check permissions on the columns
*/
selected = fixup_whole_row_references(relOid, selected);
- modified = fixup_whole_row_references(relOid, modified);
- columns = bms_union(selected, modified);
+ inserted = fixup_whole_row_references(relOid, inserted);
+ updated = fixup_whole_row_references(relOid, updated);
+ columns = bms_union(selected, bms_union(inserted, updated));
while ((index = bms_first_member(columns)) >= 0)
{
@@ -241,13 +243,16 @@ check_relation_privileges(Oid relOid,
if (bms_is_member(index, selected))
column_perms |= SEPG_DB_COLUMN__SELECT;
- if (bms_is_member(index, modified))
+ if (bms_is_member(index, inserted))
{
- if (required & SEPG_DB_TABLE__UPDATE)
- column_perms |= SEPG_DB_COLUMN__UPDATE;
if (required & SEPG_DB_TABLE__INSERT)
column_perms |= SEPG_DB_COLUMN__INSERT;
}
+ if (bms_is_member(index, updated))
+ {
+ if (required & SEPG_DB_TABLE__UPDATE)
+ column_perms |= SEPG_DB_COLUMN__UPDATE;
+ }
if (column_perms == 0)
continue;
@@ -304,7 +309,7 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
required |= SEPG_DB_TABLE__INSERT;
if (rte->requiredPerms & ACL_UPDATE)
{
- if (!bms_is_empty(rte->modifiedCols))
+ if (!bms_is_empty(rte->updatedCols))
required |= SEPG_DB_TABLE__UPDATE;
else
required |= SEPG_DB_TABLE__LOCK;
@@ -333,7 +338,8 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
{
Oid tableOid = lfirst_oid(li);
Bitmapset *selectedCols;
- Bitmapset *modifiedCols;
+ Bitmapset *insertedCols;
+ Bitmapset *updatedCols;
/*
* child table has different attribute numbers, so we need to fix
@@ -341,15 +347,18 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
*/
selectedCols = fixup_inherited_columns(rte->relid, tableOid,
rte->selectedCols);
- modifiedCols = fixup_inherited_columns(rte->relid, tableOid,
- rte->modifiedCols);
+ insertedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->insertedCols);
+ updatedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->updatedCols);
/*
* check permissions on individual tables
*/
if (!check_relation_privileges(tableOid,
selectedCols,
- modifiedCols,
+ insertedCols,
+ updatedCols,
required, abort_on_violation))
return false;
}
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 0e604b7..cf95aa8 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -837,7 +837,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
FirstLowInvalidHeapAttributeNumber;
if (is_from)
- rte->modifiedCols = bms_add_member(rte->modifiedCols, attno);
+ rte->insertedCols = bms_add_member(rte->insertedCols, attno);
else
rte->selectedCols = bms_add_member(rte->selectedCols, attno);
}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index abc0fe8..fc368c0 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -433,7 +433,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
rte->requiredPerms = ACL_INSERT;
for (attnum = 1; attnum <= intoRelationDesc->rd_att->natts; attnum++)
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attnum - FirstLowInvalidHeapAttributeNumber);
ExecCheckRTPerms(list_make1(rte), true);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 4899a27..3f5918f 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -65,8 +65,8 @@ int SessionReplicationRole = SESSION_REPLICATION_ROLE_ORIGIN;
/* How many levels deep into trigger execution are we? */
static int MyTriggerDepth = 0;
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
/* Local function prototypes */
static void ConvertTriggerToFK(CreateTrigStmt *stmt, Oid funcoid);
@@ -2337,7 +2337,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TriggerDesc *trigdesc;
int i;
TriggerData LocTriggerData;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
trigdesc = relinfo->ri_TrigDesc;
@@ -2346,7 +2346,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (!trigdesc->trig_update_before_statement)
return;
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
LocTriggerData.type = T_TriggerData;
LocTriggerData.tg_event = TRIGGER_EVENT_UPDATE |
@@ -2367,7 +2367,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, NULL, NULL))
+ updatedCols, NULL, NULL))
continue;
LocTriggerData.tg_trigger = trigger;
@@ -2392,7 +2392,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (trigdesc && trigdesc->trig_update_after_statement)
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
false, NULL, NULL, NIL,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
}
TupleTableSlot *
@@ -2410,7 +2410,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
HeapTuple oldtuple;
TupleTableSlot *newSlot;
int i;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
Bitmapset *keyCols;
LockTupleMode lockmode;
@@ -2419,10 +2419,10 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
* been modified, then we can use a weaker lock, allowing for better
* concurrency.
*/
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
keyCols = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc,
INDEX_ATTR_BITMAP_KEY);
- if (bms_overlap(keyCols, modifiedCols))
+ if (bms_overlap(keyCols, updatedCols))
lockmode = LockTupleExclusive;
else
lockmode = LockTupleNoKeyExclusive;
@@ -2476,7 +2476,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, trigtuple, newtuple))
+ updatedCols, trigtuple, newtuple))
continue;
LocTriggerData.tg_trigtuple = trigtuple;
@@ -2546,7 +2546,7 @@ ExecARUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
true, trigtuple, newtuple, recheckIndexes,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 5b70cc9..f6a379f 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -636,27 +636,27 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
}
/*
- * Basically the same for the mod columns, with either INSERT or
- * UPDATE privilege as specified by remainingPerms.
+ * Basically the same for the mod columns, for both INSERT and UPDATE
+ * privilege as specified by remainingPerms (INSERT...ON CONFLICT
+ * UPDATE may set both).
*/
- remainingPerms &= ~ACL_SELECT;
- if (remainingPerms != 0)
+ if (remainingPerms & ACL_INSERT)
{
/*
- * When the query doesn't explicitly change any columns, allow the
+ * When the query doesn't explicitly insert any columns, allow the
* query if we have permission on any column of the rel. This is
* to handle SELECT FOR UPDATE as well as possible corner cases in
- * INSERT and UPDATE.
+ * UPDATE.
*/
- if (bms_is_empty(rte->modifiedCols))
+ if (bms_is_empty(rte->insertedCols))
{
- if (pg_attribute_aclcheck_all(relOid, userid, remainingPerms,
+ if (pg_attribute_aclcheck_all(relOid, userid, ACL_INSERT,
ACLMASK_ANY) != ACLCHECK_OK)
return false;
}
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->insertedCols, col)) >= 0)
{
/* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -669,7 +669,42 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
else
{
if (pg_attribute_aclcheck(relOid, attno, userid,
- remainingPerms) != ACLCHECK_OK)
+ ACL_INSERT) != ACLCHECK_OK)
+ return false;
+ }
+ }
+ }
+
+ if (remainingPerms & ACL_UPDATE)
+ {
+ /*
+ * When the query doesn't explicitly update any columns, allow the
+ * query if we have permission on any column of the rel. This is
+ * to handle SELECT FOR UPDATE as well as possible corner cases in
+ * UPDATE.
+ */
+ if (bms_is_empty(rte->updatedCols))
+ {
+ if (pg_attribute_aclcheck_all(relOid, userid, ACL_UPDATE,
+ ACLMASK_ANY) != ACLCHECK_OK)
+ return false;
+ }
+
+ col = -1;
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
+ {
+ /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
+ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+
+ if (attno == InvalidAttrNumber)
+ {
+ /* whole-row reference can't happen here */
+ elog(ERROR, "whole-row update is not implemented");
+ }
+ else
+ {
+ if (pg_attribute_aclcheck(relOid, attno, userid,
+ ACL_UPDATE) != ACLCHECK_OK)
return false;
}
}
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f1a24f5..00ffe4a 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2028,7 +2028,8 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(requiredPerms);
COPY_SCALAR_FIELD(checkAsUser);
COPY_BITMAPSET_FIELD(selectedCols);
- COPY_BITMAPSET_FIELD(modifiedCols);
+ COPY_BITMAPSET_FIELD(insertedCols);
+ COPY_BITMAPSET_FIELD(updatedCols);
COPY_NODE_FIELD(securityQuals);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6e8b308..79035b2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2345,7 +2345,8 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(requiredPerms);
COMPARE_SCALAR_FIELD(checkAsUser);
COMPARE_BITMAPSET_FIELD(selectedCols);
- COMPARE_BITMAPSET_FIELD(modifiedCols);
+ COMPARE_BITMAPSET_FIELD(insertedCols);
+ COMPARE_BITMAPSET_FIELD(updatedCols);
COMPARE_NODE_FIELD(securityQuals);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index dd1278b..b4a2667 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2456,7 +2456,8 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
WRITE_UINT_FIELD(requiredPerms);
WRITE_OID_FIELD(checkAsUser);
WRITE_BITMAPSET_FIELD(selectedCols);
- WRITE_BITMAPSET_FIELD(modifiedCols);
+ WRITE_BITMAPSET_FIELD(insertedCols);
+ WRITE_BITMAPSET_FIELD(updatedCols);
WRITE_NODE_FIELD(securityQuals);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ae24d05..dbc162a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1253,7 +1253,8 @@ _readRangeTblEntry(void)
READ_UINT_FIELD(requiredPerms);
READ_OID_FIELD(checkAsUser);
READ_BITMAPSET_FIELD(selectedCols);
- READ_BITMAPSET_FIELD(modifiedCols);
+ READ_BITMAPSET_FIELD(insertedCols);
+ READ_BITMAPSET_FIELD(updatedCols);
READ_NODE_FIELD(securityQuals);
READ_DONE();
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7703946..5d865b0 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -368,9 +368,9 @@ flatten_rtes_walker(Node *node, PlannerGlobal *glob)
*
* In the flat rangetable, we zero out substructure pointers that are not
* needed by the executor; this reduces the storage space and copying cost
- * for cached plans. We keep only the alias and eref Alias fields, which
- * are needed by EXPLAIN, and the selectedCols and modifiedCols bitmaps,
- * which are needed for executor-startup permissions checking and for
+ * for cached plans. We keep only the alias and eref Alias fields, which are
+ * needed by EXPLAIN, and the selectedCols, insertedCols and updatedCols
+ * bitmaps, which are needed for executor-startup permissions checking and for
* trigger event checking.
*/
static void
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index af3ee61..f86e792 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -115,7 +115,8 @@ expand_security_quals(PlannerInfo *root, List *tlist)
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the original relation
@@ -213,7 +214,8 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Now deal with any PlanRowMark on this RTE by requesting a lock
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 05f601e..1e28363 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1367,14 +1367,16 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
* if this is the parent table, leave copyObject's result alone.
*
* Note: we need to do this even though the executor won't run any
- * permissions checks on the child RTE. The modifiedCols bitmap may
- * be examined for trigger-firing purposes.
+ * permissions checks on the child RTE. The insertedCols/updatedCols
+ * bitmaps may be examined for trigger-firing purposes.
*/
if (childOID != parentOID)
{
childrte->selectedCols = translate_col_privs(rte->selectedCols,
appinfo->translated_vars);
- childrte->modifiedCols = translate_col_privs(rte->modifiedCols,
+ childrte->insertedCols = translate_col_privs(rte->insertedCols,
+ appinfo->translated_vars);
+ childrte->updatedCols = translate_col_privs(rte->updatedCols,
appinfo->translated_vars);
}
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..df89065 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -733,7 +733,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
false);
qry->targetList = lappend(qry->targetList, tle);
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attr_num - FirstLowInvalidHeapAttributeNumber);
icols = lnext(icols);
@@ -2002,7 +2002,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
origTarget->location);
/* Mark the target column as requiring update permissions */
- target_rte->modifiedCols = bms_add_member(target_rte->modifiedCols,
+ target_rte->updatedCols = bms_add_member(target_rte->updatedCols,
attrno - FirstLowInvalidHeapAttributeNumber);
origTargetList = lnext(origTargetList);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 8d4f79f..d2820d8 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1052,7 +1052,8 @@ addRangeTableEntry(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1105,7 +1106,8 @@ addRangeTableEntryForRelation(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1183,7 +1185,8 @@ addRangeTableEntryForSubquery(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1437,7 +1440,8 @@ addRangeTableEntryForFunction(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1509,7 +1513,8 @@ addRangeTableEntryForValues(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1577,7 +1582,8 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1677,7 +1683,8 @@ addRangeTableEntryForCTE(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index b8e6e7a..fab2948 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1403,7 +1403,8 @@ ApplyRetrieveRule(Query *parsetree,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the view should remain as
@@ -1466,12 +1467,14 @@ ApplyRetrieveRule(Query *parsetree,
subrte->requiredPerms = rte->requiredPerms;
subrte->checkAsUser = rte->checkAsUser;
subrte->selectedCols = rte->selectedCols;
- subrte->modifiedCols = rte->modifiedCols;
+ subrte->insertedCols = rte->insertedCols;
+ subrte->updatedCols = rte->updatedCols;
rte->requiredPerms = 0; /* no permission check on subquery itself */
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* If FOR [KEY] UPDATE/SHARE of view, mark all the contained tables as
@@ -2584,9 +2587,9 @@ rewriteTargetView(Query *parsetree, Relation view)
/*
* For INSERT/UPDATE the modified columns must all be updatable. Note that
* we get the modified columns from the query's targetlist, not from the
- * result RTE's modifiedCols set, since rewriteTargetListIU may have added
- * additional targetlist entries for view defaults, and these must also be
- * updatable.
+ * result RTE's insertedCols and/or updatedCols set, since
+ * rewriteTargetListIU may have added additional targetlist entries for
+ * view defaults, and these must also be updatable.
*/
if (parsetree->commandType != CMD_DELETE)
{
@@ -2723,26 +2726,31 @@ rewriteTargetView(Query *parsetree, Relation view)
*
* Initially, new_rte contains selectedCols permission check bits for all
* base-rel columns referenced by the view, but since the view is a SELECT
- * query its modifiedCols is empty. We set modifiedCols to include all
- * the columns the outer query is trying to modify, adjusting the column
- * numbers as needed. But we leave selectedCols as-is, so the view owner
- * must have read permission for all columns used in the view definition,
- * even if some of them are not read by the outer query. We could try to
- * limit selectedCols to only columns used in the transformed query, but
- * that does not correspond to what happens in ordinary SELECT usage of a
- * view: all referenced columns must have read permission, even if
- * optimization finds that some of them can be discarded during query
- * transformation. The flattening we're doing here is an optional
- * optimization, too. (If you are unpersuaded and want to change this,
- * note that applying adjust_view_column_set to view_rte->selectedCols is
- * clearly *not* the right answer, since that neglects base-rel columns
- * used in the view's WHERE quals.)
+ * query its insertedCols/updatedCols is empty. We set insertedCols and
+ * updatedCols to include all the columns the outer query is trying to
+ * modify, adjusting the column numbers as needed. But we leave
+ * selectedCols as-is, so the view owner must have read permission for all
+ * columns used in the view definition, even if some of them are not read
+ * by the outer query. We could try to limit selectedCols to only columns
+ * used in the transformed query, but that does not correspond to what
+ * happens in ordinary SELECT usage of a view: all referenced columns must
+ * have read permission, even if optimization finds that some of them can
+ * be discarded during query transformation. The flattening we're doing
+ * here is an optional optimization, too. (If you are unpersuaded and want
+ * to change this, note that applying adjust_view_column_set to
+ * view_rte->selectedCols is clearly *not* the right answer, since that
+ * neglects base-rel columns used in the view's WHERE quals.)
*
* This step needs the modified view targetlist, so we have to do things
* in this order.
*/
- Assert(bms_is_empty(new_rte->modifiedCols));
- new_rte->modifiedCols = adjust_view_column_set(view_rte->modifiedCols,
+ Assert(bms_is_empty(new_rte->insertedCols) &&
+ bms_is_empty(new_rte->updatedCols));
+
+ new_rte->insertedCols = adjust_view_column_set(view_rte->insertedCols,
+ view_targetlist);
+
+ new_rte->updatedCols = adjust_view_column_set(view_rte->updatedCols,
view_targetlist);
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..86d1c07 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -717,11 +717,12 @@ typedef struct XmlSerialize
* For SELECT/INSERT/UPDATE permissions, if the user doesn't have
* table-wide permissions then it is sufficient to have the permissions
* on all columns identified in selectedCols (for SELECT) and/or
- * modifiedCols (for INSERT/UPDATE; we can tell which from the query type).
- * selectedCols and modifiedCols are bitmapsets, which cannot have negative
- * integer members, so we subtract FirstLowInvalidHeapAttributeNumber from
- * column numbers before storing them in these fields. A whole-row Var
- * reference is represented by setting the bit for InvalidAttrNumber.
+ * insertedCols and/or updatedCols (INSERT with ON CONFLICT UPDATE may
+ * have all 3). selectedCols, insertedCols and updatedCols are
+ * bitmapsets, which cannot have negative integer members, so we subtract
+ * FirstLowInvalidHeapAttributeNumber from column numbers before storing
+ * them in these fields. A whole-row Var reference is represented by
+ * setting the bit for InvalidAttrNumber.
*--------------------
*/
typedef enum RTEKind
@@ -816,7 +817,8 @@ typedef struct RangeTblEntry
AclMode requiredPerms; /* bitmask of required access permissions */
Oid checkAsUser; /* if valid, check access as this role */
Bitmapset *selectedCols; /* columns needing SELECT permission */
- Bitmapset *modifiedCols; /* columns needing INSERT/UPDATE permission */
+ Bitmapset *insertedCols; /* columns needing INSERT permission */
+ Bitmapset *updatedCols; /* columns needing UPDATE permission */
List *securityQuals; /* any security barrier quals to apply */
} RangeTblEntry;
--
1.9.1
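The parsenodes.h comment in the patch above says that column numbers are offset by FirstLowInvalidHeapAttributeNumber before being stored in selectedCols, insertedCols and updatedCols. A minimal sketch of that offsetting follows. It uses a plain 64-bit mask as a stand-in for a real Bitmapset, and the constant values and every function name here are assumptions for illustration, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed values for illustration only; the real definitions live in
 * PostgreSQL's headers and can differ between versions.
 */
#define FIRST_LOW_INVALID_ATTNO (-8)	/* stand-in for FirstLowInvalidHeapAttributeNumber */
#define INVALID_ATTNO			0		/* stand-in for InvalidAttrNumber (whole-row Var) */

/* Toy 64-bit column set standing in for a Bitmapset. */
typedef uint64_t ColSet;

/* Map an attribute number (possibly negative) to a positive bit index. */
int
col_bit(int attno)
{
	int			bit = attno - FIRST_LOW_INVALID_ATTNO;

	assert(bit > 0 && bit < 64);
	return bit;
}

/* Return the set with the given attribute number added. */
ColSet
colset_add(ColSet set, int attno)
{
	return set | ((ColSet) 1 << col_bit(attno));
}

/* Test whether the given attribute number is in the set. */
bool
colset_member(ColSet set, int attno)
{
	return (set & ((ColSet) 1 << col_bit(attno))) != 0;
}

/*
 * Hypothetical column-level check: every inserted column must be covered
 * by the INSERT grants and every updated column by the UPDATE grants.
 */
bool
cols_permitted(ColSet insertedCols, ColSet updatedCols,
			   ColSet grantedInsert, ColSet grantedUpdate)
{
	return (insertedCols & ~grantedInsert) == 0 &&
		(updatedCols & ~grantedUpdate) == 0;
}
```

With separate sets, an INSERT ... ON CONFLICT UPDATE that inserts columns 1 and 2 but only updates column 2 needs UPDATE permission on column 2 alone, which a single modifiedCols set could not express.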
On Sat, Jan 10, 2015 at 8:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I also include various bugfixes to approach #2 to value locking (these
> were all previously separately posted, but are now integrated into the
> main ON CONFLICT commit). Specifically, these are fixes for the bugs
> that emerged thanks to Jeff Janes' great work on stress testing [4].
> With these fixes, I have been unable to reproduce any problem with
> this patch with the test suite, even after many days of running the
> script on a quad-core server, with constant concurrent VACUUM runs,
> etc.
I continued with this since posting V2.0. I've run this bash script,
that invokes Jeff's script at various client counts, with runs of
various duration (since each client does a fixed amount of work):
https://github.com/petergeoghegan/jjanes_upsert/blob/master/run_test.sh
As previously discussed, Jeff's script comprehensively verifies the
correctness of the final values of a few thousand rows within a table
after many concurrent upserts, within and across upserting sessions,
and with concurrent deletions, too.
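The kind of check described above can be sketched roughly as follows. This is not Jeff's script. It is a hypothetical illustration of the invariant that each row's final count should equal the sum of the deltas that clients believe they committed. It does not model concurrent deletions, and all names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* One upsert that a client believes was committed (hypothetical record). */
typedef struct UpsertOp
{
	size_t		key;			/* row identifier, 0 .. nkeys-1 */
	long		delta;			/* amount added to the row's count */
} UpsertOp;

/* Add each committed delta to expected[key]; out-of-range keys are skipped. */
void
accumulate_expected(const UpsertOp *ops, size_t nops,
					long *expected, size_t nkeys)
{
	for (size_t i = 0; i < nops; i++)
	{
		if (ops[i].key < nkeys)
			expected[ops[i].key] += ops[i].delta;
	}
}

/*
 * Compare expected counts with the counts read back from the table.
 * On mismatch, store the first differing key in *bad_key and return false.
 */
bool
counts_match(const long *expected, const long *observed,
			 size_t nkeys, size_t *bad_key)
{
	for (size_t k = 0; k < nkeys; k++)
	{
		if (expected[k] != observed[k])
		{
			*bad_key = k;
			return false;
		}
	}
	return true;
}
```

A lost update or a double-applied update under concurrency would show up as a mismatch for the affected key.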
When building Postgres for this stress test, I included Jeff's
modifications that increase the XID burn rate artificially (I chose a
burn rate of X50). This makes anti-wraparound VACUUMs much more
frequent. I'm also looking out for outlier query execution durations,
because in theory they could indicate an unknown lock starvation
problem. I haven't seen any notable outliers after over a week of
testing.
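For readers unfamiliar with why a high XID burn rate stresses VACUUM: transaction IDs are 32-bit counters that wrap around, so ordering is decided by modulo-2^32 arithmetic. The sketch below is modeled on the comparison PostgreSQL applies to normal XIDs; special XIDs are not handled, and the type and function names are invented for the example. It also shows how a value printed as a negative number (as relfrozenid is in the instrumentation output later in this message, presumably a 32-bit XID formatted as a signed integer) maps back to an unsigned XID.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;			/* stand-in for a 32-bit transaction ID */

/*
 * True if a logically precedes b under modulo-2^32 arithmetic.
 * Only meaningful when the two XIDs are less than 2^31 apart.
 */
bool
xid_precedes(Xid a, Xid b)
{
	return (int32_t) (a - b) < 0;
}

/* Signed distance from older to current, i.e. the "age" of older. */
int32_t
xid_age(Xid current, Xid older)
{
	return (int32_t) (current - older);
}

/* Recover the unsigned XID from a value that was printed as signed. */
Xid
xid_from_signed(int32_t printed)
{
	return (Xid) printed;
}
```

Under that reading, -813636509 corresponds to XID 3481330787, which is about 202.6 million transactions older than the "recent Xid" of 3683945176 reported in the same output.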
> I think that we still need to think about the issues that
> transpired with exclusion constraints, but since I couldn't find
> another problem with an adapted version of Jeff's tool that tested
> exclusion constraints, I'm inclined to think that it should be
> possible to support exclusion constraints for the IGNORE variant.
Exclusion constraints were my focus with stress testing today. I
performed equivalent verification of upserts using exclusion
constraints (this is a hack; exclusion constraints are only intended
to be used with the IGNORE variant, but I get better test coverage
than I might otherwise this way). Unfortunately, even with the recent
bugfixes, there are still problems. On this server (rather than my
laptop), with 8 clients I can see errors like this before too long
(note that this output includes custom instrumentation from Jeff):
"""""""
6670 2015-01-17 18:02:54 PST LOG: JJ scan_all 1, relfrozenid -813636509
6670 2015-01-17 18:02:54 PST LOG: JJ freezeLimit -661025537
6670 2015-01-17 18:02:54 PST LOG: JJ freeze_min_age 50000000
vacuum_freeze_table_age 150000000 freeze_table_age 150000000 ReadNew
-611025384
6670 2015-01-17 18:02:54 PST LOG: JJ scan_all 1, relfrozenid -813636101
6670 2015-01-17 18:02:54 PST LOG: JJ transaction ID wrap limit is
1352632427, limited by database with OID 12746
6670 2015-01-17 18:02:54 PST LOG: autovacuum: done processing
database "postgres" at recent Xid of 3683945176 recent mxid of 1
6668 2015-01-17 18:02:54 PST ERROR: conflicting key value violates
exclusion constraint "upsert_race_test_index_excl"
6668 2015-01-17 18:02:54 PST DETAIL: Key (index)=(7142) conflicts
with existing key (index)=(600).
6668 2015-01-17 18:02:54 PST STATEMENT: insert into upsert_race_test
(index, count) values ('7142','1') on conflict
update set count=TARGET.count + EXCLUDED.count
where TARGET.index = EXCLUDED.index
returning upsert_race_test.count
"""""""
It's always an exclusion violation problem that I see here.
As you can see, the query involved has no "unique index inference"
specification, per the hack to make this work with exclusion
constraints (the artificially much greater XID burn rate might have
also increased the likelihood of this error dramatically). You'll also
note that the DETAIL message seems to indicate that this
btree_gist-based exclusion constraint doesn't behave like a unique
constraint at all, because the conflicting new value (7142) is not at
all the same as the existing value (600). But that's wrong -- it's
supposed to be B-Tree-like. In short, there are further race
conditions with exclusion constraints.
I think that the fundamental, unfixable race condition here is the
disconnect between index tuple insertion and checking for would-be
exclusion violations that exclusion constraints naturally have here,
that unique indexes naturally don't have [1] (note that I'm talking
only about approach #2 to value locking here; approach #1 isn't in
V2.0). I suspect that the feature is not technically feasible to make
work correctly with exclusion constraints, end of story. VACUUM
interlocking is probably also involved here, but the unfixable race
condition seems like our fundamental problem.
We could possibly spend a lot of time discussing whether or not I'm
right about it being inherently impossible to make INSERT ... ON
CONFLICT IGNORE work with exclusion constraints. However, I strongly
suggest that we cut scope and at least leave them out of any version
that can be committed for 9.5, and instead work on other areas,
because it is at least now clear that they are much harder to get
right than unique constraints. Besides, making exclusion constraints
work with INSERT ... ON CONFLICT IGNORE is nice, but ultimately not
all that important. For that matter I think that INSERT ... ON
CONFLICT IGNORE is more generally not all that important compared to
ON CONFLICT UPDATE. I'd cut scope by cutting ON CONFLICT IGNORE if
that was the consensus; we could add back ON CONFLICT IGNORE in 9.6
when we had a better sense of exclusion constraints here. Exclusion
constraints can never be useful with ON CONFLICT UPDATE anyway.
Please work with me towards a committable patch. I think we have every
chance of committing this for 9.5, with value locking approach #2,
provided we now cut scope a bit. As I mention above, V2.0 has stood up
to more than a week of aggressive, comprehensive stress testing/custom
correctness verification on an 8 core box (plus numerous other stress
tests in months past). UPSERT (which never involved exclusion
constraints) is a very comprehensive and mature effort, and I think it
now needs one big push from a senior community member. I feel that I
cannot do anything more without that input.
[1]: /messages/by-id/54A7C76D.3070101@vmware.com
--
Peter Geoghegan
On Sat, Jan 17, 2015 at 6:48 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I continued with this since posting V2.0.
Attached version (V2.1) fixes bit-rot caused by the recent changes by
Stephen ("Fix column-privilege leak in error-message paths"). More
precisely, it is rebased on top of today's 17792b commit.
I have not addressed the recently described problems with exclusion
constraints. I hope we can do so shortly. Simply removing IGNORE
support until such time as we straighten that all out (9.6?) seems
like the simplest solution. No need to block the progress of "UPSERT",
since exclusion constraint support was only ever going to be useful
for the less compelling IGNORE variant. What do other people think? Do
you agree with my view that we should shelve IGNORE support for now,
Heikki?
There is one minor bugfix here: I have tightened up the conditions
under which user-defined rule application will be rejected.
Previously, I neglected to specifically check for UPDATE rules when an
INSERT ... ON CONFLICT UPDATE statement was considered. That's been
fixed.
On the stress-testing front, I'm still running Jeff Janes' tool [1],
while also continuing to use his Postgres modifications to
artificially increase the XID burn rate. However, my personal server
is no longer used for this task. I'm using an AWS EC2 instance - a
r3.8xlarge. This server provides 32 logical cores, and uses an "Intel
Xeon E5-2670 v2 @ 2.50GHz" CPU. It seems reasonable to suppose that
any latent concurrency bugs are more likely to reveal themselves when
using the new server.
Anyone who would like access to the server should contact me
privately. It's a throw-away EC2 instance, so this isn't particularly
difficult to do.
Thanks
[1]: https://github.com/petergeoghegan/jjanes_upsert
--
Peter Geoghegan
Attachments:
0008-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch (text/x-patch; charset=US-ASCII)
From 79114cb4e3511e30ef207f0a45b8e2c024a01ad6 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 26 Sep 2014 20:59:04 -0700
Subject: [PATCH 8/8] User-visible documentation for INSERT ... ON CONFLICT
{UPDATE | IGNORE}
INSERT ... ON CONFLICT {UPDATE | IGNORE} is documented as a new clause
of the INSERT command. Some potentially surprising interactions with
triggers are noted -- BEFORE INSERT per-row triggers must fire without
the INSERT path necessarily being taken, for example.
All the existing features that INSERT ... ON CONFLICT {UPDATE | IGNORE}
interacts with have these interactions noted. This includes
postgres_fdw, updatable views, table inheritance, RLS and partial unique
indexes.
Finally, a user-level description of the new "MVCC violation" that the
ON CONFLICT UPDATE variant sometimes requires has been added to "Chapter
13 - Concurrency Control", beside existing commentary on READ COMMITTED
mode's special handling of concurrent updates. The new "MVCC violation"
introduced seems somewhat distinct from the existing one (i.e. READ
COMMITTED's handling of when an UPDATE affects a concurrently
updated/deleted tuple, which internally uses a mechanism called
EvalPlanQual()), because in READ COMMITTED mode it is no longer
necessary for any row version to be conventionally visible to the
command's MVCC snapshot for an UPDATE of the row to occur (or for the
row to be locked, should the UPDATE's WHERE clause not be satisfied).
---
doc/src/sgml/ddl.sgml | 23 +++
doc/src/sgml/fdwhandler.sgml | 8 +
doc/src/sgml/keywords.sgml | 7 +
doc/src/sgml/mvcc.sgml | 24 +++
doc/src/sgml/plpgsql.sgml | 14 +-
doc/src/sgml/postgres-fdw.sgml | 8 +
doc/src/sgml/protocol.sgml | 13 +-
doc/src/sgml/ref/alter_policy.sgml | 7 +-
doc/src/sgml/ref/create_policy.sgml | 37 +++-
doc/src/sgml/ref/create_rule.sgml | 7 +-
doc/src/sgml/ref/create_table.sgml | 5 +-
doc/src/sgml/ref/create_trigger.sgml | 5 +-
doc/src/sgml/ref/create_view.sgml | 33 ++-
doc/src/sgml/ref/insert.sgml | 373 ++++++++++++++++++++++++++++++++--
doc/src/sgml/ref/set_constraints.sgml | 6 +-
doc/src/sgml/trigger.sgml | 49 ++++-
16 files changed, 568 insertions(+), 51 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 570a003..7b43a10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2428,9 +2428,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
+ There is limited inheritance support for <command>INSERT</command>
+ commands with <literal>ON CONFLICT</> clauses. Tables with
+ children are not generally accepted as targets. One notable
+ exception is that such tables are accepted as targets for
+ <command>INSERT</command> commands with <literal>ON CONFLICT
+ IGNORE</> clauses, provided a unique index inference clause was
+ omitted (which implies that there is no concern about
+ <emphasis>which</> unique index any would-be conflict might arise
+ from). However, tables that happen to be inheritance children are
+ accepted as targets for all variants of <command>INSERT</command>
+ with <literal>ON CONFLICT</>.
+ </para>
+
+ <para>
All check constraints and not-null constraints on a parent table are
automatically inherited by its children. Other types of constraints
(unique, primary key, and foreign key constraints) are not inherited.
+ Therefore, <command>INSERT</command> with <literal>ON CONFLICT</>
+ unique index inference considers only unique constraints/indexes
+ directly associated with the child
+ table.
</para>
<para>
@@ -2515,6 +2533,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
+ <literal>INSERT</literal> with an <literal>ON CONFLICT
+ UPDATE</literal> clause does not support the
+ <literal>ONLY</literal> notation, and so in effect tables with
+ inheritance children are not supported for the <literal>ON
+ CONFLICT</literal> variant.
Commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
typically only work on individual, physical tables and do not
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..0c3dcb5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,14 @@ GetForeignServerByName(const char *name, bool missing_ok);
source provides.
</para>
+ <para>
+ <command>INSERT</> with an <literal>ON CONFLICT</> clause is not supported
+ with a unique index inference specification (this implies that <literal>ON
+ CONFLICT UPDATE</> is never supported, since the specification is
+ mandatory there). When planning an <command>INSERT</>,
+ <function>PlanForeignModify</> should reject these cases.
+ </para>
+
</sect1>
</chapter>
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index b0dfd5f..ea58211 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -854,6 +854,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>CONFLICT</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>CONNECT</token></entry>
<entry></entry>
<entry>reserved</entry>
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0d6867..5e310d7 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -326,6 +326,30 @@
</para>
<para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</> clause is
+ another special case. In Read Committed mode, the implementation will
+ either insert or update each row proposed for insertion, with either one of
+ those two outcomes guaranteed. This is a useful guarantee for many
+ use-cases, but it implies that further liberties must be taken with
+ snapshot isolation. Should a conflict originate in another transaction
+ whose effects are not visible to the <command>INSERT</command>, the
+ <command>UPDATE</command> may affect that row, even though it may be the
+ case that <emphasis>no</> version of that row is conventionally visible to
+ the command. In the same vein, if the secondary search condition of the
+ command (an explicit <literal>WHERE</> clause) is supplied, it is only
+ evaluated on the most recent row version, which is not necessarily the
+ version conventionally visible to the command (if indeed there is a row
+ version conventionally visible to the command at all).
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT IGNORE</> clause may
+ have insertion not proceed for a row due to the outcome of another
+ transaction whose effects are not visible to the <command>INSERT</command>
+ snapshot. Again, this is only the case in Read Committed mode.
+ </para>
+
+ <para>
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands on the same rows it is trying to update, but it
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 69a0885..59a5945 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2607,7 +2607,11 @@ END;
<para>
This example uses exception handling to perform either
- <command>UPDATE</> or <command>INSERT</>, as appropriate:
+ <command>UPDATE</> or <command>INSERT</>, as appropriate. It is
+ recommended that applications use <command>INSERT</> with
+ <literal>ON CONFLICT UPDATE</> rather than actually emulating this
+ pattern. This example serves only to illustrate use of
+ <application>PL/pgSQL</application> control flow structures:
<programlisting>
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
@@ -3771,9 +3775,11 @@ RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
<command>INSERT</> and <command>UPDATE</> operations, the return value
should be <varname>NEW</>, which the trigger function may modify to
support <command>INSERT RETURNING</> and <command>UPDATE RETURNING</>
- (this will also affect the row value passed to any subsequent triggers).
- For <command>DELETE</> operations, the return value should be
- <varname>OLD</>.
+ (this will also affect the row value passed to any subsequent triggers,
+ or passed to a special <varname>EXCLUDED</> alias reference within
+ an <command>INSERT</> statement with an <literal>ON CONFLICT UPDATE</>
+ clause). For <command>DELETE</> operations, the return
+ value should be <varname>OLD</>.
</para>
<para>
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fa39661 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -69,6 +69,14 @@
</para>
<para>
+ Note that <filename>postgres_fdw</> currently lacks support for
+ <command>INSERT</command> statements with an <literal>ON CONFLICT
+ UPDATE</> clause. However, the <literal>ON CONFLICT IGNORE</>
+ clause is supported, provided a unique index inference specification
+ is omitted.
+ </para>
+
+ <para>
It is generally recommended that the columns of a foreign table be declared
with exactly the same data types, and collations if applicable, as the
referenced columns of the remote table. Although <filename>postgres_fdw</>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index efe75ea..a198182 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2998,9 +2998,16 @@ CommandComplete (B)
<literal>INSERT <replaceable>oid</replaceable>
<replaceable>rows</replaceable></literal>, where
<replaceable>rows</replaceable> is the number of rows
- inserted. <replaceable>oid</replaceable> is the object ID
- of the inserted row if <replaceable>rows</replaceable> is 1
- and the target table has OIDs;
+ inserted. However, if and only if <literal>ON CONFLICT
+ UPDATE</> is specified, then the tag is <literal>UPSERT
+ <replaceable>oid</replaceable>
+ <replaceable>rows</replaceable></literal>, where
+ <replaceable>rows</replaceable> is the number of rows inserted
+ <emphasis>or updated</emphasis>.
+ <replaceable>oid</replaceable> is the object ID of the
+ inserted row if <replaceable>rows</replaceable> is 1 and the
+ target table has OIDs, and (for the <literal>UPSERT</literal>
+ tag), the row was actually inserted rather than updated;
otherwise <replaceable>oid</replaceable> is 0.
</para>
diff --git a/doc/src/sgml/ref/alter_policy.sgml b/doc/src/sgml/ref/alter_policy.sgml
index 796035e..86bda92 100644
--- a/doc/src/sgml/ref/alter_policy.sgml
+++ b/doc/src/sgml/ref/alter_policy.sgml
@@ -93,8 +93,11 @@ ALTER POLICY <replaceable class="parameter">name</replaceable> ON <replaceable c
The USING expression for the policy. This expression will be added as a
security-barrier qualification to queries which use the table
automatically. If multiple policies are being applied for a given
- table then they are all combined and added using OR. The USING
- expression applies to records which are being retrieved from the table.
+ table then they are all combined and added using OR (except as noted in
+ the <xref linkend="sql-createpolicy"> documentation for
+ <command>INSERT</command> with <literal> ON CONFLICT UPDATE</literal>).
+ The USING expression applies to records which are being retrieved from the
+ table.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 646b08d..8c15798 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -63,11 +63,12 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
Policies can be applied for specific commands or for specific roles. The
default for newly created policies is that they apply for all commands and
roles, unless otherwise specified. If multiple policies apply to a given
- query, they will be combined using OR. Further, for commands which can have
- both USING and WITH CHECK policies (ALL and UPDATE), if no WITH CHECK policy
- is defined then the USING policy will be used for both what rows are visible
- (normal USING case) and which rows will be allowed to be added (WITH CHECK
- case).
+ query, they will be combined using OR (except as noted for
+ <command>INSERT</command> with <literal> ON CONFLICT UPDATE</literal>).
+ Further, for commands which can have both USING and WITH CHECK policies (ALL
+ and UPDATE), if no WITH CHECK policy is defined then the USING policy will
+ be used for both what rows are visible (normal USING case) and which rows
+ will be allowed to be added (WITH CHECK case).
</para>
<para>
@@ -245,6 +246,19 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
as it only ever applies in cases where records are being added to the
relation.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>INSERT</literal> policy WITH
+ CHECK expression also pass for both any existing tuple in the target
+ table that necessitates taking the <literal>UPDATE</literal> path
+ and the final tuple added back into the relation.
+ <literal>INSERT</literal> policies are separately combined using
+ <literal>OR</literal>, and this distinct set of policy expressions must
+ always pass, regardless of whether any or all <literal>UPDATE</literal>
+ policies also pass (in the same tuple check). However, successfully
+ inserted tuples are not subject to <literal>UPDATE</literal> policy
+ enforcement.
+ </para>
</listitem>
</varlistentry>
@@ -253,7 +267,9 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
<listitem>
<para>
Using <literal>UPDATE</literal> for a policy means that it will apply
- to <literal>UPDATE</literal> commands. As <literal>UPDATE</literal>
+ to <literal>UPDATE</literal> commands (or auxiliary <literal>ON
+ CONFLICT UPDATE</literal> clauses of <literal>INSERT</literal>
+ commands). As <literal>UPDATE</literal>
involves pulling an existing record and then making changes to some
portion (but possibly not all) of the record, the
<literal>UPDATE</literal> policy accepts both a USING expression and
@@ -269,6 +285,15 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
used for both <literal>USING</literal> and
<literal>WITH CHECK</literal> cases.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>UPDATE</literal> policy
+ USING expression always be treated as a WITH CHECK
+ expression. This <literal>UPDATE</literal> policy must
+ always pass, regardless of whether any
+ <literal>INSERT</literal> policy also passes in the same
+ tuple check.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml
index 677766a..34a4ae1 100644
--- a/doc/src/sgml/ref/create_rule.sgml
+++ b/doc/src/sgml/ref/create_rule.sgml
@@ -136,7 +136,12 @@ CREATE [ OR REPLACE ] RULE <replaceable class="parameter">name</replaceable> AS
<para>
The event is one of <literal>SELECT</literal>,
<literal>INSERT</literal>, <literal>UPDATE</literal>, or
- <literal>DELETE</literal>.
+ <literal>DELETE</literal>. Note that an
+ <command>INSERT</command> containing an <literal>ON
+ CONFLICT</literal> clause cannot be used on tables that have
+ either <literal>INSERT</literal> or <literal>UPDATE</literal>
+ rules. Consider using an updatable view instead, which has
+ limited support for <literal>ON CONFLICT IGNORE</literal> only.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 299cce8..a9c1124 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -708,7 +708,10 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<literal>EXCLUDE</>, and
<literal>REFERENCES</> (foreign key) constraints accept this
clause. <literal>NOT NULL</> and <literal>CHECK</> constraints are not
- deferrable.
+ deferrable. Note that constraints that were created with this
+ clause cannot be used as arbiters of whether or not to take the
+ alternative path with an <command>INSERT</command> statement
+ that includes an <literal>ON CONFLICT UPDATE</> clause.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index aae0b41..1b75b1a 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -76,7 +76,10 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
executes once for any given operation, regardless of how many rows
it modifies (in particular, an operation that modifies zero rows
will still result in the execution of any applicable <literal>FOR
- EACH STATEMENT</literal> triggers).
+ EACH STATEMENT</literal> triggers). Note that since
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is considered an <command>INSERT</command> statement, no
+ <command>UPDATE</command> statement level trigger will be fired.
</para>
<para>
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 5dadab1..599c1cb 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -286,8 +286,9 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
<para>
Simple views are automatically updatable: the system will allow
<command>INSERT</>, <command>UPDATE</> and <command>DELETE</> statements
- to be used on the view in the same way as on a regular table. A view is
- automatically updatable if it satisfies all of the following conditions:
+ to be used on the view in the same way as on a regular table (aside from
+ the limitations on ON CONFLICT noted below). A view is automatically
+ updatable if it satisfies all of the following conditions:
<itemizedlist>
<listitem>
@@ -383,6 +384,34 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
not need any permissions on the underlying base relations (see
<xref linkend="rules-privileges">).
</para>
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT</> clause
+ is only supported on updatable views under specific circumstances.
+ If a set of columns/expressions has been provided with which to
+ infer a unique index to consider as the arbiter of whether the
+ statement ultimately takes an alternative path - if a would-be
+ duplicate violation in some particular unique index is tacitly
+ taken as provoking an alternative <command>UPDATE</command> or
+ <literal>IGNORE</> path - then updatable views are not supported.
+ Since this specification is already mandatory for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</>,
+ this implies that only the <literal>ON CONFLICT IGNORE</> variant
+ is supported, and only when there is no such specification. For
+ example:
+ </para>
+ <para>
+<programlisting>
+-- Unsupported:
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'foo') ON CONFLICT (key)
+ UPDATE SET val = EXCLUDED.val;
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'bar') ON CONFLICT (key)
+ IGNORE;
+
+-- Supported (note the omission of "key" column):
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'baz') ON CONFLICT
+ IGNORE;
+</programlisting>
+ </para>
</refsect2>
</refsect1>
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a3cccb9..40b7566 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -24,6 +24,14 @@ PostgreSQL documentation
[ WITH [ RECURSIVE ] <replaceable class="parameter">with_query</replaceable> [, ...] ]
INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) [, ...] | <replaceable class="PARAMETER">query</replaceable> }
+ [ ON CONFLICT [ ( { <replaceable class="parameter">column_name_index</replaceable> | ( <replaceable class="parameter">expression_index</replaceable> ) } [, ...] [ WHERE <replaceable class="PARAMETER">index_condition</replaceable> ] ) ]
+ { IGNORE | UPDATE
+ SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } |
+ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] )
+ } [, ...]
+ [ WHERE <replaceable class="PARAMETER">condition</replaceable> ]
+ }
+ ]
[ RETURNING * | <replaceable class="parameter">output_expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
</synopsis>
</refsynopsisdiv>
@@ -32,9 +40,15 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<title>Description</title>
<para>
- <command>INSERT</command> inserts new rows into a table.
- One can insert one or more rows specified by value expressions,
- or zero or more rows resulting from a query.
+ <command>INSERT</command> inserts new rows into a table. One can
+ insert one or more rows specified by value expressions, or zero or
+ more rows resulting from a query. An alternative path
+ (<literal>IGNORE</literal> or <literal>UPDATE</literal>) can
+ optionally be specified, to be taken in the event of detecting that
+ proceeding with insertion would result in a conflict (i.e. a
+ conflicting tuple already exists). The alternative path is
+ considered individually for each row proposed for insertion, and is
+ taken (or not taken) once per row.
</para>
<para>
@@ -59,25 +73,214 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</para>
<para>
+ The optional <literal>ON CONFLICT</> clause specifies a path to
+ take as an alternative to raising a conflict related error.
+ <literal>ON CONFLICT IGNORE</> simply avoids inserting any
+ individual row when it is determined that a conflict related error
+ would otherwise need to be raised. <literal>ON CONFLICT UPDATE</>
+ has the system take an <command>UPDATE</command> path in respect of
+ such rows instead. <literal>ON CONFLICT UPDATE</> guarantees an
+ atomic <command>INSERT</command> or <command>UPDATE</command>
+ outcome - provided there is no incidental error, one of those two
+ outcomes is guaranteed, even under high concurrency.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> optionally accepts a
+ <literal>WHERE</> clause <replaceable>condition</>. When provided,
+ the statement only proceeds with updating if the
+ <replaceable>condition</> is satisfied. Otherwise, unlike a
+ conventional <command>UPDATE</command>, the row is still locked for
+ update. Note that the <replaceable>condition</> is evaluated last,
+ after a conflict has been identified as a candidate to update.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> is effectively an auxiliary query of
+ its parent <command>INSERT</command>. Two aliases are visible to
+ the auxiliary query only - <varname>TARGET</> and
+ <varname>EXCLUDED</>. The first alias is just a standard alias for
+ the target relation in the context of the auxiliary query, while
+ the second alias refers to rows originally proposed for insertion.
+ Both aliases can be used in the auxiliary query targetlist and
+ <literal>WHERE</> clause. This allows expressions (in particular,
+ assignments) to reference rows originally proposed for insertion.
+ Note that the effects of all per-row <literal>BEFORE INSERT</>
+ triggers are carried forward. This is particularly useful for
+ multi-insert <literal>ON CONFLICT UPDATE</> statements; when
+ inserting or updating multiple rows, constants or parameter values
+ need only appear once.
+ </para>
+
+ <para>
+ There are several restrictions on the <literal>ON CONFLICT
+ UPDATE</> clause that do not apply to <command>UPDATE</command>
+ statements. Subqueries may not appear in either the
+ <command>UPDATE</command> targetlist or its <literal>WHERE</>
+ clause (although simple multi-assignment expressions are
+ supported). <literal>WHERE CURRENT OF</> cannot be used. In
+ general, only columns in the target table, and excluded values
+ originally proposed for insertion may be referenced. Operators and
+ functions may be used freely, though.
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is a <quote>deterministic</quote> statement. This means
+ that the command will not be allowed to affect any single existing
+ row more than once; a cardinality violation error will be raised
+ when this situation arises. Rows proposed for insertion should not
+ duplicate each other in terms of attributes constrained by the
+ conflict-arbitrating unique index. Note that the ordinary rules
+ for unique indexes with regard to null also determine whether an
+ arbitrating unique index indicates that the alternative path
+ should be taken. This means that when a null value appears in
+ any uniquely constrained tuple's attribute in an
+ <command>INSERT</command> statement with <literal>ON CONFLICT
+ UPDATE</literal>, rows proposed for insertion will never take the
+ alternative path (provided that a <literal>BEFORE ROW
+ INSERT</literal> trigger does not make null values non-null before
+ insertion); the statement will always insert, assuming there is no
+ unrelated error. Note that merely locking a row (by having it not
+ satisfy the <literal>WHERE</> clause <replaceable>condition</>)
+ does not count towards whether or not the row has been affected
+ multiple times (and whether or not a cardinality violation error is
+ raised). However, the implementation checks for cardinality
+ violations after locking the row, and before updating (or
+ considering updating). A cardinality violation may therefore be
+ raised for a row that would not otherwise have gone on to be
+ updated, if and only if the existing row was already updated at
+ least once by the same <literal>ON CONFLICT UPDATE</literal>
+ command.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> requires a <emphasis>unique index
+ inference</emphasis> specification, which consists of one or more
+ <replaceable class="PARAMETER">column_name_index</replaceable>
+ columns and/or <replaceable
+ class="PARAMETER">expression_index</replaceable> expressions on
+ columns, appearing between parentheses. These are used to infer a
+ single unique index to limit pre-checking for conflicts to (if no
+ appropriate index is available, an error is raised). A subset of
+ the table to limit the check for conflicts to can optionally also
+ be specified using <replaceable
+ class="PARAMETER">index_condition</replaceable>. Note that any
+ available unique index need only cover at least that subset in
+ order to arbitrate taking the alternative path; it need not
+ match exactly, and so a non-partial unique index that otherwise
+ matches is applicable. <literal>ON CONFLICT IGNORE</> makes an
+ inference specification optional; omitting the specification
+ indicates a total indifference to where any conflict could occur,
+ which isn't always appropriate. At times, it may be desirable for
+ <literal>ON CONFLICT IGNORE</> to <emphasis>not</emphasis> suppress
+ a conflict related error associated with an index where that isn't
+ explicitly anticipated. Note that <literal>ON CONFLICT UPDATE</>
+ assignment may result in a uniqueness violation, just as with a
+ conventional <command>UPDATE</command>.
+ </para>
+
+ <para>
+ Columns and/or expressions appearing in a unique index inference
+ specification must match all the columns/expressions of some
+ existing unique index on <replaceable
+ class="PARAMETER">table_name</replaceable> - there can be no
+ columns/expressions from the unique index that do not appear in the
+ inference specification, nor can there be any columns/expressions
+ appearing in the inference specification that do not appear in the
+ unique index definition. However, the order of the
+ columns/expressions in the index definition, or whether or not the
+ index definition specified <literal>NULLS FIRST</> or
+ <literal>NULLS LAST</>, or the internal sort order of each column
+ (whether <literal>DESC</> or <literal>ASC</> were specified) are
+ all irrelevant. Deferred unique constraints are not supported as
+ arbiters of whether an alternative <literal>ON CONFLICT</> path
+ should be taken.
+ </para>
+
+ <para>
+ The definition of a conflict for the purposes of <literal>ON
+ CONFLICT</> is somewhat subtle, although the exact definition is
+ seldom of great interest. A conflict is either a unique violation
+ from a unique constraint (or unique index), or an exclusion
+ violation from an exclusion constraint. Only unique indexes can be
+ inferred with a unique index inference specification, which is
+ required for the <command>UPDATE</command> variant, so in effect
+ only unique constraints (and unique indexes) are supported by the
+ <command>UPDATE</command> variant. In contrast to the rules around
+ certain other SQL clauses, like the <literal>DISTINCT</literal>
+ clause, the definition of a duplicate (a conflict) is based on
+ whatever unique indexes happen to be defined on columns on the
+ table. This means that if a user-defined type has multiple sort
+ orders, and the "equals" operator of any of those available sort
+ orders happens to be inconsistent (which goes against an unenforced
+ convention of <productname>PostgreSQL</productname>), the exact
+ behavior depends on the choice of operator class when the unique
+ index was created initially, and not any other consideration such
+ as the default operator class for the type of each indexed column.
+ If there are multiple unique indexes available that seem like
+ equally suitable candidates, but with inconsistent definitions of
+ "equals", then the system chooses whatever it estimates to be the
+ cheapest one to use as an arbiter of taking the alternative
+ <command>UPDATE</command>/<literal>IGNORE</literal> path.
+ </para>
+
+ <para>
+ The optional <replaceable
+ class="PARAMETER">index_condition</replaceable> can be used to
+ allow the inference specification to infer that a partial unique
+ index can be used. Any unique index that otherwise satisfies the
+ inference specification, while also covering at least all the rows
+ in the table covered by <replaceable
+ class="PARAMETER">index_condition</replaceable> may be used. It is
+ recommended that the partial index predicate of the unique index
+ intended to be used as the arbiter of taking the alternative path
+ be matched exactly, but this is not required. Note that an error
+ will be raised if an arbiter unique index is chosen that does not
+ cover the tuple or tuples ultimately proposed for insertion.
+ However, an overly specific <replaceable
+ class="PARAMETER">index_condition</replaceable> does not imply that
+ arbitrating conflicts will be limited to the subset of rows covered
+ by the inferred unique index corresponding to <replaceable
+ class="PARAMETER">index_condition</replaceable>.
+ </para>
+
+ <para>
The optional <literal>RETURNING</> clause causes <command>INSERT</>
- to compute and return value(s) based on each row actually inserted.
+ to compute and return value(s) based on each row actually inserted
+ (or updated, if an <literal>ON CONFLICT UPDATE</> clause was used).
This is primarily useful for obtaining values that were supplied by
defaults, such as a serial sequence number. However, any expression
using the table's columns is allowed. The syntax of the
<literal>RETURNING</> list is identical to that of the output list
- of <command>SELECT</>.
+ of <command>SELECT</>. Only rows that were successfully inserted
+ or updated will be returned. If a row was locked but not updated
+ because an <literal>ON CONFLICT UPDATE</> <literal>WHERE</> clause
+ did not pass, the row will not be returned. Since
+ <literal>RETURNING</> is not part of the <command>UPDATE</>
+ auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
+ (<varname>TARGET</> and <varname>EXCLUDED</>) may not be
+ referenced; only the row as it exists after updating (or
+ inserting) is returned.
</para>
<para>
You must have <literal>INSERT</literal> privilege on a table in
- order to insert into it. If a column list is specified, you only
- need <literal>INSERT</literal> privilege on the listed columns.
- Use of the <literal>RETURNING</> clause requires <literal>SELECT</>
- privilege on all columns mentioned in <literal>RETURNING</>.
- If you use the <replaceable
- class="PARAMETER">query</replaceable> clause to insert rows from a
- query, you of course need to have <literal>SELECT</literal> privilege on
- any table or column used in the query.
+ order to insert into it, as well as <literal>UPDATE</literal>
+ privilege if and only if <literal>ON CONFLICT UPDATE</>
+ is specified. If a column list is specified, you only need
+ <literal>INSERT</literal> privilege on the listed columns.
+ Similarly, when <literal>ON CONFLICT UPDATE</> is specified, you
+ only need <literal>UPDATE</> privilege on the column(s) that are
+ listed to be updated, as well as <literal>SELECT</> privilege on any column
+ whose values are read in the <literal>ON CONFLICT UPDATE</>
+ expressions or <replaceable>condition</>. Use of the
+ <literal>RETURNING</> clause requires <literal>SELECT</> privilege
+ on all columns mentioned in <literal>RETURNING</>. If you use the
+ <replaceable class="PARAMETER">query</replaceable> clause to insert
+ rows from a query, you of course need to have
+ <literal>SELECT</literal> privilege on any table or column used in
+ the query.
</para>
</refsect1>
@@ -121,7 +324,54 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
The name of a column in the table named by <replaceable class="PARAMETER">table_name</replaceable>.
The column name can be qualified with a subfield name or array
subscript, if needed. (Inserting into only some fields of a
- composite column leaves the other fields null.)
+ composite column leaves the other fields null.) When
+ referencing a column with <literal>ON CONFLICT UPDATE</>, do not
+ include the table's name in the specification of a target
+ column. For example, <literal>INSERT ... ON CONFLICT UPDATE
+ SET TARGET.col = 1</> is invalid.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name_index</replaceable></term>
+ <listitem>
+ <para>
+ The name of a <replaceable
+ class="PARAMETER">table_name</replaceable> column (with several
+ columns potentially named). These are used to infer a
+ particular unique index defined on <replaceable
+ class="PARAMETER">table_name</replaceable>. This requires
+ <literal>ON CONFLICT UPDATE</> and <literal>ON CONFLICT
+ IGNORE</> to assume that all expected sources of uniqueness
+ violations originate within the columns/rows constrained by the
+ unique index. When this is omitted (which is forbidden with
+ the <literal>ON CONFLICT UPDATE</> variant), the system checks
+ for sources of uniqueness violations ahead of time in all unique
+ indexes. Otherwise, only a single specified unique index is
+ checked ahead of time, and uniqueness violation errors can
+ appear for conflicts originating in any other unique index. If
+ a unique index cannot be inferred, an error is raised.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">expression_index</replaceable></term>
+ <listitem>
+ <para>
+ Equivalent to <replaceable
+ class="PARAMETER">column_name_index</replaceable>, but used to
+ infer a particular expressional index instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">index_condition</replaceable></term>
+ <listitem>
+ <para>
+ Used to allow inference of partial unique indexes.
</para>
</listitem>
</varlistentry>
@@ -167,12 +417,25 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</varlistentry>
<varlistentry>
+ <term><replaceable class="PARAMETER">condition</replaceable></term>
+ <listitem>
+ <para>
+ An expression that returns a value of type <type>boolean</type>.
+ Only rows for which this expression returns <literal>true</>
+ will be updated, although all rows will be locked when the
+ <literal>ON CONFLICT UPDATE</> path is taken.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+
<term><replaceable class="PARAMETER">output_expression</replaceable></term>
<listitem>
<para>
An expression to be computed and returned by the <command>INSERT</>
- command after each row is inserted. The expression can use any
- column names of the table named by <replaceable class="PARAMETER">table_name</replaceable>.
+ command after each row is inserted (or updated). The
+ expression can use any column names of the table named by
+ <replaceable class="PARAMETER">table_name</replaceable>.
Write <literal>*</> to return all columns of the inserted row(s).
</para>
</listitem>
@@ -198,20 +461,29 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<screen>
INSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
</screen>
+ However, in the event of an <literal>ON CONFLICT UPDATE</> clause
+ (but <emphasis>not</emphasis> in the event of an <literal>ON
+ CONFLICT IGNORE</> clause), the command tag reports the number of
+ rows inserted or updated together, of the form
+<screen>
+UPSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
+</screen>
The <replaceable class="parameter">count</replaceable> is the number
of rows inserted. If <replaceable class="parameter">count</replaceable>
is exactly one, and the target table has OIDs, then
<replaceable class="parameter">oid</replaceable> is the
- <acronym>OID</acronym> assigned to the inserted row. Otherwise
- <replaceable class="parameter">oid</replaceable> is zero.
+ <acronym>OID</acronym>
+ assigned to the inserted row (but not if there is only a single
+ updated row). Otherwise <replaceable
+ class="parameter">oid</replaceable> is zero.
</para>
<para>
If the <command>INSERT</> command contains a <literal>RETURNING</>
clause, the result will be similar to that of a <command>SELECT</>
statement containing the columns and values defined in the
- <literal>RETURNING</> list, computed over the row(s) inserted by the
- command.
+ <literal>RETURNING</> list, computed over the row(s) inserted or
+ updated by the command.
</para>
</refsect1>
@@ -311,7 +583,63 @@ WITH upd AS (
RETURNING *
)
INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
-</programlisting></para>
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Assumes a unique
+ index has been defined that constrains values appearing in the
+ <literal>did</literal> column. Note that an <varname>EXCLUDED</>
+ expression is used to reference values originally proposed for
+ insertion:
+<programlisting>
+ INSERT INTO distributors (did, dname)
+ VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc')
+ ON CONFLICT (did) UPDATE SET dname = EXCLUDED.dname
+</programlisting>
+ </para>
+ <para>
+ Insert a distributor, or do nothing for rows proposed for insertion
+ when an existing, excluded row (a row with a matching constrained
+ column or columns after <literal>BEFORE ROW INSERT</> triggers fire) exists.
+ Example assumes a unique index has been defined that constrains
+ values appearing in the <literal>did</literal> column (although
+ since the <literal>IGNORE</> variant was used, the specification of
+ columns to infer a unique index from is not mandatory):
+<programlisting>
+ INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH')
+ ON CONFLICT (did) IGNORE
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Example assumes
+ a unique index has been defined that constrains values appearing in
+ the <literal>did</literal> column. <literal>WHERE</> clause is
+ used to limit the rows actually updated (any existing row not
+ updated will still be locked, though):
+<programlisting>
+ -- Don't update existing distributors based in a certain ZIP code
+ INSERT INTO distributors (did, dname) VALUES (8, 'Anvil Distribution')
+ ON CONFLICT (did) UPDATE
+ SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'
+ WHERE TARGET.zipcode != '21201'
+</programlisting>
+ </para>
+ <para>
+ Insert new distributor if possible; otherwise
+ <literal>IGNORE</literal>. Example assumes a unique index has been
+ defined that constrains values appearing in the
+ <literal>did</literal> column on a subset of rows where the
+ <literal>is_active</literal> boolean column evaluates to
+ <literal>true</literal>:
+<programlisting>
+ -- This statement could infer a partial unique index on did
+ -- with a predicate of WHERE is_active, but it could also
+ -- just use a regular unique constraint on did if that was
+ -- all that was available.
+ INSERT INTO distributors (did, dname) VALUES (9, 'Antwerp Design')
+ ON CONFLICT (did WHERE is_active) IGNORE
+</programlisting>
+ </para>
</refsect1>
<refsect1>
@@ -321,7 +649,8 @@ INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
<command>INSERT</command> conforms to the SQL standard, except that
the <literal>RETURNING</> clause is a
<productname>PostgreSQL</productname> extension, as is the ability
- to use <literal>WITH</> with <command>INSERT</>.
+ to use <literal>WITH</> with <command>INSERT</>, and the ability to
+ specify an alternative path with <literal>ON CONFLICT</>.
Also, the case in
which a column name list is omitted, but not all the columns are
filled from the <literal>VALUES</> clause or <replaceable>query</>,
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
index 7c31871..1e0a2f8 100644
--- a/doc/src/sgml/ref/set_constraints.sgml
+++ b/doc/src/sgml/ref/set_constraints.sgml
@@ -69,7 +69,11 @@ SET CONSTRAINTS { ALL | <replaceable class="parameter">name</replaceable> [, ...
<para>
Currently, only <literal>UNIQUE</>, <literal>PRIMARY KEY</>,
<literal>REFERENCES</> (foreign key), and <literal>EXCLUDE</>
- constraints are affected by this setting.
+ constraints are affected by this setting. Note that constraints
+ that were created with this clause cannot be used as arbiters of
+ whether or not to take the alternative path with an
+ <command>INSERT</command> statement that includes an <literal>ON
+ CONFLICT UPDATE</> clause.
<literal>NOT NULL</> and <literal>CHECK</> constraints are
always checked immediately when a row is inserted or modified
(<emphasis>not</> at the end of the statement).
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f94aea1..5141690 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -40,14 +40,17 @@
On tables and foreign tables, triggers can be defined to execute either
before or after any <command>INSERT</command>, <command>UPDATE</command>,
or <command>DELETE</command> operation, either once per modified row,
- or once per <acronym>SQL</acronym> statement.
- <command>UPDATE</command> triggers can moreover be set to fire only if
- certain columns are mentioned in the <literal>SET</literal> clause of the
- <command>UPDATE</command> statement.
- Triggers can also fire for <command>TRUNCATE</command> statements.
- If a trigger event occurs, the trigger's function is called at the
- appropriate time to handle the event. Foreign tables do not support the
- TRUNCATE statement at all.
+ or once per <acronym>SQL</acronym> statement. If an
+ <command>INSERT</command> contains an <literal>ON CONFLICT UPDATE</>
+ clause, it is possible that the effects of a BEFORE insert trigger and
+ a BEFORE update trigger can both be applied twice, if a reference to
+ an <varname>EXCLUDED</> column appears. <command>UPDATE</command>
+ triggers can moreover be set to fire only if certain columns are
+ mentioned in the <literal>SET</literal> clause of the
+ <command>UPDATE</command> statement. Triggers can also fire for
+ <command>TRUNCATE</command> statements. If a trigger event occurs,
+ the trigger's function is called at the appropriate time to handle the
+ event. Foreign tables do not support the TRUNCATE statement at all.
</para>
<para>
@@ -119,6 +122,36 @@
</para>
<para>
+ If an <command>INSERT</command> contains an <literal>ON CONFLICT
+ UPDATE</> clause, it is possible that the effects of all row-level
+ <literal>BEFORE</> <command>INSERT</command> triggers and all
+ row-level BEFORE <command>UPDATE</command> triggers can both be
+ applied in a way that is apparent from the final state of the updated
+ row, if an <varname>EXCLUDED</> column is referenced. There need not
+ be an <varname>EXCLUDED</> column reference for both sets of BEFORE
+ row-level triggers to execute, though. The possibility of surprising
+ outcomes should be considered when there are both <literal>BEFORE</>
+ <command>INSERT</command> and <literal>BEFORE</>
+ <command>UPDATE</command> row-level triggers that both affect a row
+ being inserted/updated (this can still be problematic when the
+ modifications are more or less equivalent, if they are not also
+ idempotent). Note that statement-level <command>UPDATE</command>
+ triggers are executed when <literal>ON CONFLICT UPDATE</> is
+ specified, regardless of whether or not any rows were affected by
+ the <command>UPDATE</command>. An <command>INSERT</command> with
+ an <literal>ON CONFLICT UPDATE</> clause will execute
+ statement-level <literal>BEFORE</> <command>INSERT</command>
+ triggers first, then statement-level <literal>BEFORE</>
+ <command>UPDATE</command> triggers, followed by statement-level
+ <literal>AFTER</> <command>UPDATE</command> triggers and finally
+ statement-level <literal>AFTER</> <command>INSERT</command>
+ triggers. <literal>ON CONFLICT UPDATE</> is not supported on
+ views (only <literal>ON CONFLICT IGNORE</> is supported on
+ updatable views); therefore, unpredictable interactions with
+ <literal>INSTEAD OF</> triggers are not possible.
+ </para>
+
+ <para>
Trigger functions invoked by per-statement triggers should always
return <symbol>NULL</symbol>. Trigger functions invoked by per-row
triggers can return a table row (a value of
--
1.9.1
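
The per-row rules described in the documentation patch above (null keys never conflict, the WHERE condition is evaluated last, a row that fails it is locked but not updated, and affecting the same existing row twice raises a cardinality violation) can be sketched as a toy model. This is only an illustration in Python over a list of dicts, not the patch's implementation; the names `upsert`, `CardinalityViolation`, `assign` and `where` are hypothetical, and treating rows inserted earlier by the same statement as "affected" is an assumption of this sketch.

```python
class CardinalityViolation(Exception):
    """Raised when one statement would affect the same existing row twice."""


def upsert(table, proposed_rows, key, assign, where=None):
    """Toy model of INSERT ... ON CONFLICT (key) UPDATE over a list of dicts.

    table         -- list of dict rows, modified in place
    proposed_rows -- rows proposed for insertion (the EXCLUDED side)
    key           -- name of the single uniquely constrained column
    assign        -- function (target, excluded) -> dict of new column values
    where         -- optional function (target, excluded) -> bool
    Returns (inserted, updated, locked_only) counts.
    """
    affected = set()  # ids of rows this statement already inserted or updated
    inserted = updated = locked_only = 0
    for excluded in proposed_rows:
        k = excluded.get(key)
        target = None
        if k is not None:
            # A null key never conflicts, as with ordinary unique indexes.
            target = next((r for r in table if r.get(key) == k), None)
        if target is None:
            row = dict(excluded)
            table.append(row)
            affected.add(id(row))  # assumption: inserted rows count as affected
            inserted += 1
            continue
        # The row is "locked" at this point; the cardinality check runs
        # before the WHERE condition is considered.
        if id(target) in affected:
            raise CardinalityViolation(f"row with {key}={k!r} affected twice")
        if where is None or where(target, excluded):
            target.update(assign(target, excluded))
            affected.add(id(target))
            updated += 1
        else:
            locked_only += 1  # locked only; does not count as affected
    return inserted, updated, locked_only


distributors = [{"did": 5, "dname": "Gizmo"}]
counts = upsert(
    distributors,
    [{"did": 5, "dname": "Gizmo transglobal"},
     {"did": 6, "dname": "Associated Computing, inc"}],
    key="did",
    assign=lambda target, excluded: {"dname": excluded["dname"]},
)
print(counts)  # (1, 1, 0)
```

Proposing the same `did` twice in one call raises `CardinalityViolation` in this model, which corresponds to the "deterministic statement" rule in the documentation.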
0007-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch (text/x-patch; charset=US-ASCII)
From 6b413f444cf2f84dda76e8d907d25b433fd9496e Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:16:11 -0700
Subject: [PATCH 7/8] Internal documentation for INSERT ... ON CONFLICT {UPDATE
| IGNORE}
Includes documentation for executor README. A high-level handling of
approach #2 to value locking also appears there, since in contrast with
design #1, that is something that lives in the head of the executor.
---
src/backend/executor/README | 49 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 8afa1e3..0c351c5 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -200,3 +200,52 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
that only the first result row of an SRF counts, because all subsequent
rows will result in attempts to re-update an already updated target row.
This is historical behavior and seems not worth changing.)
+
+Speculative insertion
+---------------------
+
+Speculative insertion is a process that the executor manages for the
+benefit of INSERT...ON CONFLICT UPDATE... . The basic idea is that
+values within AMs (that do not currently exist) are "speculatively
+locked". If a consensus to insert emerges among all unique indexes,
+we proceed with physical index tuple insertion for each unique index
+in turn, releasing value locks as each physical insertion is
+performed. Otherwise, we must UPDATE the existing value (or IGNORE).
+"Value locks" are implemented using special "speculative heap tuples",
+that represent an attempt to lock values (with special handling for
+race conditions).
+
+"Speculative insertion" is prepared to release "value locks" when a
+conflict occurs. This prevents "unprincipled deadlocks". In essence,
+we cannot allow other xacts to wait on our speculatively-inserted
+tuple as if it was a properly inserted tuple. They'd have to wait
+until xact end, which might be too long, while also implying
+"unprincipled deadlocks". We are prepared for conflicts both when
+"value locking", and when row locking.
+
+When we UPDATE, value locks are released before an opportunistic
+attempt at locking a conclusively visible conflicting tuple occurs. If
+this process fails, we retry. We may retry indefinitely. Holding on
+to value locks would serve no practical purpose, since they don't
+prevent many types of conflicts that the UPDATE case must care about,
+and would be actively harmful, since it would result in unprincipled
+deadlocking under high concurrency.
+
+The representation of the UPDATE query tree is as a separate query
+tree, auxiliary to the main INSERT query tree, and its plan is not
+formally a subplan of the parent INSERT's. Rather, the plan's state
+is used selectively by its parent.
+
+Having successfully locked a definitively visible tuple, we update it,
+applying the EvalPlanQual() query execution mechanism to the latest
+(as just determined by an amcanunique AM) conclusively visible, now
+locked tuple. Earlier versions are not evaluated against our qual,
+and we never directly walk the update chain in the event of the tuple
+being deleted/updated (which is conceptually a conflict). The process
+simply restarts without making useful progress in the present
+iteration. It is sometimes necessary to UPDATE a row where no row
+version is visible, so it seems inconsistent to require that earlier
+versions (including a version that may exist that is visible to our
+command's MVCC snapshot) must satisfy the qual just because there
+happened to be a version visible, where otherwise no evaluation would
+occur.
--
1.9.1
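
The retry behaviour described in the README text above (insert if no conflict is found; otherwise give up the value locks, try to lock the conflicting row, and restart from scratch if that row was concurrently updated or deleted) can be sketched roughly as follows. This is a toy Python model under loud assumptions: a plain dict stands in for the unique index plus heap, and `upsert_with_retry`, `merge`, `try_lock_row` and `flaky_lock` are hypothetical names that do not correspond to any function in the patch.

```python
def upsert_with_retry(store, key, value, merge, try_lock_row):
    """Toy retry loop; returns (outcome, number_of_restarts)."""
    restarts = 0
    while True:
        if key not in store:
            # No conflict found: the speculative insertion goes ahead.
            store[key] = value
            return "inserted", restarts
        # Conflict: value locks are given up before the row lock is attempted.
        if try_lock_row(store, key):
            store[key] = merge(store[key], value)
            return "updated", restarts
        # The row was concurrently updated or deleted. The update chain is
        # not walked; the whole process simply restarts.
        restarts += 1


def flaky_lock():
    """Hypothetical lock function whose first attempt loses to a concurrent DELETE."""
    calls = {"n": 0}

    def try_lock_row(store, key):
        calls["n"] += 1
        if calls["n"] == 1:
            del store[key]  # simulate another session deleting the row
            return False
        return True

    return try_lock_row
```

With `flaky_lock()`, a call that starts out conflicting ends up inserting after one restart, which mirrors the point that a failed row lock makes no progress in the current iteration. The real loop may retry indefinitely; the sketch terminates only because the simulated lock eventually succeeds or the row disappears.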
0006-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 840ab27666378d7405b782628be863615e66aafb Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:11:15 -0700
Subject: [PATCH 6/8] Tests for INSERT ... ON CONFLICT {UPDATE | IGNORE}
Add dedicated isolation tests for both UPDATE and IGNORE variants,
illustrating the "MVCC violation" that allows a READ COMMITTED
transaction's UPDATE to succeed in updating a tuple with no version
visible to its command's MVCC snapshot. Add regression tests, which for
the most part are intended to exercise interactions with other features
(e.g. updatable views, inheritance, triggers, RLS).
Add a few general purpose smoke tests too, testing everything from
EXPLAIN output to unique index inference (expression indexes, partial
indexes, etc).
---
contrib/postgres_fdw/expected/postgres_fdw.out | 7 +
contrib/postgres_fdw/sql/postgres_fdw.sql | 3 +
.../isolation/expected/insert-conflict-ignore.out | 23 ++
.../expected/insert-conflict-update-2.out | 23 ++
.../expected/insert-conflict-update-3.out | 26 +++
.../isolation/expected/insert-conflict-update.out | 23 ++
src/test/isolation/isolation_schedule | 4 +
.../isolation/specs/insert-conflict-ignore.spec | 41 ++++
.../isolation/specs/insert-conflict-update-2.spec | 41 ++++
.../isolation/specs/insert-conflict-update-3.spec | 69 ++++++
.../isolation/specs/insert-conflict-update.spec | 40 ++++
src/test/regress/expected/insert_conflict.out | 242 +++++++++++++++++++++
src/test/regress/expected/privileges.out | 7 +-
src/test/regress/expected/rowsecurity.out | 90 ++++++++
src/test/regress/expected/rules.out | 21 ++
src/test/regress/expected/subselect.out | 22 ++
src/test/regress/expected/triggers.out | 102 ++++++++-
src/test/regress/expected/updatable_views.out | 4 +
src/test/regress/expected/update.out | 27 +++
src/test/regress/expected/with.out | 74 +++++++
src/test/regress/input/constraints.source | 5 +
src/test/regress/output/constraints.source | 15 +-
src/test/regress/parallel_schedule | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/insert_conflict.sql | 192 ++++++++++++++++
src/test/regress/sql/privileges.sql | 5 +-
src/test/regress/sql/rowsecurity.sql | 73 +++++++
src/test/regress/sql/rules.sql | 14 ++
src/test/regress/sql/subselect.sql | 14 ++
src/test/regress/sql/triggers.sql | 69 +++++-
src/test/regress/sql/updatable_views.sql | 2 +
src/test/regress/sql/update.sql | 14 ++
src/test/regress/sql/with.sql | 37 ++++
33 files changed, 1323 insertions(+), 8 deletions(-)
create mode 100644 src/test/isolation/expected/insert-conflict-ignore.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-2.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-3.out
create mode 100644 src/test/isolation/expected/insert-conflict-update.out
create mode 100644 src/test/isolation/specs/insert-conflict-ignore.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-2.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-3.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update.spec
create mode 100644 src/test/regress/expected/insert_conflict.out
create mode 100644 src/test/regress/sql/insert_conflict.sql
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 583cce7..5133386 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2327,6 +2327,13 @@ INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
ERROR: duplicate key value violates unique constraint "t1_pkey"
DETAIL: Key ("C 1")=(11) already exists.
CONTEXT: Remote SQL command: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
ERROR: new row for relation "T 1" violates check constraint "c2positive"
DETAIL: Failing row contains (1111, -2, null, null, null, null, ft1 , null).
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83e8fa7..e01d34e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -372,6 +372,9 @@ UPDATE ft2 SET c2 = c2 + 600 WHERE c1 % 10 = 8 AND c1 < 1200 RETURNING *;
ALTER TABLE "S 1"."T 1" ADD CONSTRAINT c2positive CHECK (c2 >= 0);
INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
UPDATE ft1 SET c2 = -c2 WHERE c1 = 1; -- c2positive
diff --git a/src/test/isolation/expected/insert-conflict-ignore.out b/src/test/isolation/expected/insert-conflict-ignore.out
new file mode 100644
index 0000000..e6cc2a1
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-ignore.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: ignore1 ignore2 c1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step c1: COMMIT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore1
+step c2: COMMIT;
+
+starting permutation: ignore1 ignore2 a1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step a1: ABORT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-2.out b/src/test/isolation/expected/insert-conflict-update-2.out
new file mode 100644
index 0000000..6a5ddfe
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-2.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-3.out b/src/test/isolation/expected/insert-conflict-update-3.out
new file mode 100644
index 0000000..29dd8b0
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-3.out
@@ -0,0 +1,26 @@
+Parsed test spec with 2 sessions
+
+starting permutation: update2 insert1 c2 select1surprise c1
+step update2: UPDATE colors SET is_active = true WHERE key = 1;
+step insert1:
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key; <waiting ...>
+step c2: COMMIT;
+step insert1: <... completed>
+key color is_active
+
+1 Red f
+2 Green f
+3 Blue f
+step select1surprise: SELECT * FROM colors ORDER BY key;
+key color is_active
+
+1 Brown t
+2 Green f
+3 Blue f
+step c1: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update.out b/src/test/isolation/expected/insert-conflict-update.out
new file mode 100644
index 0000000..6976124
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index c055a53..50948a2 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -16,6 +16,10 @@ test: fk-deadlock2
test: eval-plan-qual
test: lock-update-delete
test: lock-update-traversal
+test: insert-conflict-ignore
+test: insert-conflict-update
+test: insert-conflict-update-2
+test: insert-conflict-update-3
test: delete-abort-savept
test: delete-abort-savept-2
test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/insert-conflict-ignore.spec b/src/test/isolation/specs/insert-conflict-ignore.spec
new file mode 100644
index 0000000..fde43b3
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-ignore.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT IGNORE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions during INSERT...ON CONFLICT IGNORE.
+#
+# The convention here is that session 1 goes first; session 2 ends up ignoring
+# if session 1 commits, and ends up inserting if session 1 aborts.
+
+setup
+{
+ CREATE TABLE ints (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE ints;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore1" { INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore2" { INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; }
+step "select2" { SELECT * FROM ints; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# Regular case where one session block-waits on another to determine if it
+# should proceed with an insert or ignore.
+permutation "ignore1" "ignore2" "c1" "select2" "c2"
+permutation "ignore1" "ignore2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-2.spec b/src/test/isolation/specs/insert-conflict-update-2.spec
new file mode 100644
index 0000000..3e6e944
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-2.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test shows a plausible scenario in which the user might wish to UPDATE a
+# value that is also constrained by the unique index that is the arbiter of
+# whether the alternative path should be taken.
+
+setup
+{
+ CREATE TABLE upsert (key text not null, payload text);
+ CREATE UNIQUE INDEX ON upsert(lower(key));
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. The user can still usefully UPDATE
+# a column constrained by a unique index, as the example illustrates.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-3.spec b/src/test/isolation/specs/insert-conflict-update-3.spec
new file mode 100644
index 0000000..94ae3df
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-3.spec
@@ -0,0 +1,69 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# Other INSERT...ON CONFLICT UPDATE isolation tests illustrate the "MVCC
+# violation" added to facilitate the feature, whereby a
+# not-visible-to-our-snapshot tuple can be updated by our command all the same.
+# This is generally needed to provide a guarantee of a successful INSERT or
+# UPDATE in READ COMMITTED mode. This MVCC violation is quite distinct from
+# the putative "MVCC violation" that has existed in PostgreSQL for many years,
+# the EvalPlanQual() mechanism, because that mechanism always starts from a
+# tuple that is visible to the command's MVCC snapshot. This test illustrates
+# a slightly distinct user-visible consequence of the same MVCC violation
+# generally associated with INSERT...ON CONFLICT UPDATE. The impact of the
+# MVCC violation goes a little beyond updating MVCC-invisible tuples.
+#
+# With INSERT...ON CONFLICT UPDATE, the UPDATE predicate is only evaluated
+# once, on this conclusively-locked tuple, and not any other version of the
+# same tuple. It is therefore possible (in READ COMMITTED mode) that the
+# predicate "fails to be satisfied" according to the command's MVCC snapshot.
+# It might simply be that there is no row version visible, but it's also
+# possible that there is some row version visible, but only as a version that
+# doesn't satisfy the predicate. If, however, the conclusively-locked version
+# satisfies the predicate, that's good enough, and the tuple is updated. The
+# MVCC-snapshot-visible row version is denied the opportunity to prevent the
+# UPDATE from taking place, because we don't walk the UPDATE chain in the usual
+# way.
+
+setup
+{
+ CREATE TABLE colors (key int4 PRIMARY KEY, color text, is_active boolean);
+ INSERT INTO colors (key, color, is_active) VALUES(1, 'Red', false);
+ INSERT INTO colors (key, color, is_active) VALUES(2, 'Green', false);
+ INSERT INTO colors (key, color, is_active) VALUES(3, 'Blue', false);
+}
+
+teardown
+{
+ DROP TABLE colors;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" {
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key;}
+step "select1surprise" { SELECT * FROM colors ORDER BY key; }
+step "c1" { COMMIT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "update2" { UPDATE colors SET is_active = true WHERE key = 1; }
+step "c2" { COMMIT; }
+
+# Perhaps surprisingly, the session 1 MVCC-snapshot-visible tuple (the tuple
+# with the pre-populated color 'Red') is denied the opportunity to prevent the
+# UPDATE from taking place -- only the conclusively-locked tuple version
+# matters, and so the tuple with key value 1 was updated to 'Brown' (but not
+# the tuple with key value 2, since nothing changed there):
+permutation "update2" "insert1" "c2" "select1surprise" "c1"
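The behavior that insert-conflict-update-3 exercises can be pictured with a small toy model. This is only a sketch in Python, not code from the patch: the `Version` class and `upsert_update_path` function are hypothetical names, and the model assumes nothing beyond what the spec comments state, namely that the UPDATE predicate is evaluated once against the conclusively-locked tuple version and never against the version visible to the command's MVCC snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """One version of a row in the "colors" table (hypothetical model)."""
    color: str
    is_active: bool


def upsert_update_path(snapshot_version, locked_version, new_color, predicate):
    """Toy model of the ON CONFLICT UPDATE path in READ COMMITTED mode.

    The predicate is checked once, against the conclusively-locked version.
    snapshot_version is accepted only to show that it plays no part.
    """
    if predicate(locked_version):
        return Version(new_color, locked_version.is_active)
    return locked_version


def is_active(version):
    # Stands in for "WHERE TARGET.is_active" in the spec.
    return version.is_active


# Key 1: session 2 committed is_active = true after session 1 took its snapshot.
key1 = upsert_update_path(Version("Red", False), Version("Red", True), "Brown", is_active)

# Key 2: nothing changed concurrently, so both versions are the same.
key2 = upsert_update_path(Version("Green", False), Version("Green", False), "Gray", is_active)

print(key1)
print(key2)
```

In this model key 1 ends up as 'Brown' even though the snapshot-visible version fails the predicate, and key 2 is left alone, which is the outcome the "select1surprise" step shows.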
diff --git a/src/test/isolation/specs/insert-conflict-update.spec b/src/test/isolation/specs/insert-conflict-update.spec
new file mode 100644
index 0000000..6529a0c
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update.spec
@@ -0,0 +1,40 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions.
+
+setup
+{
+ CREATE TABLE upsert (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. Notably, this entails updating a
+# tuple while there is no version of that tuple visible to the updating
+# session's snapshot. This is permitted only in READ COMMITTED mode.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
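The two permutations above can be summarized with a toy model. This is a rough sketch in Python under stated assumptions, not the patch's implementation: `session2_upsert` is a hypothetical function, the table is a plain dict of committed rows, and "waiting" is modeled simply by deciding session 2's path after session 1's outcome is known.

```python
def session2_upsert(committed, key, s1_val, s2_val, s1_commits, suffix):
    """Toy model: session 1 has an uncommitted insert of (key, s1_val) and
    session 2 block-waits on it before choosing INSERT or UPDATE.

    committed  -- dict of rows already committed before either session ran
    s1_commits -- True if session 1 commits, False if it aborts
    suffix     -- text session 2 appends on the UPDATE path
    """
    rows = dict(committed)
    if s1_commits:
        # Session 1's row becomes the committed version of the key.
        rows[key] = s1_val
    # Session 2 resumes only once session 1 has ended.
    if key in rows:
        # UPDATE path: the row is updated even though it was never visible
        # to session 2's snapshot (READ COMMITTED only, per the spec).
        rows[key] = rows[key] + suffix
    else:
        # INSERT path: session 1 aborted, so there is no conflict after all.
        rows[key] = s2_val
    return rows


after_commit = session2_upsert({}, 1, "insert1", "insert2", True, " updated by insert2")
after_abort = session2_upsert({}, 1, "insert1", "insert2", False, " updated by insert2")
print(after_commit)
print(after_abort)
```

The two results correspond to the "c1" and "a1" permutations in the expected output: 'insert1 updated by insert2' after a commit, and plain 'insert2' after an abort.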
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
new file mode 100644
index 0000000..bd35585
--- /dev/null
+++ b/src/test/regress/expected/insert_conflict.out
@@ -0,0 +1,242 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+ QUERY PLAN
+---------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+(3 rows)
+
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+ QUERY PLAN
+---------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+ Filter: (fruit <> 'Cawesh'::text)
+(4 rows)
+
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+ QUERY PLAN
+----------------------------------------------------------
+ Insert on insertconflicttest
+ -> Result
+ -> Conflict Update on insertconflicttest
+ Filter: ((excluded.fruit) <> 'Elderberry'::text)
+(4 rows)
+
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+ QUERY PLAN
+--------------------------------------------------
+ [ +
+ { +
+ "Plan": { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Insert", +
+ "Relation Name": "insertconflicttest", +
+ "Alias": "insertconflicttest", +
+ "Arbiter Index": "key_index", +
+ "Plans": [ +
+ { +
+ "Node Type": "Result", +
+ "Parent Relationship": "Member" +
+ }, +
+ { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Conflict Update", +
+ "Parent Relationship": "Member", +
+ "Relation Name": "insertconflicttest",+
+ "Alias": "insertconflicttest", +
+ "Filter": "(fruit <> 'Lime'::text)" +
+ } +
+ ] +
+ } +
+ } +
+ ]
+(1 row)
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+ERROR: ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from
+LINE 1: ...nsert into insertconflicttest values (1, 'Apple') on conflic...
+ ^
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+ERROR: invalid reference to FROM-clause entry for table "insertconflicttest"
+LINE 1: ...(1, 'Apple') on conflict (key) update set fruit = insertconf...
+ ^
+HINT: Perhaps you meant to reference the table alias "excluded".
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index key_index;
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index comp_key_index;
+--
+-- Partial index tests, no inference predicate specified
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_key_index;
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_comp_key_index;
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('Peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+ERROR: duplicate key value violates unique constraint "fruit_index"
+DETAIL: Key (fruit)=(Peach) already exists.
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+drop index key_index;
+drop index fruit_index;
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+ERROR: partial arbiter unique index has predicate that does not cover tuple proposed for insertion
+DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index "partial_key_index".
+HINT: ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire.
+drop index partial_key_index;
+-- Cleanup
+drop table insertconflicttest;
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+create table capitals (
+ state char(2)
+) inherits (cities);
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+-- Tests proper for inheritance:
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+-- Succeeds:
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+-- clean up
+drop table capitals;
+drop table cities;
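The non-partial inference cases in insert_conflict.out are consistent with a simple rule: the columns/expressions named in ON CONFLICT, taken as a set, must equal the set indexed by some unique index (order and repetition do not matter). The sketch below is a toy Python model of that observation only. `infer_arbiters` is a hypothetical name, index elements are compared as plain strings, and partial index predicates (which the tests show also take part in inference) are deliberately left out.

```python
def infer_arbiters(indexes, conflict_spec):
    """Return the names of unique indexes whose element set equals the
    set of columns/expressions given in the ON CONFLICT clause.

    indexes       -- dict mapping index name to a list of indexed elements
    conflict_spec -- list of elements written in the ON CONFLICT clause
    An empty result models "could not infer which unique index to use".
    """
    wanted = frozenset(conflict_spec)
    return [name for name, elems in indexes.items() if frozenset(elems) == wanted]


single = {"key_index": ["key"]}
composite = {"comp_key_index": ["key", "fruit"]}
expression = {"expr_key_index": ["lower(fruit)"]}

print(infer_arbiters(single, ["key", "key", "key"]))   # repetition is harmless
print(infer_arbiters(single, ["key", "fruit"]))        # superset: no match
print(infer_arbiters(composite, ["fruit", "key"]))     # order is irrelevant
print(infer_arbiters(composite, ["key"]))              # subset: no match
print(infer_arbiters(expression, ["upper(fruit)"]))    # different expression
```

Returning a list rather than a single name avoids assuming anything about what happens when more than one index matches.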
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 74b0450..bc44c45 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -269,7 +269,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
ERROR: permission denied for relation atest2
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -367,6 +367,11 @@ UPDATE atest5 SET one = 8; -- fail
ERROR: permission denied for relation atest5
UPDATE atest5 SET three = 5, one = 2; -- fail
ERROR: permission denied for relation atest5
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+ERROR: permission denied for relation atest5
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
+ERROR: permission denied for relation atest5
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
GRANT SELECT (one,two,blue) ON atest6 TO regressuser4;
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 21817d8..07cb54f 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -1179,6 +1179,96 @@ NOTICE: f_leak => yyyyyy
(3 rows)
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------
+ 33 | 22 | 1 | rls_regress_user1 | okay science fiction
+(1 row)
+
+-- UPDATE path will now be taken for same query, so UPDATE targetlist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+SET SESSION AUTHORIZATION rls_regress_user1;
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------
+ 2 | 11 | 2 | rls_regress_user1 | my first novel
+(1 row)
+
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+------------------
+ 78 | 11 | 1 | rls_regress_user1 | some other novel
+(1 row)
+
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------------------
+ 88 | 33 | 1 | rls_regress_user1 | technology book, can only insert
+(1 row)
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d50b103..c634579 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1123,6 +1123,10 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
SELECT * FROM shoelace_obsolete WHERE sl_avail = 0;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
+ERROR: INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
------------+----------+------------+--------+----------+-----------
@@ -2351,6 +2355,23 @@ DETAIL: Key (id3a, id3c)=(1, 13) is not present in table "rule_and_refint_t2".
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
+ERROR: relation "shoelace" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
where (((rule_and_refint_t3.id3a = new.id3a)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index b14410f..9ba3a44 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -639,6 +639,28 @@ from
(0 rows)
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 1: ...conflict (key) update set val = 'unsupported ' || (select f1...
+ ^
+select * from upsert;
+ key | val
+-----+-----
+ 1 | val
+(1 row)
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: on conflict (key) update set val = (select u from aa)
+ ^
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
create temp table outer_7597 (f1 int4, f2 int4);
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index f1a5fde..77dfa06 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -274,7 +274,7 @@ drop sequence ttdummy_seq;
-- tests for per-statement triggers
--
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
BEGIN
@@ -291,6 +291,14 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
--
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
+NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+NOTICE: trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
INSERT INTO main_table DEFAULT VALUES;
@@ -305,6 +313,8 @@ NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, lev
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
@@ -1731,3 +1741,93 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', new.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (1,black)
+WARNING: after insert (new): (1,black)
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (2,red)
+WARNING: before insert (new, modified): (3,"red trig modified")
+WARNING: after insert (new): (3,"red trig modified")
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (3,orange)
+WARNING: before update (old): (3,"red trig modified")
+WARNING: before update (new): (3,"updated red trig modified")
+WARNING: after update (old): (3,"updated red trig modified")
+WARNING: after update (new): (3,"updated red trig modified")
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (4,green)
+WARNING: before insert (new, modified): (5,"green trig modified")
+WARNING: after insert (new): (5,"green trig modified")
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (5,purple)
+WARNING: before update (old): (5,"green trig modified")
+WARNING: before update (new): (5,"updated green trig modified")
+WARNING: after update (old): (5,"updated green trig modified")
+WARNING: after update (new): (5,"updated green trig modified")
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (6,white)
+WARNING: before insert (new, modified): (7,"white trig modified")
+WARNING: after insert (new): (7,"white trig modified")
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (7,pink)
+WARNING: before update (old): (7,"white trig modified")
+WARNING: before update (new): (7,"updated white trig modified")
+WARNING: after update (old): (7,"updated white trig modified")
+WARNING: after update (new): (7,"updated white trig modified")
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (8,yellow)
+WARNING: before insert (new, modified): (9,"yellow trig modified")
+WARNING: after insert (new): (9,"yellow trig modified")
+select * from upsert;
+ key | color
+-----+-----------------------------
+ 1 | black
+ 3 | updated red trig modified
+ 5 | updated green trig modified
+ 7 | updated white trig modified
+ 9 | yellow trig modified
+(5 rows)
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 80c5706..22b5bc1 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -215,6 +215,10 @@ INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
DETAIL: View columns that are not columns of their base relation are not updatable.
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
+ERROR: relation "rw_view15" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index 1de2a86..58714ac 100644
--- a/src/test/regress/expected/update.out
+++ b/src/test/regress/expected/update.out
@@ -147,4 +147,31 @@ SELECT a, b, char_length(c) FROM update_test;
42 | 12 | 10000
(4 rows)
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+-- fail, subqueries in update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a NOT IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE EXISTS(SELECT b FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ALL(SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ANY(SELECT a FROM update_test);
+ ^
DROP TABLE update_test;
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index 06b372b..81d664e 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -1806,6 +1806,80 @@ SELECT * FROM y;
-400
(22 rows)
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+ k | v | a
+---+--------+---
+ 0 | insert | 0
+ 0 | insert | 0
+(2 rows)
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+ k | v
+----+------------------
+ 0 | insert
+ 1 | 1 v, now update
+ 2 | insert
+ 3 | insert
+ 4 | 4 v, now update
+ 5 | insert
+ 6 | insert
+ 7 | 7 v, now update
+ 8 | insert
+ 9 | insert
+ 10 | 10 v, now update
+ 11 | insert
+ 12 | insert
+ 13 | 13 v, now update
+ 14 | insert
+ 15 | insert
+ 16 | 16 v, now update
+(17 rows)
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ...ICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a ...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+DROP TABLE z;
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
INSERT INTO y SELECT generate_series(1, 3);
diff --git a/src/test/regress/input/constraints.source b/src/test/regress/input/constraints.source
index 8ec0054..46bce36 100644
--- a/src/test/regress/input/constraints.source
+++ b/src/test/regress/input/constraints.source
@@ -292,6 +292,11 @@ INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+
SELECT '' AS five, * FROM UNIQUE_TBL;
DROP TABLE UNIQUE_TBL;
diff --git a/src/test/regress/output/constraints.source b/src/test/regress/output/constraints.source
index 0d32a9eab..add3f0c 100644
--- a/src/test/regress/output/constraints.source
+++ b/src/test/regress/output/constraints.source
@@ -421,16 +421,23 @@ INSERT INTO UNIQUE_TBL VALUES (4, 'four');
INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+ERROR: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
+HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
SELECT '' AS five, * FROM UNIQUE_TBL;
- five | i | t
-------+---+-------
+ five | i | t
+------+---+--------------------
| 1 | one
| 2 | two
| 4 | four
- | 5 | one
| | six
| | seven
-(6 rows)
+ | 5 | five-upsert-update
+ | 6 | six-upsert-insert
+(7 rows)
DROP TABLE UNIQUE_TBL;
CREATE TABLE UNIQUE_TBL (i int, t text,
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e0ae2f2..528d3b7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -36,6 +36,7 @@ test: geometry horology regex oidjoins type_sanity opr_sanity
# These four each depend on the previous one
# ----------
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7f762bd..b7c8f53 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -50,6 +50,7 @@ test: oidjoins
test: type_sanity
test: opr_sanity
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql
new file mode 100644
index 0000000..472d4ab
--- /dev/null
+++ b/src/test/regress/sql/insert_conflict.sql
@@ -0,0 +1,192 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+
+drop index key_index;
+
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index comp_key_index;
+
+--
+-- Partial index tests, no inference predicate specified
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index expr_key_index;
+
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+
+drop index expr_comp_key_index;
+
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index key_index;
+drop index fruit_index;
+
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+
+drop index partial_key_index;
+
+-- Cleanup
+drop table insertconflicttest;
+
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+
+create table capitals (
+ state char(2)
+) inherits (cities);
+
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+
+-- Tests proper for inheritance:
+
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+
+-- Succeeds:
+
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+
+-- clean up
+drop table capitals;
+drop table cities;
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index f97a75a..861eac6 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -194,7 +194,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -245,6 +245,9 @@ INSERT INTO atest5 VALUES (5,5,5); -- fail
UPDATE atest5 SET three = 10; -- ok
UPDATE atest5 SET one = 8; -- fail
UPDATE atest5 SET three = 5, one = 2; -- fail
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index ed7adbf..5c660d5 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -436,6 +436,79 @@ DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+-- UPDATE path will now be taken for same query, so UPDATE targetlist now
+-- matters (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+
+SET SESSION AUTHORIZATION rls_regress_user1;
+
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index 1e15f84..7cb5f39 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -680,6 +680,9 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
SELECT * FROM shoelace_candelete;
@@ -844,6 +847,17 @@ insert into rule_and_refint_t3 values (1, 12, 11, 'row3');
insert into rule_and_refint_t3 values (1, 12, 12, 'row4');
insert into rule_and_refint_t3 values (1, 11, 13, 'row5');
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 4be2e40..2be9cb7 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -374,6 +374,20 @@ from
int4_tbl i4 on dummy = i4.f1;
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+
+select * from upsert;
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index 0ea2c31..323ca1a 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -208,7 +208,7 @@ drop sequence ttdummy_seq;
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
5 10
@@ -237,6 +237,12 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
@@ -246,6 +252,9 @@ UPDATE main_table SET a = a + 1 WHERE b < 30;
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
+
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
30 40
@@ -1173,3 +1182,61 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', old.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+
+select * from upsert;
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 60c7e29..48dd9a9 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -69,6 +69,8 @@ DELETE FROM rw_view14 WHERE a=3; -- should be OK
-- Partially updatable view
INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
UPDATE rw_view15 SET upper='ROW 3' WHERE a=3; -- should fail
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
index e71128c..903f3fb 100644
--- a/src/test/regress/sql/update.sql
+++ b/src/test/regress/sql/update.sql
@@ -74,4 +74,18 @@ UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+
+-- fail, update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+
DROP TABLE update_test;
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index c716369..8d49384 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -795,6 +795,43 @@ SELECT * FROM t LIMIT 10;
SELECT * FROM y;
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+
+DROP TABLE z;
+
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
--
1.9.1
Attachment: 0005-RLS-support-for-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From 456255730ab590b0826197856ab1600c443e6259 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 6 Jan 2015 16:32:21 -0800
Subject: [PATCH 5/8] RLS support for ON CONFLICT UPDATE
Row-Level Security policies may apply to UPDATE commands or INSERT
commands only. UPDATE RLS policies can have both USING() security
barrier quals, and CHECK options (INSERT RLS policies may only have
CHECK options, though). It is necessary to carefully consider the
behavior of RLS policies in the context of INSERT with ON CONFLICT
UPDATE, since ON CONFLICT UPDATE is more or less a new top-level
command, conceptually quite different to two separate statements (an
INSERT and an UPDATE).
The approach taken is to "bunch together" both sets of policies, and to
enforce them in 3 different places against three different slots (3
different stages of query processing in the executor).
Note that UPDATE policy USING() barrier quals are always treated as
CHECK options. It is thought that silently failing when USING() barrier
quals are not satisfied is a more surprising outcome, even if it is
closer to the existing behavior of UPDATE statements. This is because
the user's intent to UPDATE one particular row based on simple criteria
is quite clear with ON CONFLICT UPDATE.
The 3 places that RLS policies are enforced are:
* Against row actually inserted, after insertion proceeds successfully
(INSERT-applicable policies only).
* Against row in target table that caused conflict. The implementation
is careful not to leak the contents of that row in diagnostic
messages (INSERT-applicable *and* UPDATE-applicable policies).
* Against the version of the row added to the relation after
ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
policies).
Documentation and tests follow in later commits.
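
The filtering rule behind the three enforcement points can be sketched in isolation. What follows is a toy model only, not code from the patch: ModelCmd, ModelRow, ModelCheck, the two quals and model_check() are hypothetical names that stand in for CmdType, a heap tuple, WithCheckOption and ExecWithCheckOptions(). The quals are loosely modelled on policies p2 and p3 from the rowsecurity tests, assuming that "novel" means cid 11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
typedef enum { MODEL_INSERT, MODEL_UPDATE } ModelCmd;

typedef struct
{
    int  cid;                     /* category id of the row */
    bool author_is_current_user;  /* result of "dauthor = current_user" */
} ModelRow;

typedef struct
{
    ModelCmd cmd;                       /* kind of policy that produced the check */
    bool (*qual)(const ModelRow *row);  /* returns true when the row passes */
} ModelCheck;

/* Example quals (assumption: "novel" is cid 11, as in the test comments). */
bool qual_author(const ModelRow *row) { return row->author_is_current_user; }
bool qual_novel(const ModelRow *row)  { return row->cid == 11; }

/*
 * Toy version of the onlyInsert filter: when only_insert is set, checks that
 * originate from UPDATE policies are skipped, so a freshly inserted row is
 * held to INSERT-applicable checks only.  Returns the index of the first
 * failing check, or -1 when every applicable check passes.
 */
int model_check(const ModelCheck *checks, size_t n,
                const ModelRow *row, bool only_insert)
{
    assert(checks != NULL && row != NULL);

    for (size_t i = 0; i < n; i++)
    {
        if (only_insert && checks[i].cmd == MODEL_UPDATE)
            continue;           /* not enforced on the INSERT path */
        if (!checks[i].qual(row))
            return (int) i;     /* caller would raise an error here */
    }
    return -1;
}
```

In this model, the first enforcement point corresponds to calling model_check() with only_insert set; the other two (existing conflicting row, and post-update row) correspond to calls with it cleared, so a row that is acceptable to insert can still be rejected once the UPDATE path is taken.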
---
src/backend/executor/execMain.c | 25 ++++++---
src/backend/executor/nodeModifyTable.c | 53 ++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/nodes/outfuncs.c | 1 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/rewrite/rewriteHandler.c | 2 +
src/backend/rewrite/rowsecurity.c | 94 +++++++++++++++++++++++++++++-----
src/include/executor/executor.h | 3 +-
src/include/nodes/parsenodes.h | 1 +
10 files changed, 158 insertions(+), 24 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 36251f0..53cecd7 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1709,7 +1709,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
*/
void
ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate)
+ TupleTableSlot *slot, bool detail,
+ bool onlyInsert, EState *estate)
{
Relation rel = resultRelInfo->ri_RelationDesc;
TupleDesc tupdesc = RelationGetDescr(rel);
@@ -1734,6 +1735,15 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
ExprState *wcoExpr = (ExprState *) lfirst(l2);
/*
+ * INSERT ... ON CONFLICT UPDATE callers may require that not all WITH
+ * CHECK OPTIONs associated with resultRelInfo are enforced at all
+ * stages of query processing. (UPDATE-related policies are not
+ * enforced in respect of a successfully inserted tuple).
+ */
+ if (onlyInsert && wco->commandType == CMD_UPDATE)
+ continue;
+
+ /*
* WITH CHECK OPTION checks are intended to ensure that the new tuple
* is visible (in the case of a view) or that it passes the
* 'with-check' policy (in the case of row security).
@@ -1744,16 +1754,17 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
*/
if (!ExecQual((List *) wcoExpr, econtext, false))
{
- char *val_desc;
+ char *val_desc = NULL;
Bitmapset *modifiedCols;
modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
- val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
- slot,
- tupdesc,
- modifiedCols,
- 64);
+ if (detail)
+ val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
+ slot,
+ tupdesc,
+ modifiedCols,
+ 64);
ereport(ERROR,
(errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1603c45..90236ce 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -458,7 +458,8 @@ vlock:
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, spec == SPEC_INSERT,
+ estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -952,7 +953,7 @@ lreplace:;
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, false, estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -1148,6 +1149,54 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+ /*
+ * For RLS with ON CONFLICT UPDATE, security quals are always
+ * treated as WITH CHECK options, even when there were separate
+ * security quals and explicit WITH CHECK options (ordinarily,
+ * security quals are only treated as WITH CHECK options when there
+ * are no explicit WITH CHECK options). Also, CHECK OPTIONs
+ * (originating either explicitly, or implicitly as security quals)
+ * for both UPDATE and INSERT policies (or ALL policies) are
+ * checked (as CHECK OPTIONs) at three different points for three
+ * distinct but related tuples/slots in the context of ON CONFLICT
+ * UPDATE. There are three relevant ExecWithCheckOptions() calls:
+ *
+ * * After successful insertion, within ExecInsert(), against the
+ * inserted tuple. This only includes INSERT-applicable policies.
+ *
+ * * Here, after row locking but before calling ExecUpdate(), on
+ * the existing tuple in the target relation (which we cannot leak
+ * details of). This is conceptually like a security barrier qual
+ * for the purposes of the auxiliary update, although unlike
+ * regular updates that require security barrier quals we prefer to
+ * raise an error (by treating the security barrier quals as CHECK
+ * OPTIONS) rather than silently not affect rows, because the
+ * intent to update seems clear and unambiguous for ON CONFLICT
+ * UPDATE. This includes both INSERT-applicable and
+ * UPDATE-applicable policies.
+ *
+ * * On the final tuple created by the update within ExecUpdate (if
+ * any). This is also subject to INSERT policy enforcement, unlike
+ * conventional ExecUpdate() calls for UPDATE statements -- it
+ * includes both INSERT-applicable and UPDATE-applicable policies.
+ */
+ if (resultRelInfo->ri_WithCheckOptions != NIL)
+ {
+ TupleTableSlot *opts;
+
+ /* Construct temp slot for locked tuple from target */
+ opts = MakeSingleTupleTableSlot(slot->tts_tupleDescriptor);
+ ExecStoreTuple(copyTuple, opts, InvalidBuffer, false);
+
+ /*
+ * Check, but without leaking contents of tuple; user only
+ * supplied one conflicting value or composition of values, and
+ * not the entire tuple.
+ */
+ ExecWithCheckOptions(resultRelInfo, opts, false, false,
+ estate);
+ }
+
if (!TupIsNull(slot))
*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
planSlot, &onConflict->mt_epqstate,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df611d2..5c091e1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2074,6 +2074,7 @@ _copyWithCheckOption(const WithCheckOption *from)
COPY_STRING_FIELD(viewname);
COPY_NODE_FIELD(qual);
+ COPY_SCALAR_FIELD(commandType);
COPY_SCALAR_FIELD(cascaded);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 24e58fa..4057c27 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2384,6 +2384,7 @@ _equalWithCheckOption(const WithCheckOption *a, const WithCheckOption *b)
{
COMPARE_STRING_FIELD(viewname);
COMPARE_NODE_FIELD(qual);
+ COMPARE_SCALAR_FIELD(commandType);
COMPARE_SCALAR_FIELD(cascaded);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 34e9163..d077882 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2336,6 +2336,7 @@ _outWithCheckOption(StringInfo str, const WithCheckOption *node)
WRITE_STRING_FIELD(viewname);
WRITE_NODE_FIELD(qual);
+ WRITE_ENUM_FIELD(commandType, CmdType);
WRITE_BOOL_FIELD(cascaded);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index b471bbf..30b0eca 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -272,6 +272,7 @@ _readWithCheckOption(void)
READ_STRING_FIELD(viewname);
READ_NODE_FIELD(qual);
+ READ_ENUM_FIELD(commandType, CmdType);
READ_BOOL_FIELD(cascaded);
READ_DONE();
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index f37760b..a2cc4f3 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1767,6 +1767,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs, bool forUpdatePushedDown)
List *quals = NIL;
wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->commandType = parsetree->commandType;
quals = lcons(wco->qual, quals);
activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs);
@@ -2935,6 +2936,7 @@ rewriteTargetView(Query *parsetree, Relation view)
wco->viewname = pstrdup(RelationGetRelationName(view));
wco->qual = NULL;
wco->cascaded = cascaded;
+ wco->commandType = viewquery->commandType;
parsetree->withCheckOptions = lcons(wco,
parsetree->withCheckOptions);
diff --git a/src/backend/rewrite/rowsecurity.c b/src/backend/rewrite/rowsecurity.c
index 7669130..09f1ac3 100644
--- a/src/backend/rewrite/rowsecurity.c
+++ b/src/backend/rewrite/rowsecurity.c
@@ -56,12 +56,14 @@
#include "utils/syscache.h"
#include "tcop/utility.h"
-static List *pull_row_security_policies(CmdType cmd, Relation relation,
- Oid user_id);
+static List *pull_row_security_policies(CmdType cmd, bool onConflict,
+ Relation relation, Oid user_id);
static void process_policies(List *policies, int rt_index,
Expr **final_qual,
Expr **final_with_check_qual,
- bool *hassublinks);
+ bool *hassublinks,
+ Expr **spec_with_check_eval,
+ bool onConflict);
static bool check_role_for_policy(ArrayType *policy_roles, Oid user_id);
/*
@@ -88,6 +90,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
Expr *rowsec_with_check_expr = NULL;
Expr *hook_expr = NULL;
Expr *hook_with_check_expr = NULL;
+ Expr *hook_spec_with_check_expr = NULL;
List *rowsec_policies;
List *hook_policies = NIL;
@@ -149,8 +152,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Grab the built-in policies which should be applied to this relation. */
rel = heap_open(rte->relid, NoLock);
- rowsec_policies = pull_row_security_policies(root->commandType, rel,
- user_id);
+ rowsec_policies = pull_row_security_policies(root->commandType,
+ root->specClause == SPEC_INSERT,
+ rel, user_id);
/*
* Check if this is only the default-deny policy.
@@ -168,7 +172,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Now that we have our policies, build the expressions from them. */
process_policies(rowsec_policies, rt_index, &rowsec_expr,
- &rowsec_with_check_expr, &hassublinks);
+ &rowsec_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
/*
* Also, allow extensions to add their own policies.
@@ -198,7 +204,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Build the expression from any policies returned. */
process_policies(hook_policies, rt_index, &hook_expr,
- &hook_with_check_expr, &hassublinks);
+ &hook_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
}
/*
@@ -230,6 +238,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) rowsec_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
@@ -244,6 +253,23 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) hook_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
+ root->withCheckOptions = lcons(wco, root->withCheckOptions);
+ }
+
+ /*
+ * Also add the expression, if any, returned from the extension that
+ * applies to auxiliary UPDATE within ON CONFLICT UPDATE.
+ */
+ if (hook_spec_with_check_expr)
+ {
+ WithCheckOption *wco;
+
+ wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->viewname = RelationGetRelationName(rel);
+ wco->qual = (Node *) hook_spec_with_check_expr;
+ wco->cascaded = false;
+ wco->commandType = CMD_UPDATE;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
}
@@ -288,7 +314,8 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
*
*/
static List *
-pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
+pull_row_security_policies(CmdType cmd, bool onConflict, Relation relation,
+ Oid user_id)
{
List *policies = NIL;
ListCell *item;
@@ -322,7 +349,9 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
if (policy->polcmd == ACL_INSERT_CHR
&& check_role_for_policy(policy->roles, user_id))
policies = lcons(policy, policies);
- break;
+ if (!onConflict)
+ break;
+ /* FALL THRU */
case CMD_UPDATE:
if (policy->polcmd == ACL_UPDATE_CHR
&& check_role_for_policy(policy->roles, user_id))
@@ -384,26 +413,41 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
*/
static void
process_policies(List *policies, int rt_index, Expr **qual_eval,
- Expr **with_check_eval, bool *hassublinks)
+ Expr **with_check_eval, bool *hassublinks,
+ Expr **spec_with_check_eval, bool onConflict)
{
ListCell *item;
List *quals = NIL;
List *with_check_quals = NIL;
+ List *conflict_update_quals = NIL;
/*
* Extract the USING and WITH CHECK quals from each of the policies
- * and add them to our lists.
+ * and add them to our lists. CONFLICT UPDATE quals are always treated
+ * as CHECK OPTIONS.
*/
foreach(item, policies)
{
RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(item);
if (policy->qual != NULL)
- quals = lcons(copyObject(policy->qual), quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ quals = lcons(copyObject(policy->qual), quals);
+ else
+ conflict_update_quals = lcons(copyObject(policy->qual), conflict_update_quals);
+ }
if (policy->with_check_qual != NULL)
- with_check_quals = lcons(copyObject(policy->with_check_qual),
- with_check_quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ with_check_quals = lcons(copyObject(policy->with_check_qual),
+ with_check_quals);
+ else
+ conflict_update_quals =
+ lcons(copyObject(policy->with_check_qual),
+ conflict_update_quals);
+ }
if (policy->hassublinks)
*hassublinks = true;
@@ -420,6 +464,10 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
/*
* If we end up with only USING quals, then use those as
* WITH CHECK quals also.
+ *
+ * For the INSERT with ON CONFLICT UPDATE case, we always enforce that the
+ * UPDATE's USING quals are treated like WITH CHECK quals, enforced against
+ * the target relation's tuple in multiple places.
*/
if (with_check_quals == NIL)
with_check_quals = copyObject(quals);
@@ -453,6 +501,24 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
else
*with_check_eval = (Expr*) linitial(with_check_quals);
+ /*
+ * For INSERT with ON CONFLICT UPDATE, *both* sets of WITH CHECK options
+ * (from any INSERT policy and any UPDATE policy) are enforced.
+ *
+ * These are handled separately because enforcement of each type of WITH
+ * CHECK option is based on the point in query processing of INSERT ... ON
+ * CONFLICT UPDATE. The INSERT path does not enforce UPDATE related CHECK
+ * OPTIONs.
+ */
+ if (conflict_update_quals != NIL)
+ {
+ if (list_length(conflict_update_quals) > 1)
+ *spec_with_check_eval = makeBoolExpr(AND_EXPR,
+ conflict_update_quals, -1);
+ else
+ *spec_with_check_eval = (Expr*) linitial(conflict_update_quals);
+ }
+
return;
}
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 9400801..6c535da 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -195,7 +195,8 @@ extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
extern void ExecConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
extern void ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate);
+ TupleTableSlot *slot, bool detail, bool onlyInsert,
+ EState *estate);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti);
extern ExecAuxRowMark *ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist);
extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 9ae3bb5..6447f45 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -868,6 +868,7 @@ typedef struct WithCheckOption
NodeTag type;
char *viewname; /* name of view that specified the WCO */
Node *qual; /* constraint qual to check */
+ CmdType commandType; /* select|insert|update|delete */
bool cascaded; /* true = WITH CASCADED CHECK OPTION */
} WithCheckOption;
--
1.9.1
Attachment: 0004-Project-updates-from-ON-CONFLICT-UPDATE-RETURNING.patch (text/x-patch; charset=US-ASCII)
From c40c1dfb555a65d25dcab12409ebef9b903f6e78 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 21 Nov 2014 16:59:54 -0800
Subject: [PATCH 4/8] Project updates from ON CONFLICT UPDATE RETURNING
This establishes that an INSERT with an ON CONFLICT UPDATE clause
processes all slots that are ultimately affected, regardless of whether
or not the alternative ON CONFLICT UPDATE path was taken. However, if
an ON CONFLICT UPDATE's WHERE clause is not satisfied in respect of some
slot/tuple, the post-update tuple is not projected (although the row is
still locked, just as before).
Also, for ON CONFLICT UPDATE variant INSERTs (but not ON CONFLICT IGNORE
variant INSERTs), the number of rows affected in total is reported by
the command tag using the new "UPSERT" command identifier, which
otherwise matches the format of the existing "INSERT" command tag.
There is no precedent for a top level command that uses a different
command tag identifier according to whether or not some clause was used,
but doing so seems appropriate, since client programs are expected to
have an interest in whether or not some number of rows projected by
RETURNING may have been updated, and in any case indicating that the
rows were affected by an "INSERT" when they may not have been inserted
is simply misleading. However, there is still no principled method for
client programs to distinguish between INSERT ... ON CONFLICT UPDATE
projected tuples generated by being inserted or by being updated. This
is thought not to matter, since the use of INSERT with ON CONFLICT
UPDATE indicates that either outcome is equivalent.
---
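As an illustration of the command tag behavior described in the commit message above, here is a minimal, self-contained sketch in C. It is not the patch's code: `TAG_BUFSIZE`, `format_insert_tag()` and `parse_insert_tag()` are hypothetical names invented for this example. It only shows the idea that the server picks "UPSERT" or "INSERT" as the tag identifier while keeping the same "<tag> <oid> <rows>" layout, and that a client could read either form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical buffer size; stands in for COMPLETION_TAG_BUFSIZE. */
#define TAG_BUFSIZE 64

/*
 * Server-side sketch: choose the tag identifier according to whether the
 * ON CONFLICT UPDATE variant was used.  The "<tag> <oid> <rows>" layout
 * is the same in both cases.
 */
static void
format_insert_tag(char *buf, size_t buflen, bool on_conflict_update,
				  unsigned int last_oid, unsigned int processed)
{
	assert(buf != NULL && buflen > 0);
	snprintf(buf, buflen, "%s %u %u",
			 on_conflict_update ? "UPSERT" : "INSERT",
			 last_oid, processed);
}

/*
 * Client-side sketch: accept either tag.  *maybe_updated is set when the
 * tag says that some of the affected rows may have been updated rather
 * than inserted.  Returns false for any other command tag.
 */
static bool
parse_insert_tag(const char *tag, bool *maybe_updated, unsigned int *rows)
{
	unsigned int oid;

	if (sscanf(tag, "UPSERT %u %u", &oid, rows) == 2)
	{
		*maybe_updated = true;
		return true;
	}
	if (sscanf(tag, "INSERT %u %u", &oid, rows) == 2)
	{
		*maybe_updated = false;
		return true;
	}
	return false;
}
```

With this sketch, formatting with on_conflict_update set and three processed rows yields "UPSERT 0 3". Parsing that string reports three rows and sets *maybe_updated, but it cannot say which of the three rows were updated, which matches the limitation noted in the commit message.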
src/backend/executor/nodeModifyTable.c | 30 ++++++++++++++++++++++--------
src/backend/tcop/pquery.c | 16 +++++++++++++---
src/bin/psql/common.c | 5 ++++-
3 files changed, 39 insertions(+), 12 deletions(-)
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 05c78c9..1603c45 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -59,7 +59,9 @@ static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
TupleTableSlot *planSlot,
TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
- EState *estate);
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning);
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -413,6 +415,8 @@ vlock:
if (conflict)
{
+ TupleTableSlot *returning = NULL;
+
/*
* Lock and consider updating in the SPEC_INSERT case. For the
* SPEC_IGNORE case, it's still necessary to verify that the tuple
@@ -423,12 +427,20 @@ vlock:
planSlot,
slot,
onConflict,
- estate))
+ estate,
+ canSetTag,
+ &returning))
goto vlock;
else if (spec == SPEC_IGNORE)
ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
- return NULL;
+ /*
+ * RETURNING may have been processed already -- ExecUpdate()
+ * projects it when the target ResultRelInfo indicates that this
+ * is required. Inserted and updated tuples are projected alike
+ * for ON CONFLICT UPDATE with RETURNING.
+ */
+ return returning;
}
}
@@ -967,7 +979,9 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
TupleTableSlot *planSlot,
TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
- EState *estate)
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning)
{
Relation relation = resultRelInfo->ri_RelationDesc;
HeapTupleData tuple;
@@ -1135,9 +1149,9 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
if (!TupIsNull(slot))
- ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot,
- &onConflict->mt_epqstate, onConflict->ps.state,
- false);
+ *returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
+ planSlot, &onConflict->mt_epqstate,
+ onConflict->ps.state, canSetTag);
ReleaseBuffer(buffer);
@@ -1149,7 +1163,7 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
/* must provide our own instrumentation support */
if (onConflict->ps.instrument)
- InstrStopNode(onConflict->ps.instrument, 0);
+ InstrStopNode(onConflict->ps.instrument, *returning ? 1:0);
return true;
case HeapTupleUpdated:
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 9c14e8a..41c4191 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -189,7 +189,8 @@ ProcessQuery(PlannedStmt *plan,
*/
if (completionTag)
{
- Oid lastOid;
+ Oid lastOid;
+ ModifyTableState *pstate;
switch (queryDesc->operation)
{
@@ -198,12 +199,16 @@ ProcessQuery(PlannedStmt *plan,
"SELECT %u", queryDesc->estate->es_processed);
break;
case CMD_INSERT:
+ pstate = (ModifyTableState *) queryDesc->planstate;
+ Assert(IsA(pstate, ModifyTableState));
+
if (queryDesc->estate->es_processed == 1)
lastOid = queryDesc->estate->es_lastoid;
else
lastOid = InvalidOid;
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
- "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
+ "%s %u %u", pstate->spec == SPEC_INSERT? "UPSERT":"INSERT",
+ lastOid, queryDesc->estate->es_processed);
break;
case CMD_UPDATE:
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
@@ -1356,7 +1361,10 @@ PortalRunMulti(Portal portal, bool isTopLevel,
* 0" here because technically there is no query of the matching tag type,
* and printing a non-zero count for a different query type seems wrong,
* e.g. an INSERT that does an UPDATE instead should not print "0 1" if
- * one row was updated. See QueryRewrite(), step 3, for details.
+ * one row was updated (unless the ON CONFLICT UPDATE, or "UPSERT" variant
+ * of INSERT was used to update the row, where it's logically a direct
+ * effect of the top level command). See QueryRewrite(), step 3, for
+ * details.
*/
if (completionTag && completionTag[0] == '\0')
{
@@ -1366,6 +1374,8 @@ PortalRunMulti(Portal portal, bool isTopLevel,
sprintf(completionTag, "SELECT 0 0");
else if (strcmp(completionTag, "INSERT") == 0)
strcpy(completionTag, "INSERT 0 0");
+ else if (strcmp(completionTag, "UPSERT") == 0)
+ strcpy(completionTag, "UPSERT 0 0");
else if (strcmp(completionTag, "UPDATE") == 0)
strcpy(completionTag, "UPDATE 0");
else if (strcmp(completionTag, "DELETE") == 0)
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 275bdcc..9302e41 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -894,9 +894,12 @@ PrintQueryResults(PGresult *results)
success = StoreQueryTuple(results);
else
success = PrintQueryTuples(results);
- /* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
+ /*
+ * if it's INSERT/UPSERT/UPDATE/DELETE RETURNING, also print status
+ */
cmdstatus = PQcmdStatus(results);
if (strncmp(cmdstatus, "INSERT", 6) == 0 ||
+ strncmp(cmdstatus, "UPSERT", 6) == 0 ||
strncmp(cmdstatus, "UPDATE", 6) == 0 ||
strncmp(cmdstatus, "DELETE", 6) == 0)
PrintQueryStatus(results);
--
1.9.1
Attachment: 0003-EXCLUDED-expressions-within-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From 664c19774b6a4a74a21e053928473e173414aa4d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Thu, 18 Sep 2014 19:08:27 -0700
Subject: [PATCH 3/8] EXCLUDED expressions within ON CONFLICT UPDATE
EXCLUDED.* (which previously appeared as EXCLUDED(), and CONFLICTING()
before that) is an "internal" primnode expression which enables
referencing of rejected-for-insertion tuples within both the targetlist
and predicate of the UPDATE portion of an INSERT ... ON CONFLICT UPDATE
query. The expression is invoked using an alias-like syntax (more on
how this works later). The fact that a dedicated expression is used
(rather than a dedicated range table entry involved in query
optimization) is an implementation detail.
This additional support is particularly useful for ON CONFLICT queries
that propose multiple tuples for insertion, since it isn't otherwise
possible to succinctly decide which actual values to update each column
with (in the event of taking the update path in respect of a given
slot).
The effects of BEFORE INSERT row triggers on the slot/tuple proposed for
insertion are carried. This seems logical, since it might be the case
that the rejected values would not have been rejected had some BEFORE
INSERT trigger been disabled. On the other hand, the potential hazards
around equivalent modifications occurring when both INSERT and UPDATE
BEFORE triggers are fired for the same slot/tuple should be considered
by client applications. It's possible to imagine a use case in which
this behavior is surprising and undesirable -- essentially the same
non-idempotent modification may occur twice. (It might also be the case
that BEFORE trigger related side-effects undesirably occur twice, but
writing BEFORE triggers with external side-effects is already considered
a questionable practice for several reasons (consider commit 6868ed74),
and besides, the implementation cannot reasonably prevent this, as noted
in nodeModifyTable.c comments added by the main ON CONFLICT commit).
In this revision, the raw grammar does not generate an ExcludedExpr.
Parse analysis of ON CONFLICT UPDATE is made to add a new relation RTE
to the auxiliary sub_pstate parser state (an alias for the target).
This makes parse analysis build a query tree that is more or less
consistent with there actually being an EXCLUDED relation. Then, as
part of query rewrite, immediately after normalizing the UPDATE
targetlist, Vars referencing the pseudo-relation (using the EXCLUDED
alias) are replaced with ExcludedExpr that references Vars in the target
relation itself.
Speculative insertion/the executor arranges to rig the Vars and
UPDATE-related/EPQ scan planstate's expression context such that values
will actually originate from the rejected tuple's slot (driven, as
always for the UPDATE's execution, by the parent INSERT ModifyTable
node, changed once per slot proposed for insertion as appropriate).
This whole mechanism is somewhat similar to the handling of trigger WHEN
clauses, where a similar dance must also occur within the executor.
Note that pg_stat_statements does not fingerprint ExcludedExpr, because
it cannot appear in the post-parse-analysis, pre-rewrite Query tree.
(pg_stat_statements does not fingerprint every primnode anyway, mostly
because some are only expected in utility statements). Other existing
Node handling sites that don't expect to see primnodes that appear only
after rewriting (ExcludedExpr may be in its own subcategory here in that
it is the only such non-utility related Node) do not have an
ExcludedExpr case added either.
---
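The rewrite-then-evaluate mechanism described in the commit message above can be sketched with a toy expression tree. This is a simplified model under stated assumptions, not PostgreSQL code: every `Toy*` and `toy_*` name is hypothetical, tuples are plain int arrays, and `TOY_INNER_VAR` is an arbitrary marker standing in for INNER_VAR. It shows three steps: Vars that reference the EXCLUDED alias are wrapped in a dedicated node and pointed at the target, the wrapped Var is forced to the inner slot at initialization, and evaluation reads the rejected tuple that the parent installed there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical toy stand-ins, not PostgreSQL definitions. */
enum { TOY_EXCLUDED_VARNO = 1, TOY_TARGET_VARNO = 2, TOY_INNER_VAR = -1 };

typedef enum { TOY_VAR, TOY_EXCLUDED, TOY_ADD } ToyTag;

typedef struct ToyNode
{
	ToyTag		tag;
	int			varno;		/* TOY_VAR only */
	int			attno;		/* TOY_VAR only: 0-based column index */
	struct ToyNode *left;	/* TOY_EXCLUDED: wrapped Var; TOY_ADD: operand */
	struct ToyNode *right;	/* TOY_ADD only */
} ToyNode;

typedef struct
{
	const int  *scantuple;	/* existing (conflicting) row */
	const int  *innertuple;	/* row rejected for insertion */
} ToyContext;

/* Rewrite step: wrap Vars on the EXCLUDED alias, pointing them at the target. */
static ToyNode *
toy_rewrite(ToyNode *node)
{
	if (node == NULL || node->tag == TOY_EXCLUDED)
		return node;
	if (node->tag == TOY_VAR)
	{
		ToyNode    *wrapper;

		if (node->varno != TOY_EXCLUDED_VARNO)
			return node;
		wrapper = calloc(1, sizeof(ToyNode));
		assert(wrapper != NULL);
		node->varno = TOY_TARGET_VARNO;
		wrapper->tag = TOY_EXCLUDED;
		wrapper->left = node;
		return wrapper;
	}
	node->left = toy_rewrite(node->left);
	node->right = toy_rewrite(node->right);
	return node;
}

/* Initialization step: force each wrapped Var to read from the inner slot. */
static void
toy_init(ToyNode *node)
{
	if (node == NULL || node->tag == TOY_VAR)
		return;
	if (node->tag == TOY_EXCLUDED)
	{
		node->left->varno = TOY_INNER_VAR;
		return;
	}
	toy_init(node->left);
	toy_init(node->right);
}

/* Evaluation step: the wrapper just evaluates its nested Var. */
static int
toy_eval(const ToyNode *node, const ToyContext *cxt)
{
	switch (node->tag)
	{
		case TOY_VAR:
			return (node->varno == TOY_INNER_VAR ?
					cxt->innertuple : cxt->scantuple)[node->attno];
		case TOY_EXCLUDED:
			return toy_eval(node->left, cxt);
		case TOY_ADD:
			return toy_eval(node->left, cxt) + toy_eval(node->right, cxt);
	}
	return 0;
}
```

In this model, an expression such as target.col + excluded.col is rewritten once and initialized once. The caller then swaps `innertuple` for each rejected row before evaluating, which mirrors how the parent INSERT node is described as installing each excluded slot in turn.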
src/backend/executor/execQual.c | 54 +++++++++++++++++++++
src/backend/executor/nodeModifyTable.c | 32 ++++++++++++
src/backend/nodes/copyfuncs.c | 16 ++++++
src/backend/nodes/equalfuncs.c | 11 +++++
src/backend/nodes/nodeFuncs.c | 38 +++++++++++++++
src/backend/nodes/outfuncs.c | 11 +++++
src/backend/nodes/readfuncs.c | 15 ++++++
src/backend/optimizer/plan/setrefs.c | 6 +++
src/backend/parser/analyze.c | 22 ++++++++-
src/backend/rewrite/rewriteHandler.c | 89 ++++++++++++++++++++++++++++++++++
src/backend/utils/adt/ruleutils.c | 39 +++++++++++++++
src/include/nodes/execnodes.h | 10 ++++
src/include/nodes/nodes.h | 2 +
src/include/nodes/primnodes.h | 47 ++++++++++++++++++
14 files changed, 391 insertions(+), 1 deletion(-)
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 0e7400f..57d726e 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -182,6 +182,9 @@ static Datum ExecEvalArrayCoerceExpr(ArrayCoerceExprState *astate,
bool *isNull, ExprDoneCond *isDone);
static Datum ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
bool *isNull, ExprDoneCond *isDone);
+static Datum ExecEvalExcluded(ExcludedExprState *excludedExpr,
+ ExprContext *econtext, bool *isNull,
+ ExprDoneCond *isDone);
/* ----------------------------------------------------------------
@@ -4338,6 +4341,33 @@ ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
return 0; /* keep compiler quiet */
}
+/* ----------------------------------------------------------------
+ * ExecEvalExcluded
+ * ----------------------------------------------------------------
+ */
+static Datum
+ExecEvalExcluded(ExcludedExprState *excludedExpr, ExprContext *econtext,
+ bool *isNull, ExprDoneCond *isDone)
+{
+ /*
+ * ExcludedExpr is essentially an expression that adapts its single Var
+ * argument to refer to the expression context inner slot's tuple, which is
+ * reserved for the purpose of referencing EXCLUDED.* tuples within ON
+ * CONFLICT UPDATE auxiliary queries' EPQ expression context (ON CONFLICT
+ * UPDATE makes special use of the EvalPlanQual() mechanism to update).
+ *
+ * nodeModifyTable.c assigns its own table slot in the auxiliary queries'
+ * EPQ expression state (originating in the parent INSERT node) on the
+ * assumption that it may only be used by ExcludedExpr, and on the
+ * assumption that the inner slot is not otherwise useful. This occurs in
+ * advance of the expression evaluation for UPDATE (which calls here are
+ * part of) once per slot proposed for insertion, and works because of
+ * restrictions on the structure of ON CONFLICT UPDATE auxiliary queries.
+ *
+ * Just evaluate nested Var.
+ */
+ return ExecEvalScalarVar(excludedExpr->arg, econtext, isNull, isDone);
+}
/*
* ExecEvalExprSwitchContext
@@ -5065,6 +5095,30 @@ ExecInitExpr(Expr *node, PlanState *parent)
state = (ExprState *) makeNode(ExprState);
state->evalfunc = ExecEvalCurrentOfExpr;
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExprState *cstate = makeNode(ExcludedExprState);
+ Var *contained = (Var*) excludedexpr->arg;
+
+ /*
+ * varno forced to INNER_VAR -- see remarks within
+ * ExecLockUpdateTuple().
+ *
+ * We rely on the assumption that the only place that
+ * ExcludedExpr may appear is where EXCLUDED Var references
+ * originally appeared after parse analysis. The rewriter
+ * replaces these with ExcludedExpr that reference the
+ * corresponding Var within the ON CONFLICT UPDATE target RTE.
+ */
+ Assert(IsA(contained, Var));
+
+ contained->varno = INNER_VAR;
+ cstate->arg = ExecInitExpr((Expr *) contained, parent);
+ state = (ExprState *) cstate;
+ state->evalfunc = (ExprStateEvalFunc) ExecEvalExcluded;
+ }
+ break;
case T_TargetEntry:
{
TargetEntry *tle = (TargetEntry *) node;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index d03604c..05c78c9 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -57,6 +57,7 @@
static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
ItemPointer conflictTid,
TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
EState *estate);
@@ -420,6 +421,7 @@ vlock:
if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
&conflictTid,
planSlot,
+ slot,
onConflict,
estate))
goto vlock;
@@ -963,6 +965,7 @@ static bool
ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
ItemPointer conflictTid,
TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
ModifyTableState *onConflict,
EState *estate)
{
@@ -973,6 +976,7 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
HTSU_Result test;
Buffer buffer;
TupleTableSlot *slot;
+ ExprContext *econtext;
/*
* XXX We don't have the TID of the conflicting tuple if the index
@@ -1094,12 +1098,40 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
/*
+ * Save EPQ expression context. Auxiliary plan's scan node (which
+ * would have been just initialized by EvalPlanQualBegin() on the
+ * first time through here per query) cannot fail to provide one.
+ */
+ econtext = onConflict->mt_epqstate.planstate->ps_ExprContext;
+
+ /*
* UPDATE affects the same ResultRelation as INSERT in the context
* of ON CONFLICT UPDATE, so parent's target rti is used
*/
EvalPlanQualSetTuple(&onConflict->mt_epqstate,
resultRelInfo->ri_RangeTableIndex, copyTuple);
+ /*
+ * Make available rejected tuple for referencing within UPDATE
+ * expression (that is, make available a slot with the rejected
+ * tuple, possibly already modified by BEFORE INSERT row triggers).
+ *
+ * This is for the benefit of any ExcludedExpr that may appear
+ * within UPDATE's targetlist or WHERE clause. The EXCLUDED tuple
+ * may be referenced as an ExcludedExpr, which exists purely for our
+ * benefit. The nested ExcludedExpr's Var will necessarily have an
+ * INNER_VAR varno on the assumption that the inner slot of the EPQ
+ * scan plan state's expression context will contain the EXCLUDED
+ * heaptuple slot (that is, on the assumption that during
+ * expression evaluation, the ecxt_innertuple will be assigned the
+ * insertSlot by this codepath, in advance of expression
+ * evaluation).
+ *
+ * See handling of ExcludedExpr within rewriteHandler.c and
+ * execQual.c.
+ */
+ econtext->ecxt_innertuple = insertSlot;
+
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
if (!TupIsNull(slot))
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6c1a7f1..df611d2 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1779,6 +1779,19 @@ _copyCurrentOfExpr(const CurrentOfExpr *from)
}
/*
+ * _copyExcludedExpr
+ */
+static ExcludedExpr *
+_copyExcludedExpr(const ExcludedExpr *from)
+{
+ ExcludedExpr *newnode = makeNode(ExcludedExpr);
+
+ COPY_NODE_FIELD(arg);
+
+ return newnode;
+}
+
+/*
* _copyTargetEntry
*/
static TargetEntry *
@@ -4287,6 +4300,9 @@ copyObject(const void *from)
case T_CurrentOfExpr:
retval = _copyCurrentOfExpr(from);
break;
+ case T_ExcludedExpr:
+ retval = _copyExcludedExpr(from);
+ break;
case T_TargetEntry:
retval = _copyTargetEntry(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 4127269..24e58fa 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -681,6 +681,14 @@ _equalCurrentOfExpr(const CurrentOfExpr *a, const CurrentOfExpr *b)
}
static bool
+_equalExcludedExpr(const ExcludedExpr *a, const ExcludedExpr *b)
+{
+ COMPARE_NODE_FIELD(arg);
+
+ return true;
+}
+
+static bool
_equalTargetEntry(const TargetEntry *a, const TargetEntry *b)
{
COMPARE_NODE_FIELD(expr);
@@ -2720,6 +2728,9 @@ equal(const void *a, const void *b)
case T_CurrentOfExpr:
retval = _equalCurrentOfExpr(a, b);
break;
+ case T_ExcludedExpr:
+ retval = _equalExcludedExpr(a, b);
+ break;
case T_TargetEntry:
retval = _equalTargetEntry(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 4107cc9..a9e1e13 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -235,6 +235,13 @@ exprType(const Node *expr)
case T_CurrentOfExpr:
type = BOOLOID;
break;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ type = exprType((Node *) n->arg);
+ }
+ break;
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -469,6 +476,12 @@ exprTypmod(const Node *expr)
return ((const CoerceToDomainValue *) expr)->typeMod;
case T_SetToDefault:
return ((const SetToDefault *) expr)->typeMod;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ return ((const Var *) n->arg)->vartypmod;
+ }
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
default:
@@ -894,6 +907,9 @@ exprCollation(const Node *expr)
case T_CurrentOfExpr:
coll = InvalidOid; /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ coll = exprCollation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -1089,6 +1105,12 @@ exprSetCollation(Node *expr, Oid collation)
case T_CurrentOfExpr:
Assert(!OidIsValid(collation)); /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ {
+ Var *v = (Var *) ((ExcludedExpr *) expr)->arg;
+ v->varcollid = collation;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
break;
@@ -1487,6 +1509,10 @@ exprLocation(const Node *expr)
/* just use argument's location */
loc = exprLocation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_ExcludedExpr:
+ /* just use nested expr's location */
+ loc = exprLocation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
@@ -1916,6 +1942,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_ExcludedExpr:
+ return walker(((ExcludedExpr *) node)->arg, context);
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -2632,6 +2660,16 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExpr *newnode;
+
+ FLATCOPY(newnode, excludedexpr, ExcludedExpr);
+ MUTATE(newnode->arg, newnode->arg, Node *);
+ return (Node *) newnode;
+ }
+ break;
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index a32fbaa..34e9163 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1429,6 +1429,14 @@ _outCurrentOfExpr(StringInfo str, const CurrentOfExpr *node)
}
static void
+_outExcludedExpr(StringInfo str, const ExcludedExpr *node)
+{
+ WRITE_NODE_TYPE("EXCLUDED");
+
+ WRITE_NODE_FIELD(arg);
+}
+
+static void
_outTargetEntry(StringInfo str, const TargetEntry *node)
{
WRITE_NODE_TYPE("TARGETENTRY");
@@ -3069,6 +3077,9 @@ _outNode(StringInfo str, const void *obj)
case T_CurrentOfExpr:
_outCurrentOfExpr(str, obj);
break;
+ case T_ExcludedExpr:
+ _outExcludedExpr(str, obj);
+ break;
case T_TargetEntry:
_outTargetEntry(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 9f6570f..b471bbf 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1132,6 +1132,19 @@ _readCurrentOfExpr(void)
}
/*
+ * _readExcludedExpr
+ */
+static ExcludedExpr *
+_readExcludedExpr(void)
+{
+ READ_LOCALS(ExcludedExpr);
+
+ READ_NODE_FIELD(arg);
+
+ READ_DONE();
+}
+
+/*
* _readTargetEntry
*/
static TargetEntry *
@@ -1396,6 +1409,8 @@ parseNodeString(void)
return_value = _readSetToDefault();
else if (MATCH("CURRENTOFEXPR", 13))
return_value = _readCurrentOfExpr();
+ else if (MATCH("EXCLUDED", 8))
+ return_value = _readExcludedExpr();
else if (MATCH("TARGETENTRY", 11))
return_value = _readTargetEntry();
else if (MATCH("RANGETBLREF", 11))
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 3368173..9e73d6c 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -792,6 +792,12 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
}
else
{
+ /*
+ * Decrement rtoffset, to compensate for dummy RTE left by
+ * EXCLUDED.* alias. Auxiliary plan will have same
+ * resultRelation from flattened RTE as its parent.
+ */
+ rtoffset -= PRS2_OLD_VARNO;
splan->onConflictPlan = (Plan *) set_plan_refs(root,
(Plan *) splan->onConflictPlan,
rtoffset);
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index caaa44c..e0ec207 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -779,7 +779,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
UpdateStmt *pupd;
Query *dqry;
ParseState *sub_pstate = make_parsestate(pstate);
- RangeTblEntry *subTarget;
+ RangeTblEntry *subTarget, *exclRte;
pupd = (UpdateStmt *) stmt->confClause->updatequery;
@@ -788,6 +788,26 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* Assign same target relation as parent InsertStmt */
pupd->relation = stmt->relation;
+ pupd->relation->alias = makeAlias("target", NIL);
+
+ /*
+ * Create EXCLUDED alias for target relation. This can be used to
+ * reference the tuple originally proposed for insertion from
+ * within the ON CONFLICT UPDATE auxiliary query.
+ *
+ * NOTE: 'EXCLUDED' will always have a varno equal to 1 (at least
+ * until rewriting, where the RTE is effectively discarded).
+ */
+ exclRte = addRangeTableEntryForRelation(sub_pstate,
+ pstate->p_target_relation,
+ makeAlias("excluded", NIL),
+ false, false);
+
+ /*
+ * Add RTE. Vars referencing the alias are rewritten to reference
+ * "target", nested within an ExcludedExpr.
+ */
+ addRTEtoQuery(sub_pstate, exclRte, false, true, true);
/*
* The optimizer is not prepared to accept a subquery RTE for a
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index a076625..f37760b 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -43,6 +43,12 @@ typedef struct acquireLocksOnSubLinks_context
bool for_execute; /* AcquireRewriteLocks' forExecute param */
} acquireLocksOnSubLinks_context;
+typedef struct excluded_replace_context
+{
+ int varno; /* varno of EXCLUDED.* Vars */
+ int rvarno; /* replace varno */
+} excluded_replace_context;
+
static bool acquireLocksOnSubLinks(Node *node,
acquireLocksOnSubLinks_context *context);
static Query *rewriteRuleAction(Query *parsetree,
@@ -71,6 +77,10 @@ static Query *fireRIRrules(Query *parsetree, List *activeRIRs,
bool forUpdatePushedDown);
static bool view_has_instead_trigger(Relation view, CmdType event);
static Bitmapset *adjust_view_column_set(Bitmapset *cols, List *targetlist);
+static Node *excluded_replace_vars(Node *expr,
+ excluded_replace_context *context);
+static Node *excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context);
/*
@@ -3104,6 +3114,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
if (parsetree->specClause == SPEC_INSERT)
{
Query *qry;
+ excluded_replace_context context;
/*
* While user-defined rules will never be applied in the
@@ -3112,6 +3123,35 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
qry = (Query *) parsetree->onConflict;
rewriteTargetListIU(qry, rt_entry_relation, NULL);
+
+ /*
+ * Replace OLD Vars (associated with the EXCLUDED.* alias) with
+ * first (and only) "real" relation RTE in rtable. This allows
+ * the implementation to treat EXCLUDED.* as an alias for the
+ * target relation, which is useful during parse analysis,
+ * while ultimately having those references rewritten as
+ * special ExcludedExpr references to the corresponding Var in
+ * the target RTE.
+ *
+ * This is necessary because while we want a join-like syntax
+ * for aesthetic reasons, the resemblance is superficial. In
+ * fact, execution of the ModifyTable node (and its direct
+ * child auxiliary query) manages tupleslot state directly, and
+ * is directly tasked with making available the appropriate
+ * tupleslot to the expression context.
+ *
+ * This is a kludge, but appears necessary, since the slot made
+ * available for referencing via ExcludedExpr is in fact the
+ * slot just excluded from insertion by speculative insertion
+ * (with the effects of BEFORE ROW INSERT triggers carried).
+ * An ad-hoc method for making the excluded tuple available
+ * within the auxiliary expression context is appropriate.
+ */
+ context.varno = PRS2_OLD_VARNO;
+ context.rvarno = PRS2_OLD_VARNO + 1;
+
+ parsetree->onConflict =
+ excluded_replace_vars(parsetree->onConflict, &context);
}
}
else if (event == CMD_UPDATE)
@@ -3434,3 +3474,52 @@ QueryRewrite(Query *parsetree)
return results;
}
+
+/*
+ * Apply EXCLUDED.* variable replacement throughout an expression tree
+ *
+ * Returns modified tree: varno Vars get rvarno, nested in an ExcludedExpr.
+ */
+static Node *
+excluded_replace_vars(Node *expr, excluded_replace_context *context)
+{
+ /*
+ * Don't recurse into subqueries; they're forbidden in auxiliary ON
+ * CONFLICT query
+ */
+ return replace_rte_variables(expr,
+ context->varno, 0,
+ excluded_replace_vars_callback,
+ (void *) context,
+ NULL);
+}
+
+static Node *
+excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context)
+{
+ excluded_replace_context *rcon = (excluded_replace_context *) context->callback_arg;
+ ExcludedExpr *n = makeNode(ExcludedExpr);
+
+ /* Replace with an enclosing ExcludedExpr */
+ var->varno = rcon->rvarno;
+ n->arg = (Node *) var;
+
+ /*
+ * Would have to adjust varlevelsup if referenced item is from higher query
+ * (should not happen)
+ */
+ Assert(var->varlevelsup == 0);
+
+ if (var->varattno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference system column using EXCLUDED.* alias")));
+
+ if (var->varattno == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference whole-row using EXCLUDED.* alias")));
+
+ return (Node*) n;
+}
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index c1d860c..04235e2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5645,6 +5645,24 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
return NULL;
}
+ else if (var->varno == INNER_VAR)
+ {
+ /* Assume an EXCLUDED variable */
+ rte = rt_fetch(PRS2_OLD_VARNO, dpns->rtable);
+
+ /*
+ * Sanity check: EXCLUDED.* Vars should only appear in auxiliary ON
+ * CONFLICT UPDATE queries. Assert that rte and planstate are
+ * consistent with that.
+ */
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(IsA(dpns->planstate, SeqScanState) ||
+ IsA(dpns->planstate, ResultState));
+
+ refname = "excluded";
+ colinfo = deparse_columns_fetch(PRS2_OLD_VARNO, dpns);
+ attnum = var->varattno;
+ }
else
{
elog(ERROR, "bogus varno: %d", var->varno);
@@ -6385,6 +6403,7 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags)
case T_CoerceToDomainValue:
case T_SetToDefault:
case T_CurrentOfExpr:
+ case T_ExcludedExpr:
/* single words: always simple */
return true;
@@ -7610,6 +7629,26 @@ get_rule_expr(Node *node, deparse_context *context,
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ Var *variable = (Var *) excludedexpr->arg;
+ bool save_varprefix;
+
+ /*
+ * Force parentheses because our caller probably assumed our
+ * Var is a simple expression.
+ */
+ appendStringInfoChar(buf, '(');
+ save_varprefix = context->varprefix;
+ /* Ensure EXCLUDED.* prefix is always visible */
+ context->varprefix = true;
+ get_rule_expr((Node *) variable, context, true);
+ context->varprefix = save_varprefix;
+ appendStringInfoChar(buf, ')');
+ }
+ break;
+
case T_List:
{
char *sep;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 19b5e29..0274ebc 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -973,6 +973,16 @@ typedef struct DomainConstraintState
ExprState *check_expr; /* for CHECK, a boolean expression */
} DomainConstraintState;
+/* ----------------
+ * ExcludedExprState node
+ * ----------------
+ */
+typedef struct ExcludedExprState
+{
+ ExprState xprstate;
+ ExprState *arg; /* the argument */
+} ExcludedExprState;
+
/* ----------------------------------------------------------------
* Executor State Trees
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index cac6b15..ca568a2 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -168,6 +168,7 @@ typedef enum NodeTag
T_CoerceToDomainValue,
T_SetToDefault,
T_CurrentOfExpr,
+ T_ExcludedExpr,
T_TargetEntry,
T_RangeTblRef,
T_JoinExpr,
@@ -207,6 +208,7 @@ typedef enum NodeTag
T_NullTestState,
T_CoerceToDomainState,
T_DomainConstraintState,
+ T_ExcludedExprState,
/*
* TAGS FOR PLANNER NODES (relation.h)
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1d06f42..21c39dc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1147,6 +1147,53 @@ typedef struct CurrentOfExpr
int cursor_param; /* refcursor parameter number, or 0 */
} CurrentOfExpr;
+/*
+ * ExcludedExpr - an EXCLUDED.* expression
+ *
+ * During parse analysis of ON CONFLICT UPDATE auxiliary queries, a dummy
+ * EXCLUDED range table entry is generated, which is actually just an alias for
+ * the target relation. This is useful during parse analysis, allowing the
+ * parser to produce simple error messages, for example. There is the
+ * appearance of a join within the auxiliary ON CONFLICT UPDATE, superficially
+ * similar to a join in an UPDATE ... FROM; this is a limited, ad-hoc join
+ * though, as the executor needs to tightly control the referenced tuple/slot
+ * through which update evaluation references excluded values originally
+ * proposed for insertion. Note that EXCLUDED.* values carry forward the
+ * effects of BEFORE ROW INSERT triggers.
+ *
+ * To implement a limited "join" for ON CONFLICT UPDATE auxiliary queries,
+ * during the rewrite stage, Vars referencing the alias EXCLUDED.* RTE are
+ * swapped with ExcludedExprs, which also contain Vars; their Vars are
+ * equivalent, but reference the target instead. The ExcludedExpr Var actually
+ * evaluates against varno INNER_VAR during expression evaluation (and not a
+ * varno INDEX_VAR associated with an entry in the flattened range table
+ * representing the target, which is necessarily being scanned whenever an
+ * ExcludedExpr is evaluated) while still being logically associated with the
+ * target. The Var is only rigged to reference the inner slot during
+ * ExcludedExpr initialization. The executor closely controls the evaluation
+ * expression, installing the EXCLUDED slot actually excluded from insertion
+ * into the inner slot of the child/auxiliary evaluation context in an ad-hoc
+ * fashion, which, after ExcludedExpr initialization, is expected (i.e. it is
+ * expected during ExcludedExpr evaluation that the parent insert will make
+ * each excluded tuple available in the inner slot in turn). ExcludedExprs are
+ * only ever evaluated during special speculative insertion related EPQ
+ * expression evaluation, purely for the benefit of auxiliary UPDATE
+ * expressions.
+ *
+ * Aside from representing a logical choke point for this special expression
+ * evaluation, having a dedicated primnode also prevents the optimizer from
+ * considering various optimizations that might otherwise be attempted.
+ * Obviously there is no useful join optimization possible within the auxiliary
+ * query, and an ExcludedExpr based post-rewrite query tree representation is a
+ * convenient way of preventing that, as well as related inapplicable
+ * optimizations concerning the equivalence of Vars.
+ */
+typedef struct ExcludedExpr
+{
+ Expr xpr;
+ Node *arg; /* argument (Var) */
+} ExcludedExpr;
+
/*--------------------
* TargetEntry -
* a target entry (used in query target lists)
--
1.9.1
Attachment: 0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 7f97b836b595c532bdd2eb4520ac4dc5459ca146 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:01:32 -0700
Subject: [PATCH 2/8] Support INSERT ... ON CONFLICT {UPDATE | IGNORE}
This non-standard INSERT clause allows DML statement authors to specify
an alternative path to take whenever a tuple proposed for insertion
duplicates an existing tuple in terms of a value or set of values
constrained by a unique index. The statement may either IGNORE the
proposed tuple without raising an error, or UPDATE the existing tuple
whose value is duplicated by the tuple proposed for insertion.
The implementation loops until either an insert or an UPDATE/IGNORE
occurs. No existing tuple may be affected more than once per INSERT.
This is implemented using a new infrastructure called "speculative
insertion". (The approach to "Value locking" presented here follows
design #2, as described on the value locking Postgres Wiki page).
Alternatively, we may go to UPDATE, using the EvalPlanQual() mechanism
to execute a special auxiliary plan.
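
The retry loop described above can be pictured with a toy, single-process
model. This is only a sketch under stated assumptions, not code from the
patch: every name here (ToyTable, toy_upsert, the race hook, and so on) is
hypothetical, the "table" is a plain array with one unique integer key, and
the concurrent session is simulated by a callback that fires once between
the pre-check and the speculative insert. It does not model locking, MVCC
visibility, or the rule that no existing tuple may be affected more than
once per INSERT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 16

/* All types and functions below are hypothetical, for illustration only. */
typedef enum { TOY_IGNORE, TOY_UPDATE } ToyConflictAction;
typedef enum { TOY_INSERTED, TOY_UPDATED, TOY_IGNORED } ToyResult;

typedef struct { int key; int val; bool live; } ToyTuple;
typedef struct { ToyTuple tuples[TOY_CAPACITY]; size_t ntuples; } ToyTable;

/* Stand-in for a concurrent session; invoked at most once per upsert. */
typedef void (*ToyRaceHook)(ToyTable *t);

/* Return the index of a live tuple with 'key' other than 'self', or -1. */
static int
toy_find_conflict(const ToyTable *t, int key, int self)
{
	for (size_t i = 0; i < t->ntuples; i++)
		if ((int) i != self && t->tuples[i].live && t->tuples[i].key == key)
			return (int) i;
	return -1;
}

/* Append a live tuple and return its index. */
static int
toy_append(ToyTable *t, int key, int val)
{
	assert(t->ntuples < TOY_CAPACITY);
	t->tuples[t->ntuples] = (ToyTuple) {key, val, true};
	return (int) t->ntuples++;
}

/* Example hook: another "session" inserts key 7 at the worst moment. */
static void
toy_concurrent_insert_key7(ToyTable *t)
{
	toy_append(t, 7, 99);
}

/* Loop until either an insert or an UPDATE/IGNORE occurs. */
static ToyResult
toy_upsert(ToyTable *t, int key, int val,
		   ToyConflictAction action, ToyRaceHook race)
{
	for (;;)
	{
		/* Pre-check: is there already a conflicting tuple? */
		int			conflict = toy_find_conflict(t, key, -1);

		if (conflict >= 0)
		{
			if (action == TOY_IGNORE)
				return TOY_IGNORED;
			t->tuples[conflict].val = val;
			return TOY_UPDATED;
		}

		/* A concurrent insert may slip in after the pre-check. */
		if (race)
		{
			race(t);
			race = NULL;
		}

		/* Speculatively insert, then verify that we won the race. */
		int			self = toy_append(t, key, val);

		if (toy_find_conflict(t, key, self) < 0)
			return TOY_INSERTED;

		/* Lost the race: kill our own speculative tuple and retry. */
		t->tuples[self].live = false;
	}
}
```

Calling toy_upsert() twice with the same key and TOY_UPDATE inserts on the
first call and updates on the second. With the race hook installed, the
first pass through the loop kills its own tuple and the second pass takes
the UPDATE path against the tuple the simulated session inserted.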
At the READ COMMITTED isolation level, the statement is permitted to
UPDATE a tuple even where no version is visible to the command's MVCC
snapshot. Similarly, any query predicate associated with the UPDATE
portion of the new statement
need only satisfy an already locked, conclusively committed and visible
conflict tuple. When the predicate isn't satisfied, the tuple is still
locked, which implies that at READ COMMITTED, a tuple may be locked
without any version being visible to the command's MVCC snapshot.
Users specify a single unique index to take the alternative path on,
which is inferred from a set of user-supplied column names (or
expressions). This is mandatory for the ON CONFLICT UPDATE variant,
which should address concerns about spuriously taking an incorrect
alternative ON CONFLICT path (i.e. the wrong unique index is used for
arbitration of whether or not to take the alternative path) due to there
being more than one would-be unique violation. Previous revisions of
the patch didn't mandate this. However, we may still IGNORE based on
the first would-be unique violation detected, on the assumption that it
doesn't particularly matter where it originated from for that variant
(iff the user didn't make a point of indicating his or her intent).
The auxiliary ModifyTable plan used by the UPDATE portion of the new
statement is not formally a subplan of its parent INSERT ModifyTable
plan. Rather, it's an independently planned subquery, whose execution
is tightly driven by its parent. Special auxiliary state pertaining to
the auxiliary UPDATE is tracked by its parent through all stages of
query execution.
The implementation imposes some restrictions on child auxiliary UPDATE
plans, which make the plans comport with their parent to the extent
required during the executor stage. One user-visible consequence of
this is that the special auxiliary UPDATE query cannot have subselects
within its targetlist or WHERE clause. UPDATEs may not reference any
other table, and UPDATE FROM is disallowed. INSERT's RETURNING clause
projects tuples successfully inserted (in a later commit, it is made to
project tuples inserted and updated, though).
---
contrib/pg_stat_statements/pg_stat_statements.c | 5 +
contrib/postgres_fdw/deparse.c | 7 +-
contrib/postgres_fdw/postgres_fdw.c | 16 +-
contrib/postgres_fdw/postgres_fdw.h | 2 +-
src/backend/access/heap/heapam.c | 97 ++++-
src/backend/access/nbtree/nbtinsert.c | 32 +-
src/backend/catalog/index.c | 52 ++-
src/backend/commands/constraint.c | 7 +-
src/backend/commands/copy.c | 5 +-
src/backend/commands/explain.c | 87 ++++-
src/backend/executor/execMain.c | 14 +-
src/backend/executor/execUtils.c | 244 +++++++++++--
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeModifyTable.c | 453 +++++++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 39 ++
src/backend/nodes/equalfuncs.c | 32 ++
src/backend/nodes/nodeFuncs.c | 36 ++
src/backend/nodes/outfuncs.c | 7 +
src/backend/nodes/readfuncs.c | 4 +
src/backend/optimizer/path/indxpath.c | 56 +++
src/backend/optimizer/path/tidpath.c | 8 +-
src/backend/optimizer/plan/createplan.c | 16 +-
src/backend/optimizer/plan/planner.c | 50 +++
src/backend/optimizer/plan/setrefs.c | 25 +-
src/backend/optimizer/plan/subselect.c | 6 +
src/backend/optimizer/util/plancat.c | 222 +++++++++++-
src/backend/parser/analyze.c | 100 +++++-
src/backend/parser/gram.y | 74 +++-
src/backend/parser/parse_agg.c | 7 +
src/backend/parser/parse_clause.c | 163 +++++++++
src/backend/parser/parse_expr.c | 3 +
src/backend/rewrite/rewriteHandler.c | 38 +-
src/backend/storage/ipc/procarray.c | 96 +++++
src/backend/storage/lmgr/lmgr.c | 68 ++++
src/backend/utils/adt/lockfuncs.c | 1 +
src/backend/utils/time/tqual.c | 45 +++
src/include/access/heapam.h | 3 +-
src/include/access/heapam_xlog.h | 2 +
src/include/executor/executor.h | 19 +-
src/include/nodes/execnodes.h | 9 +
src/include/nodes/nodes.h | 14 +
src/include/nodes/parsenodes.h | 38 +-
src/include/nodes/plannodes.h | 3 +
src/include/optimizer/paths.h | 1 +
src/include/optimizer/plancat.h | 2 +
src/include/optimizer/planmain.h | 3 +-
src/include/parser/kwlist.h | 2 +
src/include/parser/parse_clause.h | 2 +
src/include/parser/parse_node.h | 1 +
src/include/storage/lmgr.h | 5 +
src/include/storage/lock.h | 10 +
src/include/storage/proc.h | 10 +
src/include/storage/procarray.h | 7 +
src/include/utils/snapshot.h | 11 +
54 files changed, 2134 insertions(+), 134 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 95616b3..414ec83 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2198,6 +2198,11 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
JumbleRangeTable(jstate, query->rtable);
JumbleExpr(jstate, (Node *) query->jointree);
JumbleExpr(jstate, (Node *) query->targetList);
+ APP_JUMB(query->specClause);
+ JumbleExpr(jstate, (Node *) query->arbiterExpr);
+ JumbleExpr(jstate, query->arbiterWhere);
+ if (query->onConflict)
+ JumbleQuery(jstate, (Query *) query->onConflict);
JumbleExpr(jstate, (Node *) query->returningList);
JumbleExpr(jstate, (Node *) query->groupClause);
JumbleExpr(jstate, query->havingQual);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 59cb053..ca51586 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -847,8 +847,8 @@ appendWhereClause(StringInfo buf,
void
deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
- List **retrieved_attrs)
+ List *targetAttrs, bool ignore,
+ List *returningList, List **retrieved_attrs)
{
AttrNumber pindex;
bool first;
@@ -892,6 +892,9 @@ deparseInsertSql(StringInfo buf, PlannerInfo *root,
else
appendStringInfoString(buf, " DEFAULT VALUES");
+ if (ignore)
+ appendStringInfoString(buf, " ON CONFLICT IGNORE");
+
deparseReturningList(buf, root, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_insert_after_row,
returningList, retrieved_attrs);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..1539899 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1167,6 +1167,7 @@ postgresPlanForeignModify(PlannerInfo *root,
List *targetAttrs = NIL;
List *returningList = NIL;
List *retrieved_attrs = NIL;
+ bool ignore = false;
initStringInfo(&sql);
@@ -1201,7 +1202,7 @@ postgresPlanForeignModify(PlannerInfo *root,
int col;
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -1218,6 +1219,17 @@ postgresPlanForeignModify(PlannerInfo *root,
if (plan->returningLists)
returningList = (List *) list_nth(plan->returningLists, subplan_index);
+ if (root->parse->arbiterExpr)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT unique index inference")));
+ else if (plan->spec == SPEC_INSERT)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT UPDATE")));
+ else if (plan->spec == SPEC_IGNORE)
+ ignore = true;
+
/*
* Construct the SQL command string.
*/
@@ -1225,7 +1237,7 @@ postgresPlanForeignModify(PlannerInfo *root,
{
case CMD_INSERT:
deparseInsertSql(&sql, root, resultRelation, rel,
- targetAttrs, returningList,
+ targetAttrs, ignore, returningList,
&retrieved_attrs);
break;
case CMD_UPDATE:
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..3763a57 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -60,7 +60,7 @@ extern void appendWhereClause(StringInfo buf,
List **params);
extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
+ List *targetAttrs, bool ignore, List *returningList,
List **retrieved_attrs);
extern void deparseUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 21e9d06..3a9d40b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
*
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
* Note that these options will be applied when inserting into the heap's
* TOAST table, too, if the tuple requires any out-of-line data.
*
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
END_CRIT_SECTION();
+ /*
+ * Let others know that we speculatively inserted this tuple, before
+ * releasing the buffer lock.
+ */
+ if (options & HEAP_INSERT_SPECULATIVE)
+ SetSpeculativeInsertionTid(relation->rd_node, &heaptup->t_self);
+
UnlockReleaseBuffer(buffer);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we "super-delete" a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id. Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
*/
HTSU_Result
heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd)
+ HeapUpdateFailureData *hufd, bool killspeculative)
{
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ if (!killspeculative)
+ {
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ }
+ else
+ {
+ if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+ tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+ elog(ERROR, "attempted to super-delete a tuple from other CID");
+ result = HeapTupleMayBeUpdated;
+ }
+
if (result == HeapTupleInvisible)
{
@@ -2823,12 +2850,15 @@ l1:
* using our own TransactionId below, since some other backend could
* incorporate our XID into a MultiXact immediately afterwards.)
*/
- MultiXactIdSetOldestMember();
+ if (!killspeculative)
+ {
+ MultiXactIdSetOldestMember();
- compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
- tp.t_data->t_infomask, tp.t_data->t_infomask2,
- xid, LockTupleExclusive, true,
- &new_xmax, &new_infomask, &new_infomask2);
+ compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
+ tp.t_data->t_infomask, tp.t_data->t_infomask2,
+ xid, LockTupleExclusive, true,
+ &new_xmax, &new_infomask, &new_infomask2);
+ }
START_CRIT_SECTION();
@@ -2855,8 +2885,23 @@ l1:
tp.t_data->t_infomask |= new_infomask;
tp.t_data->t_infomask2 |= new_infomask2;
HeapTupleHeaderClearHotUpdated(tp.t_data);
- HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
- HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ /*
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
+ * instead of setting xmax, to make the tuple clearly invisible to
+ * everyone. In particular, we want HeapTupleSatisfiesDirty() to regard
+ * the tuple as dead, so that another backend inserting a duplicate key
+ * value won't unnecessarily wait for our transaction to finish.
+ */
+ if (!killspeculative)
+ {
+ HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
+ HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ }
+ else
+ {
+ HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ }
+
/* Make sure there is no forward chain link in t_ctid */
tp.t_data->t_ctid = tp.t_self;
@@ -2872,7 +2917,11 @@ l1:
if (RelationIsAccessibleInLogicalDecoding(relation))
log_heap_new_cid(relation, &tp);
- xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
+ xlrec.flags = 0;
+ if (all_visible_cleared)
+ xlrec.flags |= XLOG_HEAP_ALL_VISIBLE_CLEARED;
+ if (killspeculative)
+ xlrec.flags |= XLOG_HEAP_KILLED_SPECULATIVE_TUPLE;
xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask,
tp.t_data->t_infomask2);
xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
@@ -2977,7 +3026,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
result = heap_delete(relation, tid,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd, false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -4070,14 +4119,16 @@ get_mxact_status_for_lock(LockTupleMode mode, bool is_update)
*
* Function result may be:
* HeapTupleMayBeUpdated: lock was successfully acquired
+ * HeapTupleInvisible: lock failed because tuple instantaneously invisible
* HeapTupleSelfUpdated: lock failed because tuple updated by self
* HeapTupleUpdated: lock failed because tuple updated by other xact
* HeapTupleWouldBlock: lock couldn't be acquired and wait_policy is skip
*
- * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
- * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
- * (the last only for HeapTupleSelfUpdated, since we
- * cannot obtain cmax from a combocid generated by another transaction).
+ * In the failure cases other than HeapTupleInvisible, the routine fills
+ * *hufd with the tuple's t_ctid, t_xmax (resolving a possible MultiXact,
+ * if necessary), and t_cmax (the last only for HeapTupleSelfUpdated,
+ * since we cannot obtain cmax from a combocid generated by another
+ * transaction).
* See comments for struct HeapUpdateFailureData for additional info.
*
* See README.tuplock for a thorough explanation of this mechanism.
@@ -4115,8 +4166,15 @@ l3:
if (result == HeapTupleInvisible)
{
- UnlockReleaseBuffer(*buffer);
- elog(ERROR, "attempted to lock invisible tuple");
+ LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+
+ /*
+ * This is possible, but only when locking a tuple for speculative
+ * insertion. We return this value here rather than throwing an error
+ * in order to give that case the opportunity to throw a more specific
+ * error.
+ */
+ return HeapTupleInvisible;
}
else if (result == HeapTupleBeingUpdated)
{
@@ -7326,7 +7384,10 @@ heap_xlog_delete(XLogReaderState *record)
HeapTupleHeaderClearHotUpdated(htup);
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ else
+ HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 932c6f7..1a4e18d 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -51,7 +51,8 @@ static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
Relation heapRel, Buffer buf, OffsetNumber offset,
ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique);
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken);
static void _bt_findinsertloc(Relation rel,
Buffer *bufptr,
OffsetNumber *offsetptr,
@@ -159,17 +160,27 @@ top:
*/
if (checkUnique != UNIQUE_CHECK_NO)
{
- TransactionId xwait;
+ TransactionId xwait;
+ uint32 speculativeToken;
offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey,
- checkUnique, &is_unique);
+ checkUnique, &is_unique, &speculativeToken);
if (TransactionIdIsValid(xwait))
{
/* Have to wait for the other guy ... */
_bt_relbuf(rel, buf);
- XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+ /*
+ * If it's a speculative insertion, wait for it to finish (i.e.
+ * to go ahead with the insertion, or kill the tuple). Otherwise
+ * wait for the transaction to finish as usual.
+ */
+ if (speculativeToken)
+ SpeculativeInsertionWait(xwait, speculativeToken);
+ else
+ XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+
/* start over... */
_bt_freestack(stack);
goto top;
@@ -211,9 +222,12 @@ top:
* also point to end-of-page, which means that the first tuple to check
* is the first tuple on the next page.
*
- * Returns InvalidTransactionId if there is no conflict, else an xact ID
- * we must wait for to see if it commits a conflicting tuple. If an actual
- * conflict is detected, no return --- just ereport().
+ * Returns InvalidTransactionId if there is no conflict, else an xact ID we
+ * must wait for to see if it commits a conflicting tuple. If an actual
+ * conflict is detected, no return --- just ereport(). If an xact ID is
+ * returned, and the conflicting tuple still has a speculative insertion in
+ * progress, *speculativeToken is set to non-zero, and the caller can wait for
+ * the verdict on the insertion using SpeculativeInsertionWait().
*
* However, if checkUnique == UNIQUE_CHECK_PARTIAL, we always return
* InvalidTransactionId because we don't want to wait. In this case we
@@ -223,7 +237,8 @@ top:
static TransactionId
_bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
Buffer buf, OffsetNumber offset, ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique)
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int natts = rel->rd_rel->relnatts;
@@ -340,6 +355,7 @@ _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
if (nbuf != InvalidBuffer)
_bt_relbuf(rel, nbuf);
/* Tell _bt_doinsert to wait... */
+ *speculativeToken = SnapshotDirty.speculativeToken;
return xwait;
}
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 9bb9deb..b9c5c81 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1659,8 +1659,50 @@ BuildIndexInfo(Relation index)
ii->ii_ExclusionStrats = NULL;
}
+ /*
+ * Fetch info for checking unique constraints. (This is currently only used
+ * by ExecCheckIndexConstraints(), for INSERT ... ON CONFLICT UPDATE, which
+ * must support "speculative insertion". In regular insertions, the index
+ * AM handles the unique check itself. It might make sense to do this
+ * lazily, only when needed.)
+ */
+ if (indexStruct->indisunique)
+ {
+ int ncols = index->rd_rel->relnatts;
+
+ if (index->rd_rel->relam != BTREE_AM_OID)
+ elog(ERROR, "only b-tree unique indexes are supported for speculative insertion");
+
+ ii->ii_UniqueOps = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueProcs = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueStrats = (uint16 *) palloc(sizeof(uint16) * ncols);
+
+ /*
+ * Look up the equality operator for each index column, using the
+ * column's operator family and the btree equality strategy number.
+ */
+ /* We need the func OIDs and strategy numbers too */
+ for (i = 0; i < ncols; i++)
+ {
+ ii->ii_UniqueStrats[i] = BTEqualStrategyNumber;
+ ii->ii_UniqueOps[i] =
+ get_opfamily_member(index->rd_opfamily[i],
+ index->rd_opcintype[i],
+ index->rd_opcintype[i],
+ ii->ii_UniqueStrats[i]);
+ ii->ii_UniqueProcs[i] = get_opcode(ii->ii_UniqueOps[i]);
+ }
+ ii->ii_Unique = true;
+ }
+ else
+ {
+ ii->ii_UniqueOps = NULL;
+ ii->ii_UniqueProcs = NULL;
+ ii->ii_UniqueStrats = NULL;
+ ii->ii_Unique = false;
+ }
+
/* other info */
- ii->ii_Unique = indexStruct->indisunique;
ii->ii_ReadyForInserts = IndexIsReady(indexStruct);
/* initialize index-build state to default */
@@ -2606,10 +2648,10 @@ IndexCheckExclusion(Relation heapRelation,
/*
* Check that this tuple has no conflicts.
*/
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- &(heapTuple->t_self), values, isnull,
- estate, true, false);
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &(heapTuple->t_self),
+ values, isnull, estate, true,
+ false, true, NULL);
}
heap_endscan(scan);
diff --git a/src/backend/commands/constraint.c b/src/backend/commands/constraint.c
index 561d8fa..d5ab12f 100644
--- a/src/backend/commands/constraint.c
+++ b/src/backend/commands/constraint.c
@@ -170,9 +170,10 @@ unique_key_recheck(PG_FUNCTION_ARGS)
* For exclusion constraints we just do the normal check, but now it's
* okay to throw error.
*/
- check_exclusion_constraint(trigdata->tg_relation, indexRel, indexInfo,
- &(new_row->t_self), values, isnull,
- estate, false, false);
+ check_exclusion_or_unique_constraint(trigdata->tg_relation, indexRel,
+ indexInfo, &(new_row->t_self),
+ values, isnull, estate, false,
+ false, true, NULL);
}
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 56f6b76..f289d9f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2431,7 +2431,8 @@ CopyFrom(CopyState cstate)
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false,
+ InvalidOid);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, tuple,
@@ -2538,7 +2539,7 @@ CopyFromInsertBatch(CopyState cstate, EState *estate, CommandId mycid,
ExecStoreTuple(bufferedTuples[i], myslot, InvalidBuffer, false);
recheckIndexes =
ExecInsertIndexTuples(myslot, &(bufferedTuples[i]->t_self),
- estate);
+ estate, false, InvalidOid);
ExecARInsertTriggers(estate, resultRelInfo,
bufferedTuples[i],
recheckIndexes);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7cfc9bb..e6a8d8e 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -103,7 +103,8 @@ static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
static void ExplainScanTarget(Scan *plan, ExplainState *es);
static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
-static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es);
+static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors);
static void ExplainMemberNodes(List *plans, PlanState **planstates,
List *ancestors, ExplainState *es);
static void ExplainSubPlans(List *plans, List *ancestors,
@@ -763,6 +764,9 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
ExplainPreScanMemberNodes(((ModifyTable *) plan)->plans,
((ModifyTableState *) planstate)->mt_plans,
rels_used);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainPreScanNode(((ModifyTableState *) planstate)->onConflict,
+ rels_used);
break;
case T_Append:
ExplainPreScanMemberNodes(((Append *) plan)->appendplans,
@@ -864,6 +868,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
const char *custom_name = NULL;
int save_indent = es->indent;
bool haschildren;
+ bool suppresschildren = false;
+ ModifyTable *mtplan;
switch (nodeTag(plan))
{
@@ -872,13 +878,33 @@ ExplainNode(PlanState *planstate, List *ancestors,
break;
case T_ModifyTable:
sname = "ModifyTable";
- switch (((ModifyTable *) plan)->operation)
+ mtplan = (ModifyTable *) plan;
+ switch (mtplan->operation)
{
case CMD_INSERT:
pname = operation = "Insert";
break;
case CMD_UPDATE:
- pname = operation = "Update";
+ if (mtplan->spec == SPEC_NONE)
+ {
+ pname = operation = "Update";
+ }
+ else
+ {
+ Assert(mtplan->spec == SPEC_UPDATE);
+
+ pname = operation = "Conflict Update";
+
+ /*
+ * Do not display child sequential scan/result node.
+ * Quals from child will be directly attributed to
+ * ModifyTable node, since we prefer to avoid
+ * displaying scan node to users, as it is merely an
+ * implementation detail; it is never executed in the
+ * conventional way.
+ */
+ suppresschildren = true;
+ }
break;
case CMD_DELETE:
pname = operation = "Delete";
@@ -1458,7 +1484,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate, es);
break;
case T_ModifyTable:
- show_modifytable_info((ModifyTableState *) planstate, es);
+ show_modifytable_info((ModifyTableState *) planstate, es,
+ ancestors);
break;
case T_Hash:
show_hash_info((HashState *) planstate, es);
@@ -1586,7 +1613,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate->subPlan;
if (haschildren)
{
- ExplainOpenGroup("Plans", "Plans", false, es);
+ if (!suppresschildren)
+ ExplainOpenGroup("Plans", "Plans", false, es);
/* Pass current PlanState as head of ancestors list for children */
ancestors = lcons(planstate, ancestors);
}
@@ -1609,9 +1637,13 @@ ExplainNode(PlanState *planstate, List *ancestors,
switch (nodeTag(plan))
{
case T_ModifyTable:
- ExplainMemberNodes(((ModifyTable *) plan)->plans,
- ((ModifyTableState *) planstate)->mt_plans,
- ancestors, es);
+ if (((ModifyTable *) plan)->spec != SPEC_UPDATE)
+ ExplainMemberNodes(((ModifyTable *) plan)->plans,
+ ((ModifyTableState *) planstate)->mt_plans,
+ ancestors, es);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainNode(((ModifyTableState *) planstate)->onConflict,
+ ancestors, "Member", NULL, es);
break;
case T_Append:
ExplainMemberNodes(((Append *) plan)->appendplans,
@@ -1649,7 +1681,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (haschildren)
{
ancestors = list_delete_first(ancestors);
- ExplainCloseGroup("Plans", "Plans", false, es);
+
+ if (!suppresschildren)
+ ExplainCloseGroup("Plans", "Plans", false, es);
}
/* in text format, undo whatever indentation we added */
@@ -2202,6 +2236,15 @@ ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
rti = linitial_int(plan->resultRelations);
ExplainTargetRel((Plan *) plan, rti, es);
+
+ if (plan->arbiterIndex != InvalidOid)
+ {
+ char *indexname = get_rel_name(plan->arbiterIndex);
+
+ /* nothing to do for text format explains */
+ if (es->format != EXPLAIN_FORMAT_TEXT && indexname != NULL)
+ ExplainPropertyText("Arbiter Index", indexname, es);
+ }
}
/*
@@ -2237,6 +2280,12 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
if (es->verbose)
namespace = get_namespace_name(get_rel_namespace(rte->relid));
objecttag = "Relation Name";
+
+ /*
+ * ON CONFLICT's "TARGET" alias will not appear in output for
+ * auxiliary ModifyTable as its alias, because target
+ * resultRelation is shared between parent and auxiliary queries
+ */
break;
case T_FunctionScan:
{
@@ -2315,7 +2364,8 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
* Show extra information for a ModifyTable node
*/
static void
-show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
+show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors)
{
FdwRoutine *fdwroutine = mtstate->resultRelInfo->ri_FdwRoutine;
@@ -2337,6 +2387,23 @@ show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
0,
es);
}
+ else if (mtstate->spec == SPEC_UPDATE)
+ {
+ PlanState *ps = (*mtstate->mt_plans);
+
+ /*
+ * Seqscan node is always used, unless optimizer determined that
+ * predicate precludes ever updating, in which case a simple Result
+ * node is possible
+ */
+ Assert(IsA(ps->plan, SeqScan) || IsA(ps->plan, Result));
+
+ /* Attribute child scan node's qual to ModifyTable node */
+ show_scan_qual(ps->plan->qual, "Filter", ps, ancestors, es);
+
+ if (ps->plan->qual)
+ show_instrumentation_count("Rows Removed by Filter", 1, ps, es);
+ }
}
/*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 9e11040..36251f0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2134,7 +2134,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* the latest version of the row was deleted, so we need do
* nothing. (Should be safe to examine xmin without getting
* buffer's content lock, since xmin never changes in an existing
- * tuple.)
+ * non-promise tuple, and there is no reason to lock a promise
+ * tuple until it is clear that it has been fulfilled.)
*/
if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple.t_data),
priorXmax))
@@ -2215,11 +2216,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* case, so as to avoid the "Halloween problem" of
* repeated update attempts. In the latter case it might
* be sensible to fetch the updated tuple instead, but
- * doing so would require changing heap_lock_tuple as well
- * as heap_update and heap_delete to not complain about
- * updating "invisible" tuples, which seems pretty scary.
- * So for now, treat the tuple as deleted and do not
- * process.
+ * doing so would require changing heap_update and
+ * heap_delete to not complain about updating "invisible"
+ * tuples, which seems pretty scary (heap_lock_tuple will
+ * not complain, but few callers expect HeapTupleInvisible,
+ * and we're not one of them). So for now, treat the tuple
+ * as deleted and do not process.
*/
ReleaseBuffer(buffer);
return NULL;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 4b921fa..52a9b35 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -990,7 +990,8 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
*
* This returns a list of index OIDs for any unique or exclusion
* constraints that are deferred and that had
- * potential (unconfirmed) conflicts.
+ * potential (unconfirmed) conflicts. (If noDupErr is true, the
+ * same is done for non-deferred constraints.)
*
* CAUTION: this must not be called for a HOT update.
* We can't defend against that here for lack of info.
@@ -1000,7 +1001,9 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
List *
ExecInsertIndexTuples(TupleTableSlot *slot,
ItemPointer tupleid,
- EState *estate)
+ EState *estate,
+ bool noDupErr,
+ Oid arbiterIdx)
{
List *result = NIL;
ResultRelInfo *resultRelInfo;
@@ -1070,7 +1073,18 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
/* Skip this index-update if the predicate isn't satisfied */
if (!ExecQual(predicate, econtext, false))
+ {
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ ereport(ERROR,
+ (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ errmsg("partial arbiter unique index has predicate that does not cover tuple proposed for insertion"),
+ errdetail("ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index \"%s\".",
+ RelationGetRelationName(indexRelation)),
+ errhint("ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire."),
+ errtableconstraint(heapRelation,
+ RelationGetRelationName(indexRelation))));
continue;
+ }
}
/*
@@ -1092,9 +1106,16 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* For a deferrable unique index, we tell the index AM to just detect
* possible non-uniqueness, and we add the index OID to the result
* list if further checking is needed.
+ *
+ * For a speculative insertion (ON CONFLICT UPDATE/IGNORE), just detect
+ * possible non-uniqueness, and tell the caller if it failed.
*/
if (!indexRelation->rd_index->indisunique)
checkUnique = UNIQUE_CHECK_NO;
+ else if (noDupErr && arbiterIdx == InvalidOid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
+ else if (noDupErr && arbiterIdx == indexRelation->rd_index->indexrelid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
else if (indexRelation->rd_index->indimmediate)
checkUnique = UNIQUE_CHECK_YES;
else
@@ -1112,8 +1133,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* If the index has an associated exclusion constraint, check that.
* This is simpler than the process for uniqueness checks since we
* always insert first and then check. If the constraint is deferred,
- * we check now anyway, but don't throw error on violation; instead
- * we'll queue a recheck event.
+ * we check now anyway, but don't throw error on violation or wait for
+ * a conclusive outcome from a concurrent insertion; instead we'll
+ * queue a recheck event. Similarly, noDupErr callers (speculative
+ * inserters) will recheck later, and wait for a conclusive outcome
+ * then.
*
* An index for an exclusion constraint can't also be UNIQUE (not an
* essential property, we just don't allow it in the grammar), so no
@@ -1121,13 +1145,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*/
if (indexInfo->ii_ExclusionOps != NULL)
{
- bool errorOK = !indexRelation->rd_index->indimmediate;
+ bool violationOK = (!indexRelation->rd_index->indimmediate ||
+ noDupErr);
satisfiesConstraint =
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- tupleid, values, isnull,
- estate, false, errorOK);
+ check_exclusion_or_unique_constraint(heapRelation,
+ indexRelation, indexInfo,
+ tupleid, values, isnull,
+ estate, false,
+ violationOK, false, NULL);
}
if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1135,7 +1161,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
!satisfiesConstraint)
{
/*
- * The tuple potentially violates the uniqueness or exclusion
+ * The tuple potentially violates the unique index or exclusion
* constraint, so make a note of the index so that we can re-check
* it later.
*/
@@ -1146,18 +1172,150 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
return result;
}
+/* ----------------------------------------------------------------
+ * ExecCheckIndexConstraints
+ *
+ * This routine checks if a tuple violates any unique or
+ * exclusion constraints. If no conflict, returns true.
+ * Otherwise returns false, and the TID of the conflicting
+ * tuple is returned in *conflictTid
+ *
+ * Note that this doesn't lock the values in any way, so it's
+ * possible that a conflicting tuple is inserted immediately
+ * after this returns, and a later insert with the same values
+ * still conflicts. But this can be used for a pre-check before
+ * insertion.
+ * ----------------------------------------------------------------
+ */
+bool
+ExecCheckIndexConstraints(TupleTableSlot *slot,
+ EState *estate, ItemPointer conflictTid,
+ Oid arbiterIdx)
+{
+ ResultRelInfo *resultRelInfo;
+ int i;
+ int numIndices;
+ RelationPtr relationDescs;
+ Relation heapRelation;
+ IndexInfo **indexInfoArray;
+ ExprContext *econtext;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ ItemPointerData invalidItemPtr;
+ bool checkedIndex = false;
+
+ ItemPointerSetInvalid(&invalidItemPtr);
+
+ /*
+ * Get information from the result relation info structure.
+ */
+ resultRelInfo = estate->es_result_relation_info;
+ numIndices = resultRelInfo->ri_NumIndices;
+ relationDescs = resultRelInfo->ri_IndexRelationDescs;
+ indexInfoArray = resultRelInfo->ri_IndexRelationInfo;
+ heapRelation = resultRelInfo->ri_RelationDesc;
+
+ /*
+ * We will use the EState's per-tuple context for evaluating predicates
+ * and index expressions (creating it if it's not already there).
+ */
+ econtext = GetPerTupleExprContext(estate);
+
+ /* Arrange for econtext's scan tuple to be the tuple under test */
+ econtext->ecxt_scantuple = slot;
+
+ /*
+ * For each applicable index, form the index datum and check for conflicts
+ */
+ for (i = 0; i < numIndices; i++)
+ {
+ Relation indexRelation = relationDescs[i];
+ IndexInfo *indexInfo;
+ bool satisfiesConstraint;
+
+ if (indexRelation == NULL)
+ continue;
+
+ indexInfo = indexInfoArray[i];
+
+ if (!indexInfo->ii_Unique && !indexInfo->ii_ExclusionOps)
+ continue;
+
+ /* If the index is marked as read-only, ignore it */
+ if (!indexInfo->ii_ReadyForInserts)
+ continue;
+
+ /* When specific arbiter index requested, only examine it */
+ if (arbiterIdx != InvalidOid &&
+ arbiterIdx != indexRelation->rd_index->indexrelid)
+ continue;
+
+ checkedIndex = true;
+
+ /* Check for partial index */
+ if (indexInfo->ii_Predicate != NIL)
+ {
+ List *predicate;
+
+ /*
+ * If predicate state not set up yet, create it (in the estate's
+ * per-query context)
+ */
+ predicate = indexInfo->ii_PredicateState;
+ if (predicate == NIL)
+ {
+ predicate = (List *)
+ ExecPrepareExpr((Expr *) indexInfo->ii_Predicate,
+ estate);
+ indexInfo->ii_PredicateState = predicate;
+ }
+
+ /* Skip this index check if the predicate isn't satisfied */
+ if (!ExecQual(predicate, econtext, false))
+ continue;
+ }
+
+
+ /*
+ * FormIndexDatum fills in its values and isnull parameters with the
+ * appropriate values for the column(s) of the index.
+ */
+ FormIndexDatum(indexInfo,
+ slot,
+ estate,
+ values,
+ isnull);
+
+ satisfiesConstraint =
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &invalidItemPtr,
+ values, isnull, estate, false,
+ true, true, conflictTid);
+ if (!satisfiesConstraint)
+ return false;
+ }
+
+ if (arbiterIdx != InvalidOid && !checkedIndex)
+ elog(ERROR, "unexpected failure to find arbiter unique index");
+
+ return true;
+}
+
/*
- * Check for violation of an exclusion constraint
+ * Check for violation of an exclusion or unique constraint
*
* heap: the table containing the new tuple
* index: the index supporting the exclusion constraint
* indexInfo: info about the index, including the exclusion properties
- * tupleid: heap TID of the new tuple we have just inserted
+ * tupleid: heap TID of the new tuple we have just inserted (invalid if we
+ * haven't inserted a new tuple yet)
* values, isnull: the *index* column values computed for the new tuple
* estate: an EState we can do evaluation in
* newIndex: if true, we are trying to build a new index (this affects
* only the wording of error messages)
- * errorOK: if true, don't throw error for violation
+ * violationOK: if true, don't throw error for violation
+ * wait: if true, wait for conflicting transaction to finish, even if violationOK
+ * conflictTid: if not-NULL, the TID of conflicting tuple is returned here.
*
* Returns true if OK, false if actual or potential violation
*
@@ -1167,16 +1325,24 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* is convenient for deferred exclusion checks; we need not bother queuing
* a deferred event if there is definitely no conflict at insertion time.
*
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
* is impossible.
+ *
+ * Note: The indexam is normally responsible for checking unique constraints,
+ * so this normally only needs to be used for exclusion constraints. But this
+ * function is also called when doing a "pre-check" for conflicts with "INSERT
+ * ... ON CONFLICT UPDATE", before inserting the actual tuple.
*/
bool
-check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
- ItemPointer tupleid, Datum *values, bool *isnull,
- EState *estate, bool newIndex, bool errorOK)
+check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo, ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate, bool newIndex,
+ bool violationOK, bool wait,
+ ItemPointer conflictTid)
{
- Oid *constr_procs = indexInfo->ii_ExclusionProcs;
- uint16 *constr_strats = indexInfo->ii_ExclusionStrats;
+ Oid *constr_procs;
+ uint16 *constr_strats;
Oid *index_collations = index->rd_indcollation;
int index_natts = index->rd_index->indnatts;
IndexScanDesc index_scan;
@@ -1190,6 +1356,17 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
TupleTableSlot *existing_slot;
TupleTableSlot *save_scantuple;
+ if (indexInfo->ii_ExclusionOps)
+ {
+ constr_procs = indexInfo->ii_ExclusionProcs;
+ constr_strats = indexInfo->ii_ExclusionStrats;
+ }
+ else
+ {
+ constr_procs = indexInfo->ii_UniqueProcs;
+ constr_strats = indexInfo->ii_UniqueStrats;
+ }
+
/*
* If any of the input values are NULL, the constraint check is assumed to
* pass (i.e., we assume the operators are strict).
@@ -1253,7 +1430,8 @@ retry:
/*
* Ignore the entry for the tuple we're trying to check.
*/
- if (ItemPointerEquals(tupleid, &tup->t_self))
+ if (ItemPointerIsValid(tupleid) &&
+ ItemPointerEquals(tupleid, &tup->t_self))
{
if (found_self) /* should not happen */
elog(ERROR, "found self tuple multiple times in index \"%s\"",
@@ -1287,9 +1465,11 @@ retry:
* we're not supposed to raise error, just return the fact of the
* potential conflict without waiting to see if it's real.
*/
- if (errorOK)
+ if (violationOK && !wait)
{
conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
break;
}
@@ -1307,14 +1487,29 @@ retry:
if (TransactionIdIsValid(xwait))
{
index_endscan(index_scan);
- XactLockTableWait(xwait, heap, &tup->t_data->t_ctid,
- XLTW_RecheckExclusionConstr);
+ if (DirtySnapshot.speculativeToken)
+ SpeculativeInsertionWait(DirtySnapshot.xmin,
+ DirtySnapshot.speculativeToken);
+ else if (violationOK)
+ XactLockTableWait(xwait, heap, &tup->t_self,
+ XLTW_RecheckExclusionConstr);
+ else
+ XactLockTableWait(xwait, heap, &tup->t_data->t_ctid,
+ XLTW_RecheckExclusionConstr);
goto retry;
}
/*
- * We have a definite conflict. Report it.
+ * We have a definite conflict. Return it to caller, or report it.
*/
+ if (violationOK)
+ {
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+ break;
+ }
+
error_new = BuildIndexValueDescription(index, values, isnull);
error_existing = BuildIndexValueDescription(index, existing_values,
existing_isnull);
@@ -1350,6 +1545,9 @@ retry:
* However, it is possible to define exclusion constraints for which that
* wouldn't be true --- for instance, if the operator is <>. So we no
* longer complain if found_self is still false.
+ *
+ * It would also not be true in the pre-check mode, when we haven't
+ * inserted a tuple yet.
*/
econtext->ecxt_scantuple = save_scantuple;
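As a reading aid for the execUtils.c changes above, here is a minimal standalone sketch of the two decisions they introduce: which uniqueness-check mode ExecInsertIndexTuples() picks for an index, and what check_exclusion_or_unique_constraint() does once it finds a conflicting tuple. It is a model under stated assumptions, not code from the patch. Every name prefixed sketch_ or SKETCH_ is hypothetical, index OIDs are modeled as plain unsigned integers with 0 standing in for InvalidOid, and the final "deferred" fallback to a partial check is taken from the existing comment about deferrable unique indexes rather than from lines shown in this hunk.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; these are not PostgreSQL definitions. */
#define SKETCH_INVALID_OID 0u	/* assumption: models InvalidOid */

typedef enum
{
	SKETCH_CHECK_NO,			/* models UNIQUE_CHECK_NO */
	SKETCH_CHECK_YES,			/* models UNIQUE_CHECK_YES */
	SKETCH_CHECK_PARTIAL		/* models UNIQUE_CHECK_PARTIAL */
} SketchUniqueCheck;

typedef enum
{
	SKETCH_RETURN_CONFLICT,		/* report conflict (and its TID) to caller */
	SKETCH_WAIT_AND_RETRY,		/* wait on the other inserter, then rescan */
	SKETCH_RAISE_ERROR			/* definite violation, caller wants an error */
} SketchAction;

/*
 * Models the checkUnique selection: a speculative inserter (no_dup_err)
 * asks only for detection on the arbiter index, or on every unique index
 * when no specific arbiter was inferred.
 */
SketchUniqueCheck
sketch_choose_unique_check(bool is_unique, bool is_immediate,
						   bool no_dup_err, unsigned arbiter,
						   unsigned index_oid)
{
	if (!is_unique)
		return SKETCH_CHECK_NO;
	if (no_dup_err &&
		(arbiter == SKETCH_INVALID_OID || arbiter == index_oid))
		return SKETCH_CHECK_PARTIAL;
	if (is_immediate)
		return SKETCH_CHECK_YES;
	return SKETCH_CHECK_PARTIAL;	/* deferred constraint: recheck later */
}

/*
 * Models the handling of one conflicting tuple found by the index scan.
 * other_xact_in_progress corresponds to the dirty snapshot reporting a
 * still-running inserter or deleter.
 */
SketchAction
sketch_on_conflicting_tuple(bool violation_ok, bool wait,
							bool other_xact_in_progress)
{
	/* Potential conflict is enough; do not wait to see if it is real. */
	if (violation_ok && !wait)
		return SKETCH_RETURN_CONFLICT;
	/* Outcome not yet known: wait for it, then retry the scan. */
	if (other_xact_in_progress)
		return SKETCH_WAIT_AND_RETRY;
	/* Definite conflict: hand it back to the caller, or report it. */
	return violation_ok ? SKETCH_RETURN_CONFLICT : SKETCH_RAISE_ERROR;
}
```

In these terms, the pre-check in ExecCheckIndexConstraints() corresponds to violation_ok = true with wait = true, while the post-insertion exclusion check in ExecInsertIndexTuples() passes wait = false and so returns a potential conflict without blocking.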
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 48107d9..4699060 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -151,10 +151,11 @@ lnext:
* case, so as to avoid the "Halloween problem" of repeated
* update attempts. In the latter case it might be sensible
* to fetch the updated tuple instead, but doing so would
- * require changing heap_lock_tuple as well as heap_update and
- * heap_delete to not complain about updating "invisible"
- * tuples, which seems pretty scary. So for now, treat the
- * tuple as deleted and do not process.
+ * require changing heap_update and heap_delete to not complain
+ * about updating "invisible" tuples, which seems pretty scary
+ * (heap_lock_tuple will not complain, but few callers expect
+ * HeapTupleInvisible, and we're not one of them). So for now,
+ * treat the tuple as deleted and do not process.
*/
goto lnext;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index f96fb24..d03604c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -46,12 +46,20 @@
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tqual.h"
+static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ EState *estate);
+
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
* target relation's rowtype
@@ -151,6 +159,37 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
return ExecProject(projectReturning, NULL);
}
+/*
+ * ExecCheckHeapTupleVisible -- verify heap tuple is visible
+ *
+ * It is not acceptable to proceed with avoiding insertion (taking
+ * speculative insertion's alternative IGNORE/UPDATE path) on the
+ * basis of another tuple that is not visible, iff xact uses higher
+ * isolation levels.
+ */
+static void
+ExecCheckHeapTupleVisible(EState *estate,
+ ResultRelInfo *relinfo,
+ ItemPointer tid)
+{
+
+ Relation rel = relinfo->ri_RelationDesc;
+ Buffer buffer;
+ HeapTupleData tuple;
+
+ if (!IsolationUsesXactSnapshot())
+ return;
+
+ tuple.t_self = *tid;
+ if (!heap_fetch(rel, estate->es_snapshot, &tuple, &buffer, false, NULL))
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent insert or update dictating alternative ON CONFLICT path"),
+ errhint("Even ON CONFLICT IGNORE must consider effects of concurrent transactions.")));
+
+ ReleaseBuffer(buffer);
+}
+
/* ----------------------------------------------------------------
* ExecInsert
*
@@ -163,6 +202,9 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
static TupleTableSlot *
ExecInsert(TupleTableSlot *slot,
TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ Oid arbiterIndex,
+ SpecType spec,
EState *estate,
bool canSetTag)
{
@@ -246,6 +288,9 @@ ExecInsert(TupleTableSlot *slot,
}
else
{
+ bool conflict;
+ ItemPointerData conflictTid;
+
/*
* Constraints might reference the tableoid column, so initialize
* t_tableOid before evaluating them.
@@ -259,20 +304,130 @@ ExecInsert(TupleTableSlot *slot,
ExecConstraints(resultRelInfo, slot, estate);
/*
- * insert the tuple
- *
- * Note: heap_insert returns the tid (location) of the new tuple in
- * the t_self field.
+ * If we are expecting duplicates, do a non-conclusive first check. We
+ * might still fail later, after inserting the heap tuple, if a
+ * conflicting row was inserted concurrently. We'll handle that by
+ * deleting the already-inserted tuple and retrying, but that's fairly
+ * expensive, so we try to avoid it.
*/
- newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, 0, NULL);
+vlock:
+ conflict = false;
+ ItemPointerSetInvalid(&conflictTid);
/*
- * insert index entries for tuple
+ * XXX If we know or assume that there are few duplicates, it would be
+ * better to skip this, and just optimistically proceed with the
+ * insertion below. You would then leave behind some garbage when a
+ * conflict happens, but if it's rare, it doesn't matter much. Some
+ * kind of heuristic might be in order here, like stop doing these
+ * pre-checks if the last 100 insertions have not been duplicates.
*/
- if (resultRelInfo->ri_NumIndices > 0)
- recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ if (spec != SPEC_NONE && resultRelInfo->ri_NumIndices > 0)
+ {
+ /*
+ * Check if it's required to proceed with the second phase
+ * ("insertion proper") of speculative insertion in respect of the
+ * slot. If insertion ultimately does not proceed, no firing of
+ * AFTER ROW INSERT triggers occurs.
+ *
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.
+ */
+ if (!ExecCheckIndexConstraints(slot, estate, &conflictTid,
+ arbiterIndex))
+ conflict = true;
+ }
+
+ if (!conflict)
+ {
+ /*
+ * Before we start the insertion, acquire our "promise tuple
+ * insertion lock". Others can use that (rather than an XID lock,
+ * which is appropriate only for non-promise tuples) to wait for us
+ * to decide if we're going to go ahead with the insertion.
+ */
+ if (spec != SPEC_NONE)
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+
+ /*
+ * insert the tuple
+ *
+ * Note: heap_insert returns the tid (location) of the new tuple in
+ * the t_self field.
+ */
+ newId = heap_insert(resultRelationDesc, tuple,
+ estate->es_output_cid,
+ spec != SPEC_NONE ? HEAP_INSERT_SPECULATIVE : 0,
+ NULL);
+
+ /*
+ * insert index entries for tuple
+ */
+ if (resultRelInfo->ri_NumIndices > 0)
+ recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
+ estate,
+ spec != SPEC_NONE,
+ arbiterIndex);
+
+ if (spec != SPEC_NONE && recheckIndexes)
+ {
+ HeapUpdateFailureData hufd;
+
+ /*
+ * Race: concurrent insertion conflicts with our speculative
+ * heap tuple
+ */
+ conflict = true;
+
+ /*
+ * Must "super-delete" the heap tuple and retry from the start.
+ *
+ * This is occasionally necessary so that "unprincipled
+ * deadlocks" are avoided; now that a conflict was found,
+ * other sessions should not wait on our speculative token, and
+ * they certainly shouldn't treat our speculatively-inserted
+ * heap tuple as an ordinary tuple, where they must wait on the
+ * outcome of our xact to UPDATE/DELETE. This makes heap
+ * tuples behave as conceptual "value locks" of short duration,
+ * distinct from ordinary tuples that other xacts must wait on
+ * xmin-xact-end of in the event of a possible unique/exclusion
+ * violation (the violation that arbitrates taking the
+ * alternative UPDATE/IGNORE path).
+ */
+ heap_delete(resultRelationDesc, &(tuple->t_self),
+ estate->es_output_cid, NULL, false, &hufd, true);
+ }
+
+ if (spec != SPEC_NONE)
+ {
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+ }
+ }
+
+ if (conflict)
+ {
+ /*
+ * Lock and consider updating in the SPEC_INSERT case. For the
+ * SPEC_IGNORE case, it's still necessary to verify that the tuple
+ * is visible to the executor's MVCC snapshot.
+ */
+ if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
+ &conflictTid,
+ planSlot,
+ onConflict,
+ estate))
+ goto vlock;
+ else if (spec == SPEC_IGNORE)
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
+
+ return NULL;
+ }
}
if (canSetTag)
@@ -399,7 +554,8 @@ ldelete:;
estate->es_output_cid,
estate->es_crosscheck_snapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd,
+ false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -768,7 +924,7 @@ lreplace:;
*/
if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false, InvalidOid);
}
if (canSetTag)
@@ -793,6 +949,218 @@ lreplace:;
}
+/* ----------------------------------------------------------------
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
+ *
+ * Returns true if we're done (with or without an update), or false
+ * if the executor must start from scratch.
+ * ----------------------------------------------------------------
+ */
+static bool
+ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ EState *estate)
+{
+ Relation relation = resultRelInfo->ri_RelationDesc;
+ HeapTupleData tuple;
+ HeapTuple copyTuple = NULL;
+ HeapUpdateFailureData hufd;
+ HTSU_Result test;
+ Buffer buffer;
+ TupleTableSlot *slot;
+
+ /*
+ * XXX We don't have the TID of the conflicting tuple if the index
+ * insertion failed and we had to kill the already inserted tuple. We'd
+ * need to modify the index AM to pass through the TID back here. So for
+ * now, we just retry, and hopefully the new pre-check will fail on the
+ * same tuple (or it's finished by now), and we'll get its TID that way.
+ */
+ if (!ItemPointerIsValid(conflictTid))
+ {
+ elog(DEBUG1, "insertion conflicted after pre-check");
+ return false;
+ }
+
+ /*
+ * Lock tuple for update.
+ *
+ * Like EvalPlanQualFetch(), don't follow updates. There is no actual
+ * benefit to doing so, since as discussed below, a conflict invalidates
+ * our previous conclusion that the tuple is the conclusively committed
+ * conflicting tuple.
+ */
+ tuple.t_self = *conflictTid;
+ test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
+ LockTupleExclusive, LockWaitBlock, false, &buffer,
+ &hufd);
+
+ if (test == HeapTupleMayBeUpdated)
+ copyTuple = heap_copytuple(&tuple);
+
+ switch (test)
+ {
+ case HeapTupleInvisible:
+ /*
+ * This may occur when an instantaneously invisible tuple is blamed
+ * as a conflict because multiple rows are inserted with the same
+ * constrained values.
+ *
+ * We cannot proceed, because to do so would leave users open to
+ * the risk that the same row will be updated a second time in the
+ * same command; allowing a second update of a single row within
+ * the same command would leave the update order undefined. It
+ * is the user's responsibility to resolve these self-duplicates
+ * before proposing a set of tuples for insertion, so raise an
+ * error with a hint. These problems are why SQL-2003
+ * similarly specifies that for SQL MERGE, an exception must be
+ * raised in the event of an attempt to update the same row twice.
+ *
+ * XXX It might be preferable to do something similar when a row is
+ * locked twice (and not updated twice) by the same speculative
+ * insertion, as if to take each lock acquisition as an indication
+ * of a discrete, unfulfilled intent to update (perhaps in some
+ * later command of the same xact). This does not seem feasible,
+ * though.
+ */
+ if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data)))
+ ereport(ERROR,
+ (errcode(ERRCODE_CARDINALITY_VIOLATION),
+ errmsg("ON CONFLICT UPDATE command could not lock/update self-inserted tuple"),
+ errhint("Ensure that no rows proposed for insertion within the same command have duplicate constrained values.")));
+
+ /* This shouldn't happen */
+ elog(ERROR, "attempted to lock invisible tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleSelfUpdated:
+ /*
+ * XXX In practice this is dead code, since BEFORE triggers fire
+ * prior to speculative insertion. Since a dirty snapshot is used
+ * to find possible conflict tuples, speculative insertion could
+ * not have seen the old/MVCC-current row version at all (even if
+ * it was only rendered old by this same command).
+ */
+ elog(ERROR, "unexpected self-updated tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleMayBeUpdated:
+ /*
+ * Success -- we're done, as tuple is locked. Verify that the
+ * tuple is known to be visible to our snapshot under conventional
+ * MVCC rules if the current isolation level mandates that. In
+ * READ COMMITTED mode, we can lock and update a tuple still in
+ * progress according to our snapshot, but higher isolation levels
+ * cannot avail of that, and must actively defend against doing so.
+ * We might get a serialization failure within ExecUpdate() anyway
+ * if this step was skipped, but this cannot be relied on, for
+ * example because the auxiliary WHERE clause happened to not be
+ * satisfied.
+ */
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &tuple.t_data->t_ctid);
+
+ /*
+ * This loosening of snapshot isolation for the benefit of READ
+ * COMMITTED speculative insertions is used consistently:
+ * speculative quals are only tested against already locked tuples.
+ * It would be rather inconsistent to UPDATE when no tuple version
+ * is MVCC-visible (which seems inevitable since we must *do
+ * something* there, and "READ COMMITTED serialization failures"
+ * are unappealing), while also avoiding updating here entirely on
+ * the basis of a non-conclusive tuple version (the version that
+ * happens to be visible to this command's MVCC snapshot, or a
+ * subsequent non-conclusive version).
+ *
+ * In other words: Only the final, conclusively locked tuple
+ * (which must have the same value in the relevant constrained
+ * attribute(s) as the value previously "value locked") matters.
+ */
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStartNode(onConflict->ps.instrument);
+
+ /*
+ * Conceptually, the parent ModifyTable is like a relation scan
+ * node that uses a dirty snapshot, returning rows which the
+ * auxiliary plan must operate on (if only to lock all such rows).
+ * EvalPlanQual() is involved in the evaluation of their UPDATE,
+ * regardless of whether or not the tuple is visible to the
+ * command's MVCC Snapshot.
+ */
+ EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
+
+ /*
+ * UPDATE affects the same ResultRelation as INSERT in the context
+ * of ON CONFLICT UPDATE, so parent's target rti is used
+ */
+ EvalPlanQualSetTuple(&onConflict->mt_epqstate,
+ resultRelInfo->ri_RangeTableIndex, copyTuple);
+
+ slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+
+ if (!TupIsNull(slot))
+ ExecUpdate(&tuple.t_data->t_ctid, NULL, slot, planSlot,
+ &onConflict->mt_epqstate, onConflict->ps.state,
+ false);
+
+ ReleaseBuffer(buffer);
+
+ /*
+ * As when executing an UPDATE's ModifyTable node in the
+ * conventional manner, reset the per-output-tuple ExprContext
+ */
+ ResetPerTupleExprContext(onConflict->ps.state);
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStopNode(onConflict->ps.instrument, 0);
+
+ return true;
+ case HeapTupleUpdated:
+ if (IsolationUsesXactSnapshot())
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent update")));
+
+ /*
+ * Tell caller to try again from the very start. We don't use the
+ * usual EvalPlanQual() looping pattern here, fundamentally because
+ * we don't have a useful qual to verify the next tuple with. Our
+ * "qual" is really any user-supplied qual AND the unique
+ * constraint "col OP value" implied by a speculative insertion
+ * conflict. However, because of the selective evaluation of the
+ * former "qual" (the interactions with MVCC and row locking), this
+ * is an over-simplification.
+ *
+ * We might devise a means of verifying, by way of binary equality
+ * in a similar manner to HOT codepaths, if any unique indexed
+ * columns changed, but this would only serve to ameliorate the
+ * fundamental problem. It might well not be good enough, because
+ * those columns could change too. It seems unlikely that working
+ * harder here is worthwhile.
+ *
+ * At this point, all bets are off -- it might actually turn out to
+ * be okay to proceed with insertion instead of locking now (the
+ * tuple we attempted to lock could have been deleted, for
+ * example). On the other hand, it might not be okay, but for an
+ * entirely different reason, with an entirely separate TID to
+ * blame and lock. This TID may not even be part of the same
+ * update chain.
+ */
+ ReleaseBuffer(buffer);
+ return false;
+ default:
+ elog(ERROR, "unrecognized heap_lock_tuple status: %u", test);
+ }
+
+ return false;
+}
+
+
/*
* Process BEFORE EACH STATEMENT triggers
*/
@@ -803,6 +1171,9 @@ fireBSTriggers(ModifyTableState *node)
{
case CMD_INSERT:
ExecBSInsertTriggers(node->ps.state, node->resultRelInfo);
+ if (node->spec == SPEC_INSERT)
+ ExecBSUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
break;
case CMD_UPDATE:
ExecBSUpdateTriggers(node->ps.state, node->resultRelInfo);
@@ -825,6 +1196,9 @@ fireASTriggers(ModifyTableState *node)
switch (node->operation)
{
case CMD_INSERT:
+ if (node->spec == SPEC_INSERT)
+ ExecASUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
ExecASInsertTriggers(node->ps.state, node->resultRelInfo);
break;
case CMD_UPDATE:
@@ -852,6 +1226,8 @@ ExecModifyTable(ModifyTableState *node)
{
EState *estate = node->ps.state;
CmdType operation = node->operation;
+ ModifyTableState *onConflict = (ModifyTableState *) node->onConflict;
+ SpecType spec = node->spec;
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
PlanState *subplanstate;
@@ -1022,7 +1398,9 @@ ExecModifyTable(ModifyTableState *node)
switch (operation)
{
case CMD_INSERT:
- slot = ExecInsert(slot, planSlot, estate, node->canSetTag);
+ slot = ExecInsert(slot, planSlot, onConflict,
+ node->arbiterIndex, spec, estate,
+ node->canSetTag);
break;
case CMD_UPDATE:
slot = ExecUpdate(tupleid, oldtuple, slot, planSlot,
@@ -1070,6 +1448,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ModifyTableState *mtstate;
CmdType operation = node->operation;
+ Plan *onConflictPlan = node->onConflictPlan;
int nplans = list_length(node->plans);
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
@@ -1097,6 +1476,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->resultRelInfo = estate->es_result_relations + node->resultRelIndex;
mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
mtstate->mt_nplans = nplans;
+ mtstate->spec = node->spec;
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -1137,6 +1517,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
resultRelInfo->ri_IndexRelationDescs == NULL)
ExecOpenIndices(resultRelInfo);
+ /*
+ * ON CONFLICT UPDATE variant must have a unique index to arbitrate
+ * whether to take the alternative path
+ */
+ Assert(node->spec != SPEC_INSERT || node->arbiterIndex != InvalidOid);
+
+ mtstate->arbiterIndex = node->arbiterIndex;
+
/* Now init the plan for this result rel */
estate->es_result_relation_info = resultRelInfo;
mtstate->mt_plans[i] = ExecInitNode(subplan, estate, eflags);
@@ -1308,7 +1696,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
break;
case CMD_UPDATE:
case CMD_DELETE:
- junk_filter_needed = true;
+ junk_filter_needed = (node->spec == SPEC_NONE);
break;
default:
elog(ERROR, "unknown operation");
@@ -1373,6 +1761,30 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
}
/*
+ * Initialize auxiliary ModifyTable node for INSERT...ON CONFLICT UPDATE.
+ *
+ * The UPDATE portion of the query is essentially represented as auxiliary
+ * to INSERT state at all stages of query processing, with a representation
+ * at each stage that is analogous to a regular UPDATE.
+ */
+ if (onConflictPlan)
+ {
+ PlanState *pstate;
+
+ Assert(mtstate->spec == SPEC_INSERT);
+
+ /*
+ * Initialize auxiliary child plan.
+ *
+ * ExecModifyTable() is never called for auxiliary update
+ * ModifyTableState. Execution of the auxiliary plan is driven by its
+ * parent in an ad-hoc fashion.
+ */
+ pstate = ExecInitNode(onConflictPlan, estate, eflags);
+ mtstate->onConflict = pstate;
+ }
+
+ /*
* Set up a tuple table slot for use for trigger output tuples. In a plan
* containing multiple ModifyTable nodes, all can share one such slot, so
* we keep it in the estate.
@@ -1387,11 +1799,18 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* ModifyTable node too, but there's no need.) Note the use of lcons not
* lappend: we need later-initialized ModifyTable nodes to be shut down
* before earlier ones. This ensures that we don't throw away RETURNING
- * rows that need to be seen by a later CTE subplan.
+ * rows that need to be seen by a later CTE subplan. We do not want to
+ * append an auxiliary ON CONFLICT UPDATE node either: its parent
+ * SPEC_INSERT ModifyTable node directly drives execution of what is
+ * logically a single unified statement (*that* parent node will be
+ * appended here, though). If updated rows must be projected, that
+ * will only ever be done through the parent.
*/
- if (!mtstate->canSetTag)
+ if (!mtstate->canSetTag && mtstate->spec != SPEC_UPDATE)
+ {
estate->es_auxmodifytables = lcons(mtstate,
estate->es_auxmodifytables);
+ }
return mtstate;
}
@@ -1442,6 +1861,8 @@ ExecEndModifyTable(ModifyTableState *node)
*/
for (i = 0; i < node->mt_nplans; i++)
ExecEndNode(node->mt_plans[i]);
+
+ ExecEndNode(node->onConflict);
}
void
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 00ffe4a..6c1a7f1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -178,6 +178,9 @@ _copyModifyTable(const ModifyTable *from)
COPY_NODE_FIELD(resultRelations);
COPY_SCALAR_FIELD(resultRelIndex);
COPY_NODE_FIELD(plans);
+ COPY_SCALAR_FIELD(spec);
+ COPY_SCALAR_FIELD(arbiterIndex);
+ COPY_NODE_FIELD(onConflictPlan);
COPY_NODE_FIELD(withCheckOptionLists);
COPY_NODE_FIELD(returningLists);
COPY_NODE_FIELD(fdwPrivLists);
@@ -2120,6 +2123,31 @@ _copyWithClause(const WithClause *from)
return newnode;
}
+static InferClause *
+_copyInferClause(const InferClause *from)
+{
+ InferClause *newnode = makeNode(InferClause);
+
+ COPY_NODE_FIELD(indexElems);
+ COPY_NODE_FIELD(whereClause);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
+static ConflictClause *
+_copyConflictClause(const ConflictClause *from)
+{
+ ConflictClause *newnode = makeNode(ConflictClause);
+
+ COPY_SCALAR_FIELD(specclause);
+ COPY_NODE_FIELD(infer);
+ COPY_NODE_FIELD(updatequery);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
static CommonTableExpr *
_copyCommonTableExpr(const CommonTableExpr *from)
{
@@ -2525,6 +2553,10 @@ _copyQuery(const Query *from)
COPY_NODE_FIELD(jointree);
COPY_NODE_FIELD(targetList);
COPY_NODE_FIELD(withCheckOptions);
+ COPY_SCALAR_FIELD(specClause);
+ COPY_NODE_FIELD(arbiterExpr);
+ COPY_NODE_FIELD(arbiterWhere);
+ COPY_NODE_FIELD(onConflict);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(groupClause);
COPY_NODE_FIELD(havingQual);
@@ -2548,6 +2580,7 @@ _copyInsertStmt(const InsertStmt *from)
COPY_NODE_FIELD(relation);
COPY_NODE_FIELD(cols);
COPY_NODE_FIELD(selectStmt);
+ COPY_NODE_FIELD(confClause);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(withClause);
@@ -4721,6 +4754,12 @@ copyObject(const void *from)
case T_WithClause:
retval = _copyWithClause(from);
break;
+ case T_InferClause:
+ retval = _copyInferClause(from);
+ break;
+ case T_ConflictClause:
+ retval = _copyConflictClause(from);
+ break;
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 79035b2..4127269 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -863,6 +863,10 @@ _equalQuery(const Query *a, const Query *b)
COMPARE_NODE_FIELD(jointree);
COMPARE_NODE_FIELD(targetList);
COMPARE_NODE_FIELD(withCheckOptions);
+ COMPARE_SCALAR_FIELD(specClause);
+ COMPARE_NODE_FIELD(arbiterExpr);
+ COMPARE_NODE_FIELD(arbiterWhere);
+ COMPARE_NODE_FIELD(onConflict);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(groupClause);
COMPARE_NODE_FIELD(havingQual);
@@ -884,6 +888,7 @@ _equalInsertStmt(const InsertStmt *a, const InsertStmt *b)
COMPARE_NODE_FIELD(relation);
COMPARE_NODE_FIELD(cols);
COMPARE_NODE_FIELD(selectStmt);
+ COMPARE_NODE_FIELD(confClause);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(withClause);
@@ -2426,6 +2431,27 @@ _equalWithClause(const WithClause *a, const WithClause *b)
}
static bool
+_equalInferClause(const InferClause *a, const InferClause *b)
+{
+ COMPARE_NODE_FIELD(indexElems);
+ COMPARE_NODE_FIELD(whereClause);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
+_equalConflictClause(const ConflictClause *a, const ConflictClause *b)
+{
+ COMPARE_SCALAR_FIELD(specclause);
+ COMPARE_NODE_FIELD(infer);
+ COMPARE_NODE_FIELD(updatequery);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
_equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
{
COMPARE_STRING_FIELD(ctename);
@@ -3148,6 +3174,12 @@ equal(const void *a, const void *b)
case T_WithClause:
retval = _equalWithClause(a, b);
break;
+ case T_InferClause:
+ retval = _equalInferClause(a, b);
+ break;
+ case T_ConflictClause:
+ retval = _equalConflictClause(a, b);
+ break;
case T_CommonTableExpr:
retval = _equalCommonTableExpr(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 21dfda7..4107cc9 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -1474,6 +1474,12 @@ exprLocation(const Node *expr)
case T_WithClause:
loc = ((const WithClause *) expr)->location;
break;
+ case T_InferClause:
+ loc = ((const InferClause *) expr)->location;
+ break;
+ case T_ConflictClause:
+ loc = ((const ConflictClause *) expr)->location;
+ break;
case T_CommonTableExpr:
loc = ((const CommonTableExpr *) expr)->location;
break;
@@ -1958,6 +1964,12 @@ query_tree_walker(Query *query,
return true;
if (walker((Node *) query->withCheckOptions, context))
return true;
+ if (walker((Node *) query->arbiterExpr, context))
+ return true;
+ if (walker(query->arbiterWhere, context))
+ return true;
+ if (walker(query->onConflict, context))
+ return true;
if (walker((Node *) query->returningList, context))
return true;
if (walker((Node *) query->jointree, context))
@@ -2699,6 +2711,9 @@ query_tree_mutator(Query *query,
MUTATE(query->targetList, query->targetList, List *);
MUTATE(query->withCheckOptions, query->withCheckOptions, List *);
+ MUTATE(query->arbiterExpr, query->arbiterExpr, List *);
+ MUTATE(query->arbiterWhere, query->arbiterWhere, Node *);
+ MUTATE(query->onConflict, query->onConflict, Node *);
MUTATE(query->returningList, query->returningList, List *);
MUTATE(query->jointree, query->jointree, FromExpr *);
MUTATE(query->setOperations, query->setOperations, Node *);
@@ -2968,6 +2983,8 @@ raw_expression_tree_walker(Node *node,
return true;
if (walker(stmt->selectStmt, context))
return true;
+ if (walker(stmt->confClause, context))
+ return true;
if (walker(stmt->returningList, context))
return true;
if (walker(stmt->withClause, context))
@@ -3207,6 +3224,27 @@ raw_expression_tree_walker(Node *node,
break;
case T_WithClause:
return walker(((WithClause *) node)->ctes, context);
+
+ case T_InferClause:
+ {
+ InferClause *stmt = (InferClause *) node;
+
+ if (walker(stmt->indexElems, context))
+ return true;
+ if (walker(stmt->whereClause, context))
+ return true;
+ }
+ break;
+ case T_ConflictClause:
+ {
+ ConflictClause *stmt = (ConflictClause *) node;
+
+ if (walker(stmt->infer, context))
+ return true;
+ if (walker(stmt->updatequery, context))
+ return true;
+ }
+ break;
case T_CommonTableExpr:
return walker(((CommonTableExpr *) node)->ctequery, context);
default:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b4a2667..a32fbaa 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -330,6 +330,9 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_INT_FIELD(resultRelIndex);
WRITE_NODE_FIELD(plans);
+ WRITE_ENUM_FIELD(spec, SpecType);
+ WRITE_OID_FIELD(arbiterIndex);
+ WRITE_NODE_FIELD(onConflictPlan);
WRITE_NODE_FIELD(withCheckOptionLists);
WRITE_NODE_FIELD(returningLists);
WRITE_NODE_FIELD(fdwPrivLists);
@@ -2301,6 +2304,10 @@ _outQuery(StringInfo str, const Query *node)
WRITE_NODE_FIELD(jointree);
WRITE_NODE_FIELD(targetList);
WRITE_NODE_FIELD(withCheckOptions);
+ WRITE_ENUM_FIELD(specClause, SpecType);
+ WRITE_NODE_FIELD(arbiterExpr);
+ WRITE_NODE_FIELD(arbiterWhere);
+ WRITE_NODE_FIELD(onConflict);
WRITE_NODE_FIELD(returningList);
WRITE_NODE_FIELD(groupClause);
WRITE_NODE_FIELD(havingQual);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index dbc162a..9f6570f 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -214,6 +214,10 @@ _readQuery(void)
READ_NODE_FIELD(jointree);
READ_NODE_FIELD(targetList);
READ_NODE_FIELD(withCheckOptions);
+ READ_ENUM_FIELD(specClause, SpecType);
+ READ_NODE_FIELD(arbiterExpr);
+ READ_NODE_FIELD(arbiterWhere);
+ READ_NODE_FIELD(onConflict);
READ_NODE_FIELD(returningList);
READ_NODE_FIELD(groupClause);
READ_NODE_FIELD(havingQual);
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index b86a3cd..fc4bb08 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4013,3 +4013,59 @@ string_to_const(const char *str, Oid datatype)
return makeConst(datatype, -1, collation, constlen,
conval, false, false);
}
+
+/*
+ * plan_speculative_use_index
+ * Use the planner to decide speculative insertion arbiter index
+ *
+ * rel is the target to undergo ON CONFLICT UPDATE/IGNORE. Decide which index
+ * to use. This should be called infrequently in practice, because it's unusual
+ * for more than one index to be available that can satisfy a user-specified
+ * unique index inference specification.
+ *
+ * Note: caller had better already hold some type of lock on the table.
+ */
+Oid
+plan_speculative_use_index(PlannerInfo *root, List *indexList)
+{
+ IndexOptInfo *indexInfo;
+ RelOptInfo *rel;
+ IndexPath *cheapest;
+ IndexPath *indexScanPath;
+ ListCell *lc;
+
+ /* Set up RTE/RelOptInfo arrays if needed */
+ if (!root->simple_rel_array)
+ setup_simple_rel_arrays(root);
+
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, root->parse->resultRelation, RELOPT_BASEREL);
+
+ /* Locate cheapest IndexOptInfo for the target index */
+ cheapest = NULL;
+
+ foreach(lc, rel->indexlist)
+ {
+ indexInfo = (IndexOptInfo *) lfirst(lc);
+
+ if (!list_member_oid(indexList, indexInfo->indexoid))
+ continue;
+
+ /* Estimate the cost of index scan */
+ indexScanPath = create_index_path(root, indexInfo,
+ NIL, NIL, NIL, NIL, NIL,
+ ForwardScanDirection, false,
+ NULL, 1.0);
+
+ if (!cheapest || compare_fractional_path_costs(&cheapest->path,
+ &indexScanPath->path,
+ DEFAULT_RANGE_INEQ_SEL) > 0)
+ cheapest = indexScanPath;
+
+ }
+
+ if (!cheapest)
+ return InvalidOid;
+ else
+ return cheapest->indexinfo->indexoid;
+}
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 1258961..263ff5f 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -255,13 +255,17 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
/*
* We don't support pushing join clauses into the quals of a tidscan, but
* it could still have required parameterization due to LATERAL refs in
- * its tlist.
+ * its tlist. To be tidy, we disallow TID scans as the unexecuted scan
+ * node of an ON CONFLICT UPDATE auxiliary query, even though there is no
+ * reason to think that would be harmful; the optimizer should always
+ * prefer a SeqScan or Result node (actually, we assert that it's one of
+ * those two in several places, so accepting TID scans would break those).
*/
required_outer = rel->lateral_relids;
tidquals = TidQualFromRestrictinfo(rel->baserestrictinfo, rel->relid);
- if (tidquals)
+ if (tidquals && root->parse->specClause != SPEC_UPDATE)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
required_outer));
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 655be81..e8eed55 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4811,7 +4811,8 @@ make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam)
+ List *rowMarks, Plan *onConflictPlan, SpecType spec,
+ int epqParam)
{
ModifyTable *node = makeNode(ModifyTable);
Plan *plan = &node->plan;
@@ -4860,6 +4861,9 @@ make_modifytable(PlannerInfo *root,
node->resultRelations = resultRelations;
node->resultRelIndex = -1; /* will be set correctly in setrefs.c */
node->plans = subplans;
+ node->spec = spec;
+ node->arbiterIndex = InvalidOid;
+ node->onConflictPlan = onConflictPlan;
node->withCheckOptionLists = withCheckOptionLists;
node->returningLists = returningLists;
node->rowMarks = rowMarks;
@@ -4912,6 +4916,16 @@ make_modifytable(PlannerInfo *root,
}
node->fdwPrivLists = fdw_private_list;
+ /*
+ * If a set of unique index inference expressions was provided (for
+ * INSERT...ON CONFLICT UPDATE/IGNORE), then infer appropriate
+ * unique index (or throw an error if none is available). It's
+ * possible that there will be a costing step in the event of
+ * having to choose between multiple alternatives.
+ */
+ if (root->parse->arbiterExpr)
+ node->arbiterIndex = infer_unique_index(root);
+
return node;
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9cbbcfb..4e154fb 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -612,7 +612,55 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
+
+ if (parse->onConflict)
+ {
+ Query *conflictQry = (Query *) parse->onConflict;
+ ModifyTable *parent = (ModifyTable *) plan;
+
+ /*
+ * An ON CONFLICT UPDATE query is a subquery of its parent
+ * INSERT ModifyTable, but isn't formally a subplan -- it's an
+ * "auxiliary" plan.
+ *
+ * During execution, the auxiliary plan state is used to
+ * execute the UPDATE query in an ad-hoc manner, driven by the
+ * parent. The executor will only ever execute the auxiliary
+ * plan through its parent. onConflictPlan is "auxiliary" to
+ * its parent in the sense that it's strictly encapsulated from
+ * other code (for example, the executor does not separately
+ * track it within estate as a plan that needs to have
+ * execution finished when it appears within a data-modifying
+ * CTE -- only the parent is specifically tracked in that
+ * manner).
+ *
+ * There is a fundamental nexus between parent and auxiliary
+ * plans that makes a fully unified representation seem
+ * compelling (a "CMD_UPSERT" ModifyTable plan and Query).
+ * That would obviate the need to specially track auxiliary
+ * state across all stages of execution just for this case; the
+ * optimizer would then not have to generate a fully-formed,
+ * independent UPDATE subquery plan (with a scanstate only
+ * useful for EvalPlanQual() re-evaluation). However, it's
+ * convenient to plan each ModifyTable separately, as doing so
+ * maximizes code reuse. The alternative must be to introduce
+ * abstractions that (for example) allow a single "CMD_UPSERT"
+ * ModifyTable to have two distinct types of targetlist (that
+ * will need to be processed differently during parsing and
+ * rewriting anyway). The "auxiliary" UPDATE plan is a good
+ * trade-off between a fully-fledged "CMD_UPSERT"
+ * representation, and the opposite extreme of tracking two
+ * separate ModifyTable nodes, joined by a contrived join type,
+ * with (for example) odd properties around tuple visibility
+ * not well encapsulated.
+ */
+ parent->onConflictPlan = subquery_planner(glob, conflictQry,
+ root, hasRecursion,
+ 0, NULL);
+ }
}
}
@@ -1056,6 +1104,8 @@ inheritance_planner(PlannerInfo *root)
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
}
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5d865b0..3368173 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -779,9 +779,28 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
* global list.
*/
splan->resultRelIndex = list_length(root->glob->resultRelations);
- root->glob->resultRelations =
- list_concat(root->glob->resultRelations,
- list_copy(splan->resultRelations));
+
+ if (!splan->onConflictPlan)
+ {
+ /*
+ * Only actually append result relation for non-auxiliary
+ * ModifyTable plans
+ */
+ root->glob->resultRelations =
+ list_concat(root->glob->resultRelations,
+ list_copy(splan->resultRelations));
+ }
+ else
+ {
+ splan->onConflictPlan = (Plan *) set_plan_refs(root,
+ (Plan *) splan->onConflictPlan,
+ rtoffset);
+ /*
+ * Set up the visible plan targetlist as being the same as
+ * the parent. Again, this is for the use of EXPLAIN only.
+ */
+ splan->onConflictPlan->targetlist = splan->plan.targetlist;
+ }
}
break;
case T_Append:
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 78fb6b1..f7a0523 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2345,6 +2345,12 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
valid_params,
scan_params));
}
+
+ /*
+ * No need to directly handle onConflictPlan here, since it
+ * cannot have params (due to parse analysis enforced
+ * restrictions prohibiting subqueries).
+ */
}
break;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index fb7db6d..3086ca3 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -31,6 +31,7 @@
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
+#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
@@ -125,10 +126,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/*
* Make list of indexes. Ignore indexes on system catalogs if told to.
- * Don't bother with indexes for an inheritance parent, either.
+ * Don't bother with indexes for an inheritance parent or speculative
+ * insertion UPDATE auxiliary queries, either.
*/
if (inhparent ||
- (IgnoreSystemIndexes && IsSystemRelation(relation)))
+ (IgnoreSystemIndexes && IsSystemRelation(relation)) ||
+ root->parse->specClause == SPEC_UPDATE)
hasindex = false;
else
hasindex = relation->rd_rel->relhasindex;
@@ -394,6 +397,221 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
}
/*
+ * infer_unique_index -
+ * Retrieves unique index to arbitrate speculative insertion.
+ *
+ * Uses user-supplied inference clause expressions and predicate to match a
+ * unique index from those defined and ready on the heap relation (target). An
+ * exact match is required on columns/expressions (although they can appear in
+ * any order). However, the predicate given by the user need only restrict
+ * insertion to a subset of some part of the table covered by some particular
+ * unique index (in particular, a partial unique index) in order to be
+ * inferred.
+ *
+ * The implementation does not consider which B-Tree operator class any
+ * particular available unique index uses. In particular, there is no system
+ * dependency on the default operator class for the purposes of inference.
+ * This should be okay, since by convention non-default opclasses only
+ * introduce alternative sort orders, not alternative notions of equality
+ * (there are only trivial known exceptions to this convention, where "equals"
+ * operator of a type's opclasses do not match across opclasses, exceptions
+ * that exist precisely to discourage user code from using the divergent
+ * opclass). Even if we assume that a type could usefully have multiple
+ * alternative concepts of equality, surely the definition actually implied by
+ * the operator class of actually indexed attributes is pertinent. However,
+ * this is a bit of a wart, because strictly speaking there is leeway for a
+ * query to be interpreted in deference to available unique indexes, and
+ * indexes are traditionally only an implementation detail. It hardly seems
+ * worth it to waste cycles on this corner case, though.
+ *
+ * This logic somewhat mirrors get_relation_info(). This process is not
+ * deferred to a get_relation_info() call while planning because there may not
+ * be any such call. In the ON CONFLICT UPDATE case get_relation_info() will
+ * be called, for auxiliary query planning, but even then indexes won't be
+ * examined since they're not generally interesting to that case (building
+ * index paths is explicitly avoided for auxiliary query planning, in fact).
+ */
+Oid
+infer_unique_index(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+ Relation relation;
+ Oid relationObjectId;
+ Bitmapset *plainAttrs = NULL;
+ List *candidates = NIL;
+ ListCell *l;
+ List *indexList;
+
+ Assert(parse->specClause == SPEC_INSERT ||
+ parse->specClause == SPEC_IGNORE);
+
+ /*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+ relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+ relation = heap_open(relationObjectId, NoLock);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(l, parse->arbiterExpr)
+ {
+ Expr *elem;
+ Var *var;
+ int attno;
+
+ elem = (Expr *) lfirst(l);
+
+ /*
+ * Parse analysis of inference elements performs full parse analysis of
+ * Vars, even for non-expression indexes (in contrast with utility
+ * command related use of IndexElem). However, indexes are cataloged
+ * with simple attribute numbers for non-expression indexes.
+ * Therefore, we must build a compatible bms representation here.
+ */
+ if (!IsA(elem, Var))
+ continue;
+
+ var = (Var *) elem;
+ attno = var->varattno;
+
+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("system columns may not appear in unique index inference specification")));
+ else if (attno == 0)
+ elog(ERROR, "whole row unique index inference specifications are not valid");
+
+ plainAttrs = bms_add_member(plainAttrs, attno);
+ }
+
+ indexList = RelationGetIndexList(relation);
+
+ /*
+ * Using that representation, iterate through the list of indexes on the
+ * target relation to try and find a match
+ */
+ foreach(l, indexList)
+ {
+ Oid indexoid = lfirst_oid(l);
+ Relation idxRel;
+ Form_pg_index idxForm;
+ Bitmapset *indexedPlainAttrs = NULL;
+ List *idxExprs;
+ List *predExprs;
+ List *whereExplicit;
+ AttrNumber natt;
+ ListCell *e;
+
+ /*
+ * Extract info from the relation descriptor for the index. We know
+ * that this is a target, so take the lock type that the executor will
+ * ultimately require.
+ *
+ * Let executor complain about !indimmediate case directly.
+ */
+ idxRel = index_open(indexoid, RowExclusiveLock);
+ idxForm = idxRel->rd_index;
+
+ if (!idxForm->indisunique ||
+ !IndexIsValid(idxForm))
+ goto next;
+
+ /*
+ * If the index is valid, but cannot yet be used, ignore it. See
+ * src/backend/access/heap/README.HOT for discussion.
+ */
+ if (idxForm->indcheckxmin &&
+ !TransactionIdPrecedes(HeapTupleHeaderGetXmin(idxRel->rd_indextuple->t_data),
+ TransactionXmin))
+ goto next;
+
+ /* Check in detail if the clause attributes/expressions match */
+ for (natt = 0; natt < idxForm->indnatts; natt++)
+ {
+ int attno = idxRel->rd_index->indkey.values[natt];
+
+ if (attno < 0)
+ elog(ERROR, "system column in index");
+
+ if (attno != 0)
+ indexedPlainAttrs = bms_add_member(indexedPlainAttrs, attno);
+ }
+
+ /*
+ * Since expressions were made unique during parse analysis, it's
+ * evident that we cannot proceed with this index if the number of
+ * attributes (plain or expression) does not match exactly. This
+ * precludes support for unique indexes created with redundantly
+ * referenced columns (which are not forbidden by CREATE INDEX), but
+ * this seems inconsequential.
+ */
+ if (list_length(parse->arbiterExpr) != idxForm->indnatts)
+ goto next;
+
+ idxExprs = RelationGetIndexExpressions(idxRel);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(e, parse->arbiterExpr)
+ {
+ Expr *elem = (Expr *) lfirst(e);
+
+ /* Plain Vars were already separately accounted for */
+ if (IsA(elem, Var))
+ continue;
+
+ if (!list_member(idxExprs, elem))
+ goto next;
+ }
+
+ /* Non-expression attributes (if any) must match */
+ if (!bms_equal(indexedPlainAttrs, plainAttrs))
+ goto next;
+
+ /*
+ * Any user-supplied ON CONFLICT unique index inference WHERE clause
+ * need only be implied by the cataloged index definition's predicate
+ */
+ predExprs = RelationGetIndexPredicate(idxRel);
+ whereExplicit = make_ands_implicit((Expr *) parse->arbiterWhere);
+
+ if (!predicate_implied_by(predExprs, whereExplicit))
+ goto next;
+
+ candidates = lappend_oid(candidates, idxForm->indexrelid);
+next:
+ index_close(idxRel, NoLock);
+ }
+
+ list_free(indexList);
+ heap_close(relation, NoLock);
+
+ /*
+ * In the common case where there is only a single candidate unique index,
+ * there is clearly no point in building index paths to determine which is
+ * cheapest.
+ */
+ if (list_length(candidates) == 1)
+ return linitial_oid(candidates);
+ else if (candidates == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT")));
+ else
+ /* Otherwise, deduce the least expensive unique index */
+ return plan_speculative_use_index(root, candidates);
+
+ return InvalidOid; /* keep compiler quiet */
+}
+
+/*
* estimate_rel_size - estimate # pages and # tuples in a table or index
*
* We also estimate the fraction of the pages that are marked all-visible in
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index df89065..caaa44c 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -387,6 +387,8 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -408,6 +410,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
{
Query *qry = makeNode(Query);
SelectStmt *selectStmt = (SelectStmt *) stmt->selectStmt;
+ SpecType spec = stmt->confClause ? stmt->confClause->specclause : SPEC_NONE;
List *exprList = NIL;
bool isGeneralSelect;
List *sub_rtable;
@@ -425,6 +428,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
qry->commandType = CMD_INSERT;
pstate->p_is_insert = true;
+ pstate->p_is_speculative = spec != SPEC_NONE;
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -478,8 +482,9 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
* mentioned in the SELECT part. Note that the target table is not added
* to the joinlist or namespace.
*/
- qry->resultRelation = setTargetTable(pstate, stmt->relation,
- false, false, ACL_INSERT);
+ qry->resultRelation = setTargetTable(pstate, stmt->relation, false, false,
+ ACL_INSERT |
+ (spec == SPEC_INSERT ? ACL_UPDATE : 0));
/* Validate stmt->cols list, or build default list if no list given */
icolumns = checkInsertTargets(pstate, stmt->cols, &attrnos);
@@ -741,12 +746,13 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
}
/*
- * If we have a RETURNING clause, we need to add the target relation to
- * the query namespace before processing it, so that Var references in
- * RETURNING will work. Also, remove any namespace entries added in a
- * sub-SELECT or VALUES list.
+ * If we have a RETURNING clause, or there are attributes used as the
+ * condition on which to take an alternative ON CONFLICT path, we need to
+ * add the target relation to the query namespace before processing it, so
+ * that Var references in RETURNING/the alternative path key will work.
+ * Also, remove any namespace entries added in a sub-SELECT or VALUES list.
*/
- if (stmt->returningList)
+ if (stmt->returningList || stmt->confClause)
{
pstate->p_namespace = NIL;
addRTEtoQuery(pstate, pstate->p_target_rangetblentry,
@@ -758,8 +764,66 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
-
+ qry->specClause = spec;
qry->hasSubLinks = pstate->p_hasSubLinks;
+ qry->onConflict = NULL;
+
+ if (stmt->confClause)
+ {
+ /*
+ * ON CONFLICT UPDATE requires special parse analysis of auxiliary
+ * update Query
+ */
+ if (stmt->confClause->updatequery)
+ {
+ UpdateStmt *pupd;
+ Query *dqry;
+ ParseState *sub_pstate = make_parsestate(pstate);
+ RangeTblEntry *subTarget;
+
+ pupd = (UpdateStmt *) stmt->confClause->updatequery;
+
+ if (!IsA(pupd, UpdateStmt))
+ elog(ERROR, "unrecognized statement in ON CONFLICT clause");
+
+ /* Assign same target relation as parent InsertStmt */
+ pupd->relation = stmt->relation;
+
+ /*
+ * The optimizer is not prepared to accept a subquery RTE for a
+ * non-CMD_SELECT Query. The CMD_UPDATE Query is tracked as
+ * special auxiliary state, while there is more or less analogous
+ * auxiliary state tracked in later stages of query execution.
+ */
+ dqry = transformStmt(sub_pstate, (Node *) pupd);
+ dqry->specClause = SPEC_UPDATE;
+ dqry->canSetTag = false;
+
+ /* Save auxiliary query */
+ qry->onConflict = (Node *) dqry;
+
+ /*
+ * Mark parent Query as requiring appropriate UPDATE/SELECT
+ * privileges
+ */
+ subTarget = sub_pstate->p_target_rangetblentry;
+
+ rte->updatedCols = bms_copy(subTarget->updatedCols);
+ rte->selectedCols = bms_union(rte->selectedCols,
+ subTarget->selectedCols);
+
+ free_parsestate(sub_pstate);
+ }
+
+ /*
+ * Transform the columns/expressions of the inference specification.
+ * These are later used to infer a unique index which arbitrates whether
+ * or not to take the alternative ON CONFLICT path (i.e. whether to INSERT
+ * or UPDATE/IGNORE in respect of each slot proposed for insertion).
+ */
+ transformConflictClause(pstate, stmt->confClause, &qry->arbiterExpr,
+ &qry->arbiterWhere);
+ }
assign_query_collations(pstate, qry);
@@ -1006,6 +1070,8 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -1906,6 +1972,10 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->commandType = CMD_UPDATE;
pstate->p_is_update = true;
+ pstate->p_is_speculative = (pstate->parentParseState &&
+ (!pstate->p_parent_cte &&
+ pstate->parentParseState->p_is_insert &&
+ pstate->parentParseState->p_is_speculative));
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -1915,6 +1985,18 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
}
+ /*
+ * Having established that this is a speculative insertion's auxiliary
+ * update, do not allow the query to access parent parse state. This is a
+ * bit of a kludge, but is the most direct way of making parent RTEs
+ * invisible. If we failed to take this measure, the parent's spuriously
+ * visible target could be illegally referenced within the auxiliary query
+ * were it to use the original target table name (rather than the standard
+ * TARGET.* alias).
+ */
+ if (pstate->p_is_speculative)
+ pstate->parentParseState = NULL;
+
qry->resultRelation = setTargetTable(pstate, stmt->relation,
interpretInhOption(stmt->relation->inhOpt),
true,
@@ -1947,6 +2029,8 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 36dac29..bb36975 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -215,6 +215,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RangeVar *range;
IntoClause *into;
WithClause *with;
+ InferClause *infer;
+ ConflictClause *conf;
A_Indices *aind;
ResTarget *target;
struct PrivTarget *privtarget;
@@ -415,6 +417,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <defelt> SeqOptElem
%type <istmt> insert_rest
+%type <infer> opt_conf_expr
+%type <conf> opt_on_conflict
%type <vsetstmt> generic_set set_rest set_rest_more generic_reset reset_rest
SetResetClause FunctionSetResetClause
@@ -513,6 +517,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> cte_list
%type <list> within_group_clause
+%type <node> UpdateInsertStmt
%type <node> filter_clause
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
@@ -551,8 +556,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
CACHE CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLLATION COLUMN COMMENT COMMENTS COMMIT
- COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
- CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
+ COMMITTED CONCURRENTLY CONFIGURATION CONFLICT CONNECTION CONSTRAINT
+ CONSTRAINTS CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
CROSS CSV CURRENT_P
CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
@@ -572,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+ IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -652,6 +657,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
+%nonassoc DISTINCT
+%nonassoc ON
%left POSTFIXOP /* dummy for postfix Op rules */
/*
* To support target_el without AS, we must give IDENT an explicit priority
@@ -9399,10 +9406,12 @@ DeallocateStmt: DEALLOCATE name
*****************************************************************************/
InsertStmt:
- opt_with_clause INSERT INTO qualified_name insert_rest returning_clause
+ opt_with_clause INSERT INTO qualified_name insert_rest
+ opt_on_conflict returning_clause
{
$5->relation = $4;
- $5->returningList = $6;
+ $5->confClause = $6;
+ $5->returningList = $7;
$5->withClause = $1;
$$ = (Node *) $5;
}
@@ -9447,6 +9456,44 @@ insert_column_item:
}
;
+opt_on_conflict:
+ ON CONFLICT opt_conf_expr UpdateInsertStmt
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_INSERT;
+ $$->infer = $3;
+ $$->updatequery = $4;
+ $$->location = @1;
+ }
+ |
+ ON CONFLICT opt_conf_expr IGNORE_P
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_IGNORE;
+ $$->infer = $3;
+ $$->updatequery = NULL;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
+opt_conf_expr:
+ '(' index_params where_clause ')'
+ {
+ $$ = makeNode(InferClause);
+ $$->indexElems = $2;
+ $$->whereClause = $3;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
returning_clause:
RETURNING target_list { $$ = $2; }
| /* EMPTY */ { $$ = NIL; }
@@ -9546,6 +9593,21 @@ UpdateStmt: opt_with_clause UPDATE relation_expr_opt_alias
}
;
+UpdateInsertStmt: UPDATE
+ SET set_clause_list
+ where_clause
+ {
+ UpdateStmt *n = makeNode(UpdateStmt);
+ n->relation = NULL;
+ n->targetList = $3;
+ n->fromClause = NULL;
+ n->whereClause = $4;
+ n->returningList = NULL;
+ n->withClause = NULL;
+ $$ = (Node *)n;
+ }
+ ;
+
set_clause_list:
set_clause { $$ = $1; }
| set_clause_list ',' set_clause { $$ = list_concat($1,$3); }
@@ -13188,6 +13250,7 @@ unreserved_keyword:
| COMMIT
| COMMITTED
| CONFIGURATION
+ | CONFLICT
| CONNECTION
| CONSTRAINTS
| CONTENT_P
@@ -13247,6 +13310,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE_P
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index 7b0e668..82ac526 100644
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -342,6 +342,10 @@ transformAggregateCall(ParseState *pstate, Aggref *agg,
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("aggregate functions are not allowed in ON CONFLICT UPDATE");
+
if (err)
ereport(ERROR,
(errcode(ERRCODE_GROUPING_ERROR),
@@ -671,6 +675,9 @@ transformWindowFuncCall(ParseState *pstate, WindowFunc *wfunc,
* which is sane anyway.
*/
}
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("window functions are not allowed in ON CONFLICT UPDATE");
+
if (err)
ereport(ERROR,
(errcode(ERRCODE_WINDOWING_ERROR),
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 654dce6..6487559 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -75,6 +75,8 @@ static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node,
List **tlist, ParseExprKind exprKind);
static int get_matching_location(int sortgroupref,
List *sortgrouprefs, List *exprs);
+static List *resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel);
static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle,
List *grouplist, List *targetlist, int location,
bool resolveUnknown);
@@ -2166,6 +2168,167 @@ get_matching_location(int sortgroupref, List *sortgrouprefs, List *exprs)
}
/*
+ * resolve_unique_index_expr
+ * Infer a unique index from a list of indexElems, for ON
+ * CONFLICT UPDATE/IGNORE
+ *
+ * Perform parse analysis of expressions and columns appearing within ON
+ * CONFLICT clause. During planning, the returned list of expressions is used
+ * to infer which unique index to use.
+ */
+static List *
+resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel)
+{
+ List *clauseexprs = NIL;
+ ListCell *l;
+
+ if (heapRel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" is not an ordinary table",
+ RelationGetRelationName(heapRel)),
+ errhint("Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ if (heapRel->rd_rel->relhassubclass)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" has inheritance children",
+ RelationGetRelationName(heapRel)),
+ errhint("Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ foreach(l, infer->indexElems)
+ {
+ IndexElem *ielem = (IndexElem *) lfirst(l);
+ Node *trans;
+
+ /*
+ * Raw grammar re-uses CREATE INDEX infrastructure for unique index
+ * inference clause, and so will accept opclasses by name and so on.
+ * Reject these here explicitly.
+ */
+ if (ielem->ordering != SORTBY_DEFAULT ||
+ ielem->nulls_ordering != SORTBY_NULLS_DEFAULT)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT does not accept ordering or NULLS FIRST/LAST specifications"),
+ errhint("These factors do not affect uniqueness of indexed datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->collation != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT collation specification is unnecessary"),
+ errhint("Collations do not affect uniqueness of collatable datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->opclass != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ON CONFLICT cannot accept non-default operator class specifications"),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (!ielem->expr)
+ {
+ /* Simple index attribute */
+ ColumnRef *n;
+
+ /*
+ * Grammar won't have built raw expression for us in event of plain
+ * column reference. Create one directly, and perform expression
+ * transformation, which seems better principled than simply
+ * propagating catalog-style simple attribute numbers. For
+ * example, it means the Var is marked for SELECT privileges, which
+ * speculative insertion requires. Planner expects this, and
+ * performs its own normalization for the purposes of matching
+ * against pg_index.
+ */
+ n = makeNode(ColumnRef);
+ n->fields = list_make1(makeString(ielem->name));
+ /* Location is approximately that of inference specification */
+ n->location = infer->location;
+ trans = (Node *) n;
+ }
+ else
+ {
+ /* Do parse transformation of the raw expression */
+ trans = (Node *) ielem->expr;
+ }
+
+ /*
+ * transformExpr() should have already rejected subqueries,
+ * aggregates, and window functions, based on the EXPR_KIND_ for an
+ * index expression. Expressions returning sets won't have been
+ * rejected, but don't bother doing so here; there should be no
+ * available expression unique index to match any such expression
+ * against anyway.
+ */
+ trans = transformExpr(pstate, trans, EXPR_KIND_INDEX_EXPRESSION);
+ /* Save in list of transformed expressions */
+ clauseexprs = list_append_unique(clauseexprs, trans);
+ }
+
+ return clauseexprs;
+}
+
+/*
+ * transformConflictClause -
+ * transform expressions of ON CONFLICT UPDATE/IGNORE.
+ *
+ * Transformed expressions are used to infer one unique index relation to serve
+ * as an ON CONFLICT arbiter. A partial unique index may be inferred using the
+ * WHERE clause of the inference specification.
+ */
+void
+transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere)
+{
+ InferClause *infer = confClause->infer;
+
+ if (confClause->specclause == SPEC_INSERT && !infer)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from"),
+ parser_errposition(pstate,
+ exprLocation((Node *) confClause))));
+
+ /* Raw grammar must ensure this invariant holds */
+ Assert(confClause->specclause != SPEC_INSERT ||
+ confClause->updatequery != NULL);
+
+ /*
+ * If there is no inference clause, the target might be an updatable view,
+ * which is supported by ON CONFLICT IGNORE (without columns/expressions
+ * specified to infer a unique index from -- these are mandatory for the
+ * UPDATE variant). It might also be a relation with inheritance children,
+ * which would also make inference fail.
+ */
+ if (infer)
+ {
+ *arbiterExpr = resolve_unique_index_expr(pstate, infer,
+ pstate->p_target_relation);
+
+ /* Handle inference WHERE clause (for partial unique index inference) */
+ if (infer->whereClause)
+ *arbiterWhere = transformExpr(pstate, infer->whereClause,
+ EXPR_KIND_INDEX_PREDICATE);
+ }
+
+ /*
+ * It's convenient to form a list of expressions based on the
+ * representation used by CREATE INDEX, since the same restrictions are
+ * appropriate (on subqueries and so on). However, from here on, the
+ * handling of those expressions is identical to ordinary optimizable
+ * statements. In particular, assign_query_collations() can be trusted to
+ * do the right thing with the post parse analysis query tree inference
+ * clause representation.
+ */
+}
+
+/*
* addTargetToSortList
* If the given targetlist entry isn't already in the SortGroupClause
* list, add it to the end of the list, using the given sort ordering
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f0f0488..70bf80f 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1564,6 +1564,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update && !err)
+ err = _("cannot use subquery in ON CONFLICT UPDATE");
if (err)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index fab2948..a076625 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -66,7 +66,7 @@ static void markQueryForLocking(Query *qry, Node *jtnode,
LockClauseStrength strength, LockWaitPolicy waitPolicy,
bool pushedDown);
static List *matchLocks(CmdType event, RuleLock *rulelocks,
- int varno, Query *parsetree);
+ int varno, Query *parsetree, bool *hasUpdate);
static Query *fireRIRrules(Query *parsetree, List *activeRIRs,
bool forUpdatePushedDown);
static bool view_has_instead_trigger(Relation view, CmdType event);
@@ -1288,7 +1288,8 @@ static List *
matchLocks(CmdType event,
RuleLock *rulelocks,
int varno,
- Query *parsetree)
+ Query *parsetree,
+ bool *hasUpdate)
{
List *matching_locks = NIL;
int nlocks;
@@ -1309,6 +1310,9 @@ matchLocks(CmdType event,
{
RewriteRule *oneLock = rulelocks->rules[i];
+ if (oneLock->event == CMD_UPDATE)
+ *hasUpdate = true;
+
/*
* Suppress ON INSERT/UPDATE/DELETE rules that are disabled or
* configured to not fire during the current sessions replication
@@ -2961,6 +2965,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
CmdType event = parsetree->commandType;
bool instead = false;
bool returning = false;
+ bool updatableview = false;
Query *qual_product = NULL;
List *rewritten = NIL;
ListCell *lc1;
@@ -3043,6 +3048,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
Relation rt_entry_relation;
List *locks;
List *product_queries;
+ bool hasUpdate = false;
result_relation = parsetree->resultRelation;
Assert(result_relation != 0);
@@ -3094,6 +3100,19 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
/* Process just the main targetlist */
rewriteTargetListIU(parsetree, rt_entry_relation, NULL);
}
+
+ if (parsetree->specClause == SPEC_INSERT)
+ {
+ Query *qry;
+
+ /*
+ * While user-defined rules will never be applied in the
+ * auxiliary update query, normalization of tlist is still
+ * required
+ */
+ qry = (Query *) parsetree->onConflict;
+ rewriteTargetListIU(qry, rt_entry_relation, NULL);
+ }
}
else if (event == CMD_UPDATE)
{
@@ -3111,7 +3130,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
* Collect and apply the appropriate rules.
*/
locks = matchLocks(event, rt_entry_relation->rd_rules,
- result_relation, parsetree);
+ result_relation, parsetree, &hasUpdate);
product_queries = fireRules(parsetree,
result_relation,
@@ -3160,6 +3179,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
instead = true;
returning = true;
+ updatableview = true;
}
/*
@@ -3240,6 +3260,18 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
}
}
+ /*
+ * Updatable views are supported on a limited basis by ON CONFLICT
+ * IGNORE (if there is no unique index inference required, speculative
+ * insertion proceeds).
+ */
+ if (parsetree->specClause != SPEC_NONE &&
+ (product_queries != NIL || hasUpdate) &&
+ !updatableview)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules")));
+
heap_close(rt_entry_relation, NoLock);
}
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index a1ebc72..f321df1 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -421,6 +421,13 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
latestXid))
ShmemVariableCache->latestCompletedXid = latestXid;
+ /* Also clear any speculative insertion information */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+
LWLockRelease(ProcArrayLock);
}
else
@@ -438,6 +445,11 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
pgxact->delayChkpt = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
Assert(pgxact->nxids == 0);
Assert(pgxact->overflowed == false);
@@ -476,6 +488,13 @@ ProcArrayClearTransaction(PGPROC *proc)
/* Clear the subtransaction-XID cache too */
pgxact->nxids = 0;
pgxact->overflowed = false;
+
+ /* these should be clear, but just in case.. */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
}
/*
@@ -1108,6 +1127,83 @@ TransactionIdIsActive(TransactionId xid)
return result;
}
+void
+SetSpeculativeInsertionToken(uint32 token)
+{
+ MyProc->specInsertToken = token;
+}
+
+void
+SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel = relnode;
+ ItemPointerCopy(tid, &MyProc->specInsertTid);
+ LWLockRelease(ProcArrayLock);
+}
+
+void
+ClearSpeculativeInsertionState(void)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * Returns the speculative insertion token that callers can use to wait for
+ * the insertion to finish, or 0 if no such insertion is in progress
+ */
+uint32
+SpeculativeInsertionIsInProgress(TransactionId xid, RelFileNode rel,
+ ItemPointer tid)
+{
+ uint32 result = 0;
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+
+ if (TransactionIdPrecedes(xid, RecentXmin))
+ return 0;
+
+ /*
+ * Get the top transaction id.
+ *
+ * XXX We could search the proc array first, like
+ * TransactionIdIsInProgress() does, but this isn't performance-critical.
+ */
+ xid = SubTransGetTopmostTransaction(xid);
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+
+ if (pgxact->xid == xid)
+ {
+ /*
+ * Found the backend. Is it doing a speculative insertion of the
+ * given tuple?
+ */
+ if (RelFileNodeEquals(proc->specInsertRel, rel) &&
+ ItemPointerEquals(tid, &proc->specInsertTid))
+ result = proc->specInsertToken;
+
+ break;
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+
+ return result;
+}
+
/*
* GetOldestXmin -- returns oldest transaction that was running
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index d13a167..7a1df22 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -575,6 +575,69 @@ ConditionalXactLockTableWait(TransactionId xid)
return true;
}
+static uint32 speculativeInsertionToken = 0;
+
+/*
+ * SpeculativeInsertionLockAcquire
+ *
+ * Insert a lock showing that the given transaction ID is inserting a tuple,
+ * but hasn't yet decided whether it's going to keep it. The lock can then be
+ * used to wait for the decision to go ahead with the insertion, or to abort
+ * it.
+ *
+ * The token is used to distinguish multiple insertions by the same
+ * transaction. A counter will do, for example.
+ */
+void
+SpeculativeInsertionLockAcquire(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ speculativeInsertionToken++;
+ SetSpeculativeInsertionToken(speculativeInsertionToken);
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ (void) LockAcquire(&tag, ExclusiveLock, false, false);
+}
+
+/*
+ * SpeculativeInsertionLockRelease
+ *
+ * Delete the lock showing that the given transaction is speculatively
+ * inserting a tuple.
+ */
+void
+SpeculativeInsertionLockRelease(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ LockRelease(&tag, ExclusiveLock, false);
+}
+
+/*
+ * SpeculativeInsertionWait
+ *
+ * Wait for the specified transaction to finish or abort the insertion of a
+ * tuple.
+ */
+void
+SpeculativeInsertionWait(TransactionId xid, uint32 token)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, token);
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(token != 0);
+
+ (void) LockAcquire(&tag, ShareLock, false, false);
+ LockRelease(&tag, ShareLock, false);
+}
+
+
/*
* XactLockTableWaitErrorContextCb
* Error context callback for transaction lock waits.
@@ -873,6 +936,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
tag->locktag_field1,
tag->locktag_field2);
break;
+ case LOCKTAG_PROMISE_TUPLE_INSERTION:
+ appendStringInfo(buf,
+ _("tuple insertion by transaction %u"),
+ tag->locktag_field1);
+ break;
case LOCKTAG_OBJECT:
appendStringInfo(buf,
_("object %u of class %u of database %u"),
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index a1967b69..95d62cb 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -28,6 +28,7 @@ static const char *const LockTagTypeNames[] = {
"tuple",
"transactionid",
"virtualxid",
+ "inserter transactionid",
"object",
"userlock",
"advisory"
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 777f55c..f16e6af 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -726,6 +726,17 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Assert(htup->t_tableOid != InvalidOid);
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
+ snapshot->speculativeToken = 0;
+
+ /*
+ * Never return "super-deleted" tuples
+ *
+ * XXX: Comment this code out and you'll get conflicts within
+ * ExecLockUpdateTuple(), which result in an infinite loop.
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -807,6 +818,26 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
{
+ RelFileNode rnode;
+ ForkNumber forkno;
+ BlockNumber blockno;
+
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+
+ /* tuples can only be in the main fork */
+ Assert(forkno == MAIN_FORKNUM);
+ Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
+
+ /*
+ * Set speculative token. Caller can worry about xmax, since it
+ * requires a conclusively locked row version, and a concurrent
+ * update to this tuple is a conflict for its purposes.
+ */
+ snapshot->speculativeToken =
+ SpeculativeInsertionIsInProgress(HeapTupleHeaderGetRawXmin(tuple),
+ rnode,
+ &htup->t_self);
+
snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
/* XXX shouldn't we fall through to look at xmax? */
return true; /* in insertion by other */
@@ -922,6 +953,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1126,6 +1164,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
+ * Immediately VACUUM "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return HEAPTUPLE_DEAD;
+
+ /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 939d93d..62e760a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -28,6 +28,7 @@
#define HEAP_INSERT_SKIP_WAL 0x0001
#define HEAP_INSERT_SKIP_FSM 0x0002
#define HEAP_INSERT_FROZEN 0x0004
+#define HEAP_INSERT_SPECULATIVE 0x0008
typedef struct BulkInsertStateData *BulkInsertState;
@@ -141,7 +142,7 @@ extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
CommandId cid, int options, BulkInsertState bistate);
extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd);
+ HeapUpdateFailureData *hufd, bool killspeculative);
extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index a2ed2a0..ae21789 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -73,6 +73,8 @@
#define XLOG_HEAP_SUFFIX_FROM_OLD (1<<6)
/* last xl_heap_multi_insert record for one heap_multi_insert() call */
#define XLOG_HEAP_LAST_MULTI_INSERT (1<<7)
+/* XXX: Make sure that re-use of bits is safe here */
+#define XLOG_HEAP_KILLED_SPECULATIVE_TUPLE (XLOG_HEAP_LAST_MULTI_INSERT | XLOG_HEAP_PREFIX_FROM_OLD)
/* convenience macro for checking whether any form of old tuple was logged */
#define XLOG_HEAP_CONTAINS_OLD \
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..9400801 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -354,14 +354,19 @@ extern void ExecCloseScanRelation(Relation scanrel);
extern void ExecOpenIndices(ResultRelInfo *resultRelInfo);
extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
+extern List *ExecLockIndexValues(TupleTableSlot *slot, EState *estate,
+ SpecType specReason);
extern List *ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
- EState *estate);
-extern bool check_exclusion_constraint(Relation heap, Relation index,
- IndexInfo *indexInfo,
- ItemPointer tupleid,
- Datum *values, bool *isnull,
- EState *estate,
- bool newIndex, bool errorOK);
+ EState *estate, bool noDupErr, Oid arbiterIdx);
+extern bool ExecCheckIndexConstraints(TupleTableSlot *slot, EState *estate,
+ ItemPointer conflictTid, Oid arbiterIdx);
+extern bool check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo,
+ ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate,
+ bool newIndex, bool errorOK,
+ bool wait, ItemPointer conflictTid);
extern void RegisterExprContextCallback(ExprContext *econtext,
ExprContextCallbackFunction function,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41288ed..19b5e29 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -41,6 +41,9 @@
* ExclusionOps Per-column exclusion operators, or NULL if none
* ExclusionProcs Underlying function OIDs for ExclusionOps
* ExclusionStrats Opclass strategy numbers for ExclusionOps
+ * UniqueOps These are like Exclusion*, but for unique indexes
+ * UniqueProcs
+ * UniqueStrats
* Unique is it a unique index?
* ReadyForInserts is it valid for inserts?
* Concurrent are we doing a concurrent index build?
@@ -62,6 +65,9 @@ typedef struct IndexInfo
Oid *ii_ExclusionOps; /* array with one entry per column */
Oid *ii_ExclusionProcs; /* array with one entry per column */
uint16 *ii_ExclusionStrats; /* array with one entry per column */
+ Oid *ii_UniqueOps; /* array with one entry per column */
+ Oid *ii_UniqueProcs; /* array with one entry per column */
+ uint16 *ii_UniqueStrats; /* array with one entry per column */
bool ii_Unique;
bool ii_ReadyForInserts;
bool ii_Concurrent;
@@ -1088,6 +1094,9 @@ typedef struct ModifyTableState
int mt_whichplan; /* which one is being executed (0..n-1) */
ResultRelInfo *resultRelInfo; /* per-subplan target relations */
List **mt_arowmarks; /* per-subplan ExecAuxRowMark lists */
+ SpecType spec; /* reason for speculative insertion */
+ Oid arbiterIndex; /* unique index to arbitrate taking alt path */
+ PlanState *onConflict; /* associated OnConflict state */
EPQState mt_epqstate; /* for evaluating EvalPlanQual rechecks */
bool fireBSTriggers; /* do we need to fire stmt triggers? */
} ModifyTableState;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 97ef0fc..cac6b15 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -412,6 +412,8 @@ typedef enum NodeTag
T_RowMarkClause,
T_XmlSerialize,
T_WithClause,
+ T_InferClause,
+ T_ConflictClause,
T_CommonTableExpr,
/*
@@ -624,4 +626,16 @@ typedef enum JoinType
(1 << JOIN_RIGHT) | \
(1 << JOIN_ANTI))) != 0)
+/* SpecType - "Speculative insertion" clause
+ *
+ * This also appears across various subsystems
+ */
+typedef enum
+{
+ SPEC_NONE, /* Not involved in speculative insertion */
+ SPEC_IGNORE, /* INSERT of "ON CONFLICT IGNORE" */
+ SPEC_INSERT, /* INSERT of "ON CONFLICT UPDATE" */
+ SPEC_UPDATE /* UPDATE of "ON CONFLICT UPDATE" */
+} SpecType;
+
#endif /* NODES_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 86d1c07..9ae3bb5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -132,6 +132,11 @@ typedef struct Query
List *withCheckOptions; /* a list of WithCheckOption's */
+ SpecType specClause; /* speculative insertion clause */
+ List *arbiterExpr; /* Unique index arbiter exprs */
+ Node *arbiterWhere; /* Unique index arbiter WHERE clause */
+ Node *onConflict; /* ON CONFLICT Query */
+
List *returningList; /* return-values list (of TargetEntry) */
List *groupClause; /* a list of SortGroupClause's */
@@ -564,7 +569,7 @@ typedef enum TableLikeOption
} TableLikeOption;
/*
- * IndexElem - index parameters (used in CREATE INDEX)
+ * IndexElem - index parameters (used in CREATE INDEX, and in ON CONFLICT)
*
* For a plain index attribute, 'name' is the name of the table column to
* index, and 'expr' is NULL. For an index expression, 'name' is NULL and
@@ -999,6 +1004,36 @@ typedef struct WithClause
} WithClause;
/*
+ * InferClause -
+ * ON CONFLICT unique index inference clause
+ *
+ * Note: InferClause does not propagate into the Query representation.
+ */
+typedef struct InferClause
+{
+ NodeTag type;
+ List *indexElems; /* IndexElems to infer unique index */
+ Node *whereClause; /* qualification (partial-index predicate) */
+ int location; /* token location, or -1 if unknown */
+} InferClause;
+
+/*
+ * ConflictClause -
+ * representation of ON CONFLICT clause
+ *
+ * Note: ConflictClause does not propagate into the Query representation.
+ * However, Query may contain onConflict child Query.
+ */
+typedef struct ConflictClause
+{
+ NodeTag type;
+ SpecType specclause; /* Variant specified */
+ InferClause *infer; /* Optional index inference clause */
+ Node *updatequery; /* Update parse stmt */
+ int location; /* token location, or -1 if unknown */
+} ConflictClause;
+
+/*
* CommonTableExpr -
* representation of WITH list element
*
@@ -1048,6 +1083,7 @@ typedef struct InsertStmt
RangeVar *relation; /* relation to insert into */
List *cols; /* optional: names of the target columns */
Node *selectStmt; /* the source SELECT/VALUES, or NULL */
+ ConflictClause *confClause; /* ON CONFLICT clause */
List *returningList; /* list of expressions to return */
WithClause *withClause; /* WITH clause */
} InsertStmt;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 316c9ce..c2269bb 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -177,6 +177,9 @@ typedef struct ModifyTable
List *resultRelations; /* integer list of RT indexes */
int resultRelIndex; /* index of first resultRel in plan's list */
List *plans; /* plan(s) producing source data */
+ SpecType spec; /* speculative insertion specification */
+ Oid arbiterIndex; /* Oid of ON CONFLICT arbiter index */
+ Plan *onConflictPlan; /* Plan for ON CONFLICT UPDATE auxiliary query */
List *withCheckOptionLists; /* per-target-table WCO lists */
List *returningLists; /* per-target-table RETURNING tlists */
List *fdwPrivLists; /* per-target-table FDW private data lists */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..801effe 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -64,6 +64,7 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
int indexcol,
List **indexcolnos,
bool *var_on_left_p);
+extern Oid plan_speculative_use_index(PlannerInfo *root, List *indexList);
/*
* tidpath.h
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index 8eb2e57..878adfe 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -28,6 +28,8 @@ extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
extern void get_relation_info(PlannerInfo *root, Oid relationObjectId,
bool inhparent, RelOptInfo *rel);
+extern Oid infer_unique_index(PlannerInfo *root);
+
extern void estimate_rel_size(Relation rel, int32 *attr_widths,
BlockNumber *pages, double *tuples, double *allvisfrac);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 082f7d7..a5f3b5a 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -84,7 +84,8 @@ extern ModifyTable *make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam);
+ List *rowMarks, Plan *onConflictPlan, SpecType spec,
+ int epqParam);
extern bool is_projection_capable_plan(Plan *plan);
/*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c243ec..cf501e6 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -87,6 +87,7 @@ PG_KEYWORD("commit", COMMIT, UNRESERVED_KEYWORD)
PG_KEYWORD("committed", COMMITTED, UNRESERVED_KEYWORD)
PG_KEYWORD("concurrently", CONCURRENTLY, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("configuration", CONFIGURATION, UNRESERVED_KEYWORD)
+PG_KEYWORD("conflict", CONFLICT, UNRESERVED_KEYWORD)
PG_KEYWORD("connection", CONNECTION, UNRESERVED_KEYWORD)
PG_KEYWORD("constraint", CONSTRAINT, RESERVED_KEYWORD)
PG_KEYWORD("constraints", CONSTRAINTS, UNRESERVED_KEYWORD)
@@ -180,6 +181,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 6a4438f..d1d0d12 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -41,6 +41,8 @@ extern List *transformDistinctClause(ParseState *pstate,
List **targetlist, List *sortClause, bool is_agg);
extern List *transformDistinctOnClause(ParseState *pstate, List *distinctlist,
List **targetlist, List *sortClause);
+extern void transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere);
extern List *addTargetToSortList(ParseState *pstate, TargetEntry *tle,
List *sortlist, List *targetlist, SortBy *sortby,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 3103b71..2b5804e 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -153,6 +153,7 @@ struct ParseState
bool p_hasModifyingCTE;
bool p_is_insert;
bool p_is_update;
+ bool p_is_speculative;
bool p_locked_from_parent;
Relation p_target_relation;
RangeTblEntry *p_target_rangetblentry;
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index f5d70e5..6bb95fc 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -76,6 +76,11 @@ extern bool ConditionalXactLockTableWait(TransactionId xid);
extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
+/* Lock an XID for tuple insertion (used to wait for an insertion to finish) */
+extern void SpeculativeInsertionLockAcquire(TransactionId xid);
+extern void SpeculativeInsertionLockRelease(TransactionId xid);
+extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
+
/* Lock a general object (other than a relation) of the current database */
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1100923..9c21810 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -176,6 +176,8 @@ typedef enum LockTagType
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
+ LOCKTAG_PROMISE_TUPLE_INSERTION, /* speculative tuple insertion */
+ /* ID info for a speculative insertion is TransactionId + token */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
@@ -261,6 +263,14 @@ typedef struct LOCKTAG
(locktag).locktag_type = LOCKTAG_VIRTUALTRANSACTION, \
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
+ ((locktag).locktag_field1 = (xid), \
+ (locktag).locktag_field2 = (token), \
+ (locktag).locktag_field3 = 0, \
+ (locktag).locktag_field4 = 0, \
+ (locktag).locktag_type = LOCKTAG_PROMISE_TUPLE_INSERTION, \
+ (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
#define SET_LOCKTAG_OBJECT(locktag,dboid,classoid,objoid,objsubid) \
((locktag).locktag_field1 = (dboid), \
(locktag).locktag_field2 = (classoid), \
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index d194f38..47e791d 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,9 +16,11 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "storage/itemptr.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
+#include "storage/relfilenode.h"
/*
* Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
@@ -132,6 +134,14 @@ struct PGPROC
*/
SHM_QUEUE myProcLocks[NUM_LOCK_PARTITIONS];
+ /*
+ * If we're speculatively inserting a tuple that we might still decide to
+ * kill, these fields identify that tuple and its insertion token.
+ */
+ RelFileNode specInsertRel;
+ ItemPointerData specInsertTid;
+ uint32 specInsertToken;
+
struct XidCache subxids; /* cache for subtransaction XIDs */
/* Per-backend LWLock. Protects fields below. */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 97c6e93..ea2bba9 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -55,6 +55,13 @@ extern TransactionId GetOldestXmin(Relation rel, bool ignoreVacuum);
extern TransactionId GetOldestActiveTransactionId(void);
extern TransactionId GetOldestSafeDecodingTransactionId(void);
+extern void SetSpeculativeInsertionToken(uint32 token);
+extern void SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid);
+extern void ClearSpeculativeInsertionState(void);
+extern uint32 SpeculativeInsertionIsInProgress(TransactionId xid,
+ RelFileNode rel,
+ ItemPointer tid);
+
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 26fb257..cd5ad76 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -87,6 +87,17 @@ typedef struct SnapshotData
bool copied; /* false if it's a static snapshot */
/*
+ * Snapshot's speculative token is a value set by HeapTupleSatisfiesDirty,
+ * indicating that the tuple is being inserted speculatively, and may yet
+ * be "super-deleted" before EOX. The caller may use the value with
+ * SpeculativeInsertionWait to wait for the inserter to decide. It is only
+ * set when a valid 'xmin' is set, too. By convention, when
+ * speculativeToken is zero, the caller must assume that it should wait on
+ * a non-speculative tuple (i.e. wait for xmin/xmax to commit).
+ */
+ uint32 speculativeToken;
+
+ /*
* note: all ids in subxip[] are >= xmin, but we don't bother filtering
* out any that are >= xmax
*/
--
1.9.1
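A note on the header changes above: the lock tag for a speculative insertion is keyed by (xid, token), and the SnapshotData comment describes a convention where a zero speculativeToken means the caller should wait on the inserting transaction as usual. The following is only a rough sketch of that convention as I read it from the comments. All Demo* names are hypothetical stand-ins and are not the PostgreSQL definitions; the real wait paths live in lmgr.c and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
typedef uint32_t DemoTransactionId;
#define DEMO_INVALID_XID ((DemoTransactionId) 0)

typedef struct DemoLockTag
{
    uint32_t field1;    /* inserter's transaction id */
    uint32_t field2;    /* speculative insertion token */
} DemoLockTag;

typedef enum DemoWaitKind
{
    DEMO_WAIT_NONE,         /* no in-progress inserter was reported */
    DEMO_WAIT_XACT,         /* wait for the inserting transaction to end */
    DEMO_WAIT_SPECULATIVE   /* wait only for the speculative insertion */
} DemoWaitKind;

/* Mirrors the (xid, token) keying of SET_LOCKTAG_SPECULATIVE_INSERTION. */
static DemoLockTag
demo_make_spec_tag(DemoTransactionId xid, uint32_t token)
{
    DemoLockTag tag;

    tag.field1 = xid;
    tag.field2 = token;
    return tag;
}

/* A zero token is taken to mean "ordinary tuple, wait on the xid". */
static DemoWaitKind
demo_choose_wait(DemoTransactionId xmin, uint32_t speculativeToken)
{
    if (xmin == DEMO_INVALID_XID)
        return DEMO_WAIT_NONE;
    if (speculativeToken != 0)
        return DEMO_WAIT_SPECULATIVE;
    return DEMO_WAIT_XACT;
}

/* Small self-check; call it from a main() of your own to exercise the sketch. */
static void
demo_self_check(void)
{
    DemoLockTag tag = demo_make_spec_tag(1234, 7);

    assert(tag.field1 == 1234 && tag.field2 == 7);
    assert(demo_choose_wait(DEMO_INVALID_XID, 0) == DEMO_WAIT_NONE);
    assert(demo_choose_wait(1234, 0) == DEMO_WAIT_XACT);
    assert(demo_choose_wait(1234, 7) == DEMO_WAIT_SPECULATIVE);
}
```

The point of keying on the token as well as the xid is that a waiter can block on one particular speculative insertion without having to wait for the whole inserting transaction to end.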
Attachment: 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch (text/x-patch; charset=US-ASCII)
From ea0c3e17fd4fd6f3e1d5a05b43d55f31ed64e0af Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 26 Aug 2014 21:28:40 -0700
Subject: [PATCH 1/8] Make UPDATE privileges distinct from INSERT privileges in
RTEs
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held. This worked because
ExecCheckRTEPerms() was called with an ACL_INSERT or ACL_UPDATE
requiredPerms, and based on that it was evident which type of
optimizable statement was under consideration. Since historically no
type of optimizable statement could directly INSERT and UPDATE at the
same time, there was no ambiguity as to which privileges were required.
This largely mechanical commit is required infrastructure for the
INSERT...ON CONFLICT UPDATE feature, which introduces an optimizable
statement that may be subject to both INSERT and UPDATE permissions
enforcement. Tests follow in a later commit.
sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken.
---
contrib/sepgsql/dml.c | 31 +++++++++-----
src/backend/commands/copy.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/trigger.c | 22 +++++-----
src/backend/executor/execMain.c | 70 ++++++++++++++++++++++++-------
src/backend/nodes/copyfuncs.c | 3 +-
src/backend/nodes/equalfuncs.c | 3 +-
src/backend/nodes/outfuncs.c | 3 +-
src/backend/nodes/readfuncs.c | 3 +-
src/backend/optimizer/plan/setrefs.c | 6 +--
src/backend/optimizer/prep/prepsecurity.c | 6 ++-
src/backend/optimizer/prep/prepunion.c | 8 ++--
src/backend/parser/analyze.c | 4 +-
src/backend/parser/parse_relation.c | 21 ++++++----
src/backend/rewrite/rewriteHandler.c | 52 +++++++++++++----------
src/include/nodes/parsenodes.h | 14 ++++---
16 files changed, 162 insertions(+), 88 deletions(-)
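To illustrate the idea in the commit message, here is a minimal, self-contained sketch of checking INSERT and UPDATE column privileges independently, so that one statement can be subject to both. It uses plain 32-bit masks in place of Bitmapsets, and every Demo* name is a hypothetical stand-in rather than the real ExecCheckRTEPerms() code. The "no explicit columns" case is modeled on the ACLMASK_ANY behavior visible in the execMain.c hunk: any granted column is then sufficient.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the PostgreSQL definitions. */
#define DEMO_ACL_INSERT (1u << 0)
#define DEMO_ACL_UPDATE (1u << 1)

typedef struct DemoRte
{
    uint32_t requiredPerms;   /* statement-level privileges required */
    uint32_t insertedCols;    /* bit i set: column i needs INSERT */
    uint32_t updatedCols;     /* bit i set: column i needs UPDATE */
} DemoRte;

typedef struct DemoGrants
{
    uint32_t insertable;      /* columns the user may INSERT into */
    uint32_t updatable;       /* columns the user may UPDATE */
} DemoGrants;

/* Check one privilege type against its own column set. */
static bool
demo_check_cols(uint32_t needed, uint32_t granted)
{
    /* No explicit columns: privilege on any column of the rel suffices. */
    if (needed == 0)
        return granted != 0;
    /* Otherwise every needed column must be granted. */
    return (needed & ~granted) == 0;
}

/* INSERT and UPDATE are checked separately; both may apply at once. */
static bool
demo_check_rte_perms(const DemoRte *rte, const DemoGrants *grants)
{
    if ((rte->requiredPerms & DEMO_ACL_INSERT) &&
        !demo_check_cols(rte->insertedCols, grants->insertable))
        return false;
    if ((rte->requiredPerms & DEMO_ACL_UPDATE) &&
        !demo_check_cols(rte->updatedCols, grants->updatable))
        return false;
    return true;
}

/* Small self-check; call it from a main() of your own to exercise the sketch. */
static void
demo_self_check(void)
{
    /* Inserts columns 0 and 1, updates column 2, as an upsert might. */
    DemoRte    rte = { DEMO_ACL_INSERT | DEMO_ACL_UPDATE, 0x3u, 0x4u };
    DemoGrants grants = { 0x3u, 0x0u };

    /* INSERT is fully granted, but UPDATE on column 2 is missing. */
    assert(!demo_check_rte_perms(&rte, &grants));

    grants.updatable = 0x4u;
    assert(demo_check_rte_perms(&rte, &grants));
}
```

With a single shared column set, the first case could not be told apart from a plain INSERT, which is why the RTE needs separate insertedCols and updatedCols.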
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 36c6a37..4a71753 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -145,7 +145,8 @@ fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
static bool
check_relation_privileges(Oid relOid,
Bitmapset *selected,
- Bitmapset *modified,
+ Bitmapset *inserted,
+ Bitmapset *updated,
uint32 required,
bool abort_on_violation)
{
@@ -231,8 +232,9 @@ check_relation_privileges(Oid relOid,
* Check permissions on the columns
*/
selected = fixup_whole_row_references(relOid, selected);
- modified = fixup_whole_row_references(relOid, modified);
- columns = bms_union(selected, modified);
+ inserted = fixup_whole_row_references(relOid, inserted);
+ updated = fixup_whole_row_references(relOid, updated);
+ columns = bms_union(selected, bms_union(inserted, updated));
while ((index = bms_first_member(columns)) >= 0)
{
@@ -241,13 +243,16 @@ check_relation_privileges(Oid relOid,
if (bms_is_member(index, selected))
column_perms |= SEPG_DB_COLUMN__SELECT;
- if (bms_is_member(index, modified))
+ if (bms_is_member(index, inserted))
{
- if (required & SEPG_DB_TABLE__UPDATE)
- column_perms |= SEPG_DB_COLUMN__UPDATE;
if (required & SEPG_DB_TABLE__INSERT)
column_perms |= SEPG_DB_COLUMN__INSERT;
}
+ if (bms_is_member(index, updated))
+ {
+ if (required & SEPG_DB_TABLE__UPDATE)
+ column_perms |= SEPG_DB_COLUMN__UPDATE;
+ }
if (column_perms == 0)
continue;
@@ -304,7 +309,7 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
required |= SEPG_DB_TABLE__INSERT;
if (rte->requiredPerms & ACL_UPDATE)
{
- if (!bms_is_empty(rte->modifiedCols))
+ if (!bms_is_empty(rte->updatedCols))
required |= SEPG_DB_TABLE__UPDATE;
else
required |= SEPG_DB_TABLE__LOCK;
@@ -333,7 +338,8 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
{
Oid tableOid = lfirst_oid(li);
Bitmapset *selectedCols;
- Bitmapset *modifiedCols;
+ Bitmapset *insertedCols;
+ Bitmapset *updatedCols;
/*
* child table has different attribute numbers, so we need to fix
@@ -341,15 +347,18 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
*/
selectedCols = fixup_inherited_columns(rte->relid, tableOid,
rte->selectedCols);
- modifiedCols = fixup_inherited_columns(rte->relid, tableOid,
- rte->modifiedCols);
+ insertedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->insertedCols);
+ updatedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->updatedCols);
/*
* check permissions on individual tables
*/
if (!check_relation_privileges(tableOid,
selectedCols,
- modifiedCols,
+ insertedCols,
+ updatedCols,
required, abort_on_violation))
return false;
}
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 8cb2f13..56f6b76 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -840,7 +840,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
FirstLowInvalidHeapAttributeNumber;
if (is_from)
- rte->modifiedCols = bms_add_member(rte->modifiedCols, attno);
+ rte->insertedCols = bms_add_member(rte->insertedCols, attno);
else
rte->selectedCols = bms_add_member(rte->selectedCols, attno);
}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index c961429..bf2235d 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -433,7 +433,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
rte->requiredPerms = ACL_INSERT;
for (attnum = 1; attnum <= intoRelationDesc->rd_att->natts; attnum++)
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attnum - FirstLowInvalidHeapAttributeNumber);
ExecCheckRTPerms(list_make1(rte), true);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 5c1c1be..7defe80 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -71,8 +71,8 @@ static int MyTriggerDepth = 0;
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
/* Local function prototypes */
static void ConvertTriggerToFK(CreateTrigStmt *stmt, Oid funcoid);
@@ -2343,7 +2343,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TriggerDesc *trigdesc;
int i;
TriggerData LocTriggerData;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
trigdesc = relinfo->ri_TrigDesc;
@@ -2352,7 +2352,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (!trigdesc->trig_update_before_statement)
return;
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
LocTriggerData.type = T_TriggerData;
LocTriggerData.tg_event = TRIGGER_EVENT_UPDATE |
@@ -2373,7 +2373,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, NULL, NULL))
+ updatedCols, NULL, NULL))
continue;
LocTriggerData.tg_trigger = trigger;
@@ -2398,7 +2398,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (trigdesc && trigdesc->trig_update_after_statement)
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
false, NULL, NULL, NIL,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
}
TupleTableSlot *
@@ -2416,7 +2416,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
HeapTuple oldtuple;
TupleTableSlot *newSlot;
int i;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
Bitmapset *keyCols;
LockTupleMode lockmode;
@@ -2425,10 +2425,10 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
* been modified, then we can use a weaker lock, allowing for better
* concurrency.
*/
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
keyCols = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc,
INDEX_ATTR_BITMAP_KEY);
- if (bms_overlap(keyCols, modifiedCols))
+ if (bms_overlap(keyCols, updatedCols))
lockmode = LockTupleExclusive;
else
lockmode = LockTupleNoKeyExclusive;
@@ -2482,7 +2482,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, trigtuple, newtuple))
+ updatedCols, trigtuple, newtuple))
continue;
LocTriggerData.tg_trigtuple = trigtuple;
@@ -2552,7 +2552,7 @@ ExecARUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
true, trigtuple, newtuple, recheckIndexes,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 20b3188..9e11040 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -97,8 +97,10 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
+#define GetInsertedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->insertedCols)
/* end of local decls */
@@ -648,27 +650,27 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
}
/*
- * Basically the same for the mod columns, with either INSERT or
- * UPDATE privilege as specified by remainingPerms.
+ * Basically the same for the mod columns, for both INSERT and UPDATE
+ * privilege as specified by remainingPerms (INSERT...ON CONFLICT
+ * UPDATE may set both).
*/
- remainingPerms &= ~ACL_SELECT;
- if (remainingPerms != 0)
+ if (remainingPerms & ACL_INSERT)
{
/*
- * When the query doesn't explicitly change any columns, allow the
+ * When the query doesn't explicitly insert any columns, allow the
* query if we have permission on any column of the rel. This is
* to handle SELECT FOR UPDATE as well as possible corner cases in
- * INSERT and UPDATE.
+ * INSERT.
*/
- if (bms_is_empty(rte->modifiedCols))
+ if (bms_is_empty(rte->insertedCols))
{
- if (pg_attribute_aclcheck_all(relOid, userid, remainingPerms,
+ if (pg_attribute_aclcheck_all(relOid, userid, ACL_INSERT,
ACLMASK_ANY) != ACLCHECK_OK)
return false;
}
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->insertedCols, col)) >= 0)
{
/* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -681,7 +683,42 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
else
{
if (pg_attribute_aclcheck(relOid, attno, userid,
- remainingPerms) != ACLCHECK_OK)
+ ACL_INSERT) != ACLCHECK_OK)
+ return false;
+ }
+ }
+ }
+
+ if (remainingPerms & ACL_UPDATE)
+ {
+ /*
+ * When the query doesn't explicitly update any columns, allow the
+ * query if we have permission on any column of the rel. This is
+ * to handle SELECT FOR UPDATE as well as possible corner cases in
+ * UPDATE.
+ */
+ if (bms_is_empty(rte->updatedCols))
+ {
+ if (pg_attribute_aclcheck_all(relOid, userid, ACL_UPDATE,
+ ACLMASK_ANY) != ACLCHECK_OK)
+ return false;
+ }
+
+ col = -1;
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
+ {
+ /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
+ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+
+ if (attno == InvalidAttrNumber)
+ {
+ /* whole-row reference can't happen here */
+ elog(ERROR, "whole-row update is not implemented");
+ }
+ else
+ {
+ if (pg_attribute_aclcheck(relOid, attno, userid,
+ ACL_UPDATE) != ACLCHECK_OK)
return false;
}
}
@@ -1623,7 +1660,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1649,7 +1687,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1708,7 +1747,8 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f1a24f5..00ffe4a 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2028,7 +2028,8 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(requiredPerms);
COPY_SCALAR_FIELD(checkAsUser);
COPY_BITMAPSET_FIELD(selectedCols);
- COPY_BITMAPSET_FIELD(modifiedCols);
+ COPY_BITMAPSET_FIELD(insertedCols);
+ COPY_BITMAPSET_FIELD(updatedCols);
COPY_NODE_FIELD(securityQuals);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6e8b308..79035b2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2345,7 +2345,8 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(requiredPerms);
COMPARE_SCALAR_FIELD(checkAsUser);
COMPARE_BITMAPSET_FIELD(selectedCols);
- COMPARE_BITMAPSET_FIELD(modifiedCols);
+ COMPARE_BITMAPSET_FIELD(insertedCols);
+ COMPARE_BITMAPSET_FIELD(updatedCols);
COMPARE_NODE_FIELD(securityQuals);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index dd1278b..b4a2667 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2456,7 +2456,8 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
WRITE_UINT_FIELD(requiredPerms);
WRITE_OID_FIELD(checkAsUser);
WRITE_BITMAPSET_FIELD(selectedCols);
- WRITE_BITMAPSET_FIELD(modifiedCols);
+ WRITE_BITMAPSET_FIELD(insertedCols);
+ WRITE_BITMAPSET_FIELD(updatedCols);
WRITE_NODE_FIELD(securityQuals);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ae24d05..dbc162a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1253,7 +1253,8 @@ _readRangeTblEntry(void)
READ_UINT_FIELD(requiredPerms);
READ_OID_FIELD(checkAsUser);
READ_BITMAPSET_FIELD(selectedCols);
- READ_BITMAPSET_FIELD(modifiedCols);
+ READ_BITMAPSET_FIELD(insertedCols);
+ READ_BITMAPSET_FIELD(updatedCols);
READ_NODE_FIELD(securityQuals);
READ_DONE();
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7703946..5d865b0 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -368,9 +368,9 @@ flatten_rtes_walker(Node *node, PlannerGlobal *glob)
*
* In the flat rangetable, we zero out substructure pointers that are not
* needed by the executor; this reduces the storage space and copying cost
- * for cached plans. We keep only the alias and eref Alias fields, which
- * are needed by EXPLAIN, and the selectedCols and modifiedCols bitmaps,
- * which are needed for executor-startup permissions checking and for
+ * for cached plans. We keep only the alias and eref Alias fields, which are
+ * needed by EXPLAIN, and the selectedCols, insertedCols and updatedCols
+ * bitmaps, which are needed for executor-startup permissions checking and for
* trigger event checking.
*/
static void
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index af3ee61..f86e792 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -115,7 +115,8 @@ expand_security_quals(PlannerInfo *root, List *tlist)
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the original relation
@@ -213,7 +214,8 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Now deal with any PlanRowMark on this RTE by requesting a lock
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 05f601e..1e28363 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1367,14 +1367,16 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
* if this is the parent table, leave copyObject's result alone.
*
* Note: we need to do this even though the executor won't run any
- * permissions checks on the child RTE. The modifiedCols bitmap may
- * be examined for trigger-firing purposes.
+ * permissions checks on the child RTE. The insertedCols/updatedCols
+ * bitmaps may be examined for trigger-firing purposes.
*/
if (childOID != parentOID)
{
childrte->selectedCols = translate_col_privs(rte->selectedCols,
appinfo->translated_vars);
- childrte->modifiedCols = translate_col_privs(rte->modifiedCols,
+ childrte->insertedCols = translate_col_privs(rte->insertedCols,
+ appinfo->translated_vars);
+ childrte->updatedCols = translate_col_privs(rte->updatedCols,
appinfo->translated_vars);
}
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..df89065 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -733,7 +733,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
false);
qry->targetList = lappend(qry->targetList, tle);
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attr_num - FirstLowInvalidHeapAttributeNumber);
icols = lnext(icols);
@@ -2002,7 +2002,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
origTarget->location);
/* Mark the target column as requiring update permissions */
- target_rte->modifiedCols = bms_add_member(target_rte->modifiedCols,
+ target_rte->updatedCols = bms_add_member(target_rte->updatedCols,
attrno - FirstLowInvalidHeapAttributeNumber);
origTargetList = lnext(origTargetList);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 8d4f79f..d2820d8 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1052,7 +1052,8 @@ addRangeTableEntry(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1105,7 +1106,8 @@ addRangeTableEntryForRelation(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1183,7 +1185,8 @@ addRangeTableEntryForSubquery(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1437,7 +1440,8 @@ addRangeTableEntryForFunction(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1509,7 +1513,8 @@ addRangeTableEntryForValues(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1577,7 +1582,8 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1677,7 +1683,8 @@ addRangeTableEntryForCTE(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index b8e6e7a..fab2948 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1403,7 +1403,8 @@ ApplyRetrieveRule(Query *parsetree,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the view should remain as
@@ -1466,12 +1467,14 @@ ApplyRetrieveRule(Query *parsetree,
subrte->requiredPerms = rte->requiredPerms;
subrte->checkAsUser = rte->checkAsUser;
subrte->selectedCols = rte->selectedCols;
- subrte->modifiedCols = rte->modifiedCols;
+ subrte->insertedCols = rte->insertedCols;
+ subrte->updatedCols = rte->updatedCols;
rte->requiredPerms = 0; /* no permission check on subquery itself */
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* If FOR [KEY] UPDATE/SHARE of view, mark all the contained tables as
@@ -2584,9 +2587,9 @@ rewriteTargetView(Query *parsetree, Relation view)
/*
* For INSERT/UPDATE the modified columns must all be updatable. Note that
* we get the modified columns from the query's targetlist, not from the
- * result RTE's modifiedCols set, since rewriteTargetListIU may have added
- * additional targetlist entries for view defaults, and these must also be
- * updatable.
+ * result RTE's insertedCols and/or updatedCols set, since
+ * rewriteTargetListIU may have added additional targetlist entries for
+ * view defaults, and these must also be updatable.
*/
if (parsetree->commandType != CMD_DELETE)
{
@@ -2723,26 +2726,31 @@ rewriteTargetView(Query *parsetree, Relation view)
*
* Initially, new_rte contains selectedCols permission check bits for all
* base-rel columns referenced by the view, but since the view is a SELECT
- * query its modifiedCols is empty. We set modifiedCols to include all
- * the columns the outer query is trying to modify, adjusting the column
- * numbers as needed. But we leave selectedCols as-is, so the view owner
- * must have read permission for all columns used in the view definition,
- * even if some of them are not read by the outer query. We could try to
- * limit selectedCols to only columns used in the transformed query, but
- * that does not correspond to what happens in ordinary SELECT usage of a
- * view: all referenced columns must have read permission, even if
- * optimization finds that some of them can be discarded during query
- * transformation. The flattening we're doing here is an optional
- * optimization, too. (If you are unpersuaded and want to change this,
- * note that applying adjust_view_column_set to view_rte->selectedCols is
- * clearly *not* the right answer, since that neglects base-rel columns
- * used in the view's WHERE quals.)
+ * query its insertedCols/updatedCols is empty. We set insertedCols and
+ * updatedCols to include all the columns the outer query is trying to
+ * modify, adjusting the column numbers as needed. But we leave
+ * selectedCols as-is, so the view owner must have read permission for all
+ * columns used in the view definition, even if some of them are not read
+ * by the outer query. We could try to limit selectedCols to only columns
+ * used in the transformed query, but that does not correspond to what
+ * happens in ordinary SELECT usage of a view: all referenced columns must
+ * have read permission, even if optimization finds that some of them can
+ * be discarded during query transformation. The flattening we're doing
+ * here is an optional optimization, too. (If you are unpersuaded and want
+ * to change this, note that applying adjust_view_column_set to
+ * view_rte->selectedCols is clearly *not* the right answer, since that
+ * neglects base-rel columns used in the view's WHERE quals.)
*
* This step needs the modified view targetlist, so we have to do things
* in this order.
*/
- Assert(bms_is_empty(new_rte->modifiedCols));
- new_rte->modifiedCols = adjust_view_column_set(view_rte->modifiedCols,
+ Assert(bms_is_empty(new_rte->insertedCols) &&
+ bms_is_empty(new_rte->updatedCols));
+
+ new_rte->insertedCols = adjust_view_column_set(view_rte->insertedCols,
+ view_targetlist);
+
+ new_rte->updatedCols = adjust_view_column_set(view_rte->updatedCols,
view_targetlist);
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..86d1c07 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -717,11 +717,12 @@ typedef struct XmlSerialize
* For SELECT/INSERT/UPDATE permissions, if the user doesn't have
* table-wide permissions then it is sufficient to have the permissions
* on all columns identified in selectedCols (for SELECT) and/or
- * modifiedCols (for INSERT/UPDATE; we can tell which from the query type).
- * selectedCols and modifiedCols are bitmapsets, which cannot have negative
- * integer members, so we subtract FirstLowInvalidHeapAttributeNumber from
- * column numbers before storing them in these fields. A whole-row Var
- * reference is represented by setting the bit for InvalidAttrNumber.
+ * insertedCols and/or updatedCols (INSERT with ON CONFLICT UPDATE may
+ * have all 3). selectedCols, insertedCols and updatedCols are
+ * bitmapsets, which cannot have negative integer members, so we subtract
+ * FirstLowInvalidHeapAttributeNumber from column numbers before storing
+ * them in these fields. A whole-row Var reference is represented by
+ * setting the bit for InvalidAttrNumber.
*--------------------
*/
typedef enum RTEKind
@@ -816,7 +817,8 @@ typedef struct RangeTblEntry
AclMode requiredPerms; /* bitmask of required access permissions */
Oid checkAsUser; /* if valid, check access as this role */
Bitmapset *selectedCols; /* columns needing SELECT permission */
- Bitmapset *modifiedCols; /* columns needing INSERT/UPDATE permission */
+ Bitmapset *insertedCols; /* columns needing INSERT permission */
+ Bitmapset *updatedCols; /* columns needing UPDATE permission */
List *securityQuals; /* any security barrier quals to apply */
} RangeTblEntry;
--
1.9.1
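The permission-bitmap convention that the patch above splits into insertedCols and updatedCols can be illustrated with a toy model. This is only a sketch under stated assumptions: ColSet, ToyRTE, colset_add and colset_has are hypothetical stand-ins (the real code uses Bitmapset and bms_add_member), and the offset value -8 is assumed purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: system columns have negative attribute numbers, so a fixed
 * offset makes every stored member non-negative. The value -8 is only a
 * stand-in for FirstLowInvalidHeapAttributeNumber in this sketch.
 */
#define FIRST_LOW_INVALID_ATTNO (-8)

/* Toy replacement for Bitmapset: one 64-bit word is enough here. */
typedef uint64_t ColSet;

/* Hypothetical, cut-down RangeTblEntry with the two new fields. */
typedef struct ToyRTE
{
	ColSet		selectedCols;	/* columns needing SELECT permission */
	ColSet		insertedCols;	/* columns needing INSERT permission */
	ColSet		updatedCols;	/* columns needing UPDATE permission */
} ToyRTE;

/* Add attribute number attno to a set, applying the offset. */
ColSet
colset_add(ColSet set, int attno)
{
	int			member = attno - FIRST_LOW_INVALID_ATTNO;

	assert(member >= 0 && member < 64);
	return set | (UINT64_C(1) << member);
}

/* Test whether attribute number attno is in a set. */
bool
colset_has(ColSet set, int attno)
{
	int			member = attno - FIRST_LOW_INVALID_ATTNO;

	assert(member >= 0 && member < 64);
	return ((set >> member) & 1) != 0;
}
```

With separate sets, a statement that inserts columns 1 and 2 but may update only column 2 can record exactly that, rather than folding both into a single modifiedCols set whose meaning depends on the query type.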
On Thu, Jan 29, 2015 at 11:38 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
Simply removing IGNORE support until such time as we straighten
that all out (9.6?) seems like the simplest solution. No need to block
the progress of "UPSERT", since exclusion constraint support was
only ever going to be useful for the less compelling IGNORE variant.
What do other people think? Do you agree with my view that we should
shelve IGNORE support for now, Heikki?
I appreciate the work you're doing and (as a user rather than a
pg-hacker) don't want to butt in but if it would be possible to allow
support for IGNORE for unique but not exclusion constraints that would
be really helpful for my own use cases, where being able to insert
from a dataset into a table containing unique constraints without
having to first check the dataset for uniqueness (within both itself
and the target table) is very useful.
It's possible that I've misunderstood anyway and that you mean purely
that exclusion constraint IGNORE should be shelved until 9.6, in which
case I apologise.
Of course if there's no way to allow unique constraint IGNORE without
exclusion constraints then just ignore me; I (along I'm sure with all
the others who are following this conversation from afar) will be
incredibly grateful for the work you've done either way.
I suppose there's no reason why we couldn't use a no-op ON CONFLICT
UPDATE anyway, but that does seem rather messy and (I imagine) would
involve rather more work (unless the optimizer were able to optimize
away the "update"? I don't know enough to be able to say if it would).
Thanks
Geoff
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Jan 30, 2015 at 6:59 AM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
I appreciate the work you're doing and (as a user rather than a
pg-hacker) don't want to butt in but if it would be possible to allow
support for IGNORE for unique but not exclusion constraints that would
be really helpful for my own use cases, where being able to insert
from a dataset into a table containing unique constraints without
having to first check the dataset for uniqueness (within both itself
and the target table) is very useful.
It's possible that I've misunderstood anyway and that you mean purely
that exclusion constraint IGNORE should be shelved until 9.6, in which
case I apologise.
Well, the issue is that we can't really add exclusion constraints to
the IGNORE case later. So the fact that we cannot do exclusion
constraints kind of implies that we can either shelve IGNORE and maybe
look at it later, or accept that we'll never support exclusion
constraints with IGNORE. We'd then include IGNORE without exclusion
constraint support now and forever. I tend to think that we'll end up
doing the latter anyway, but I really don't want to add additional
risk of this not getting into 9.5 by arguing about that now. It
doesn't matter that much.
I suppose there's no reason why we couldn't use a no-op ON CONFLICT
UPDATE anyway
Right. IGNORE isn't really all that compelling for that reason. Note
that this will still lock the unmodified row, though.
--
Peter Geoghegan
Hi,
A first (not actually that quick :() look through the patches to see
what actually happened in the last months. I didn't keep up with the
thread.
Generally the split into the individual commits doesn't seem to make
much sense to me. The commits (except the first) aren't individually
committable and aren't even meaningful. Splitting off the
internal docs, tests and such actually just seems to make reviewing
harder because you miss context. Splitting it so that individual pieces
are committable and reviewable makes sense, but... I have no problem
doing the user docs later. If you split off RLS support, you need to
throw an error before it's implemented.
0001:
* References INSERT with ON CONFLICT UPDATE, can thus not be committed
independently. I don't think those references really are needed.
* I'm not a fan of the increased code duplication in
ExecCheckRTEPerms(). Can you look into cleaning that up?
* Also the comments inside the ACL_INSERT case still reference UPDATE.
Other than that I think we can just go ahead and commit this ahead of
time. Mentioning ON CONFLICT UPDATE (OCU henceforth) in the commit
message only.
0007:
* "AMs" alone isn't particularly unique.
* Without the context of the discussion "unprincipled deadlocks" aren't
well defined.
* Too many "" words.
* Waiting "too long" isn't defined. Neither is why that'd imply
unprincipled deadlocks. Somewhat understandable with the context of
the discussion, but surely not a couple years down the road.
* As is I don't find the README entry super helpful. It should state
what the reason for doing this is clearly, start at the higher level,
and then move to the details.
* Misses details about the speculative heavyweight locking of tuples.
0002:
* Tentatively I'd say that killspeculative should be done via a separate
function instead of heap_delete()
* I think we should, as you ponder in a comment, do the OCU specific
stuff lazily and/or in a separate function from BuildIndexInfo(). That
function is already quite visible in profiles, and the additional work
isn't entirely trivial.
* I doubt logical decoding works with the patch as it stands.
* The added ereport (i.e. user facing) error message in
ExecInsertIndexTuples won't help a user at all.
* Personally I don't care one iota for comments like "Get information
from the result relation info structure.". Yes, one of these already
exists, but ...
* If an arbiter index is passed to ExecCheckIndexConstraints(), can't we
abort the loop after checking it? Also, do we really have to iterate
over indexes for that case? How about moving the loop contents to a
separate function and using that separately for the arbiter cases?
* Don't like the comment above check_exclusion_or_unique_constraint()'s
much. Makes too much of a special case of OCU
* ItemPointerIsValid
* ExecCheckHeapTupleVisible's comment "It is not acceptable to proceed "
sounds like you're talking with a child or so ;)
* ExecCheckHeapTupleVisible()'s errhint() sounds like an
argument/excuse (actually like a code comment). That's not going to
help a user at all.
* I find the modified control flow in ExecInsert() pretty darn ugly. I
think this needs to be cleaned up. The speculative case should imo be
a separate function or something.
* /*
* This may occur when an instantaneously invisible tuple is blamed
* as a conflict because multiple rows are inserted with the same
* constrained values.
How can this happen? We don't insert multiple rows with the same
command id?
* ExecLockUpdatedTuple() has (too?) many comments, but little overview
of what/why it is doing what it does on a higher level.
* plan_speculative_use_index: "Use the planner to decide speculative
insertion arbiter index" - Huh? " rel is the target to undergo ON
CONFLICT UPDATE/IGNORE." - Which rel?
* formulations as "fundamental nexus" are hard to understand imo.
* Perhaps it has previously been discussed but I'm not convinced by the
reasoning for not looking at opclasses in infer_unique_index(). This
seems like it'd prohibit ever having e.g. case insensitive opclasses -
something surely worthwhile.
* Doesn't infer_unique_index() have to look for indisvalid? This isn't
going to work well with an invalid (not to speak of a !ready) index.
* Is ->relation in the UpdateStmt generated in transformInsertStmt ever
used for anything? If so, it could cause some
nastiness due to repeated name lookups. Looks like it'll be used in
transformUpdateStmt
* What's the reason for the !pstate->p_parent? Also why the parens?
pstate->p_is_speculative = (pstate->parentParseState &&
(!pstate->p_parent_cte &&
pstate->parentParseState->p_is_insert &&
pstate->parentParseState->p_is_speculative));
* Why did you need to make %nonassoc DISTINCT and ON nonassoc in the grammar?
* The whole speculative insert logic isn't really well documented. Why,
for example, do we actually need the token? And why are there no
issues with overflow? And where is it documented that a 0 means
there's no token? ...
* Isn't "SpecType" an awfully generic (and nondescriptive) name?
* /* XXX: Make sure that re-use of bits is safe here */ - no, not
unless you change existing checks.
* /*
* Immediately VACUUM "super-deleted" tuples
*/
if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))
return HEAPTUPLE_DEAD;
Is that branch really needed? Shouldn't it just be happening as a
consequence of the already existing code? Same in SatisfiesMVCC. If
you actually needed that block, it'd need to be done in SatisfiesSelf
as well, no? You have a comment about a possible loop - but that seems
wrong to me, implying that HEAP_XMIN_COMMITTED was set invalidly.
Ok, I can't focus at all any further at this point. But there's enough
comments here that some even might make sense ;)
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 30 January 2015 at 21:58, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Jan 30, 2015 at 6:59 AM, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
I suppose there's no reason why we couldn't use a no-op ON CONFLICT
UPDATE anyway
Right. IGNORE isn't really all that compelling for that reason. Note
that this will still lock the unmodified row, though.
Mmmf. So I would have to make sure that my source tuples were unique
before doing the INSERT (otherwise the first ON CONFLICT UPDATE for a
tuple would block any other)? That's potentially very slow :(
When you say that you can't add exclusion constraints later, do you
mean from a coding point of view or just because people would get
confused whether exclusion constraints could be IGNOREd or not?
Geoff
On 2 February 2015 at 14:32, Geoff Winkless <pgsqladmin@geoff.dj> wrote:
Mmmf. So I would have to make sure that my source tuples were unique
before doing the INSERT (otherwise the first ON CONFLICT UPDATE for a
tuple would block any other)? That's potentially very slow :(
Replying to my own message, because it occurs to me I might be being
stupid (surely not :) )
When you say "this will still lock the unmodified row" did you mean
just that it's locked to _other_ processes until commit? That would be
much less impactful.
Geoff
On 01/18/2015 04:48 AM, Peter Geoghegan wrote:
I think that the fundamental, unfixable race condition here is the
disconnect between index tuple insertion and checking for would-be
exclusion violations that exclusion constraints naturally have here,
that unique indexes naturally don't have [1] (note that I'm talking
only about approach #2 to value locking here; approach #1 isn't in
V2.0). I suspect that the feature is not technically feasible to make
work correctly with exclusion constraints, end of story. VACUUM
interlocking is probably also involved here, but the unfixable race
condition seems like our fundamental problem.
It's not a fundamental, unfixable race condition. In [1], I gave you
three ideas straight off the top of my head on how that could be fixed.
Please work with me towards a committable patch.
I'm trying...
- Heikki
On 01/30/2015 01:38 AM, Peter Geoghegan wrote:
I have not addressed the recently described problems with exclusion
constraints. I hope we can do so shortly. Simply removing IGNORE
support until such time as we straighten that all out (9.6?) seems
like the simplest solution. No need to block the progress of "UPSERT",
since exclusion constraint support was only ever going to be useful
for the less compelling IGNORE variant. What do other people think? Do
you agree with my view that we should shelve IGNORE support for now,
Heikki?
No, I don't agree. Let's fix it.
- Heikki
On Mon, Feb 02, 2015 at 4:48 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I think that the fundamental, unfixable race condition here is the
disconnect between index tuple insertion and checking for would-be
exclusion violations that exclusion constraints naturally have here,
that unique indexes naturally don't have [1] (note that I'm talking
only about approach #2 to value locking here; approach #1 isn't in
V2.0). I suspect that the feature is not technically feasible to make
work correctly with exclusion constraints, end of story. VACUUM
interlocking is probably also involved here, but the unfixable race
condition seems like our fundamental problem.
It's not a fundamental, unfixable race condition. In [1], I gave you
three ideas straight off the top of my head on how that could be fixed.
That was different - I tried to make it work by fixing some bugs
there. However, I'm now finding myself up against these new bugs. I
think that the underlying cause is the lack of any real locking
(unlike with the B-Tree AM) in *both* cases, but I don't even know
that for sure. The error messages you see are quite odd - why should a
btree_gist-based exclusion constraint cause a violation when
non-conflicting values are inserted? There is some other race
condition here. This wasn't a livelock (or a deadlock), which is what
your comments in early January apply to. I think that this has
something to do with VACUUM interlocking. But with the B-Tree AM
(which we're handling differently, by re-using infrastructure used for
deferred unique constraints), things work quite well. The patch stands
up to Jeff's vigorous stress-tests.
I'm not fundamentally in disagreement with you about any of this. All
I'm saying is that we should cut scope today. We're not precluding
picking up an IGNORE feature that does support exclusion constraints
in the future. Why should we insist upon having that in the first cut?
It's both significantly harder, and significantly less useful to
users, and so cutting that makes perfect sense AFAICT. As I've said
many times, exclusion constraint support was only ever going to be
useful to the IGNORE variant (I've tested exclusion constraints by
contriving a case to make them do UPSERTs, but this is only for the
benefit of the stress-test).
Thanks
--
Peter Geoghegan
On 01/30/2015 01:38 AM, Peter Geoghegan wrote:
On the stress-testing front, I'm still running Jeff Janes' tool [1],
while also continuing to use his Postgres modifications to
artificially increase the XID burn rate.
I followed the instructions in README.md to reproduce this. I downloaded
the tool, applied the upsert patchset, applied the hack to
parse_clause.c as instructed in the README.md file, installed
btree_gist, and ran count_upsert_exclusion.pl.
It failed immediately with an assertion failure:
TRAP: FailedAssertion("!(node->spec != SPEC_INSERT || node->arbiterIndex
!= ((Oid) 0))", File: "nodeModifyTable.c", Line: 1619)
Is that just because of the hack in parse_clause.c?
With assertions disabled, count_upsert_exclusion.pl ran successfully to
the end. I also tried running "VACUUM FREEZE upsert_race_test" in a loop
in another session at the same time, but it didn't make a difference.
How quickly do you see the errors?
I also tried applying crash_REL9_5.patch from the jjanes_upsert kit, and
set jj_xid=10000 to increase XID burn rate, but I'm still not seeing any
errors.
- Heikki
On Tue, Feb 3, 2015 at 2:05 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
TRAP: FailedAssertion("!(node->spec != SPEC_INSERT || node->arbiterIndex !=
((Oid) 0))", File: "nodeModifyTable.c", Line: 1619)
Is that just because of the hack in parse_clause.c?
Yes. I never built with assertions and so didn't see this, but it
doesn't matter.
With assertions disabled, count_upsert_exclusion.pl ran successfully to the
end. I also tried running "VACUUM FREEZE upsert_race_test" in a loop in
another session at the same time, but it didn't make a difference. How
quickly do you see the errors?
I also tried applying crash_REL9_5.patch from the jjanes_upsert kit, and set
jj_xid=10000 to increase XID burn rate, but I'm still not seeing any errors.
Did you build fully optimized, assertion-free code? I've been doing
that. I found it necessary to recreate some of the bugs Jeff's tool
caught. I also think that I might have needed an 8 core box to see it,
but less sure about that.
--
Peter Geoghegan
On 02/03/2015 08:17 PM, Peter Geoghegan wrote:
On Tue, Feb 3, 2015 at 2:05 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
TRAP: FailedAssertion("!(node->spec != SPEC_INSERT || node->arbiterIndex !=
((Oid) 0))", File: "nodeModifyTable.c", Line: 1619)
Is that just because of the hack in parse_clause.c?
Yes. I never built with assertions and so didn't see this, but it
doesn't matter.
With assertions disabled, count_upsert_exclusion.pl ran successfully to the
end. I also tried running "VACUUM FREEZE upsert_race_test" in a loop in
another session at the same time, but it didn't make a difference. How
quickly do you see the errors?
I also tried applying crash_REL9_5.patch from the jjanes_upsert kit, and set
jj_xid=10000 to increase XID burn rate, but I'm still not seeing any errors.
Did you build fully optimized, assertion-free code? I've been doing
that. I found it necessary to recreate some of the bugs Jeff's tool
caught. I also think that I might have needed an 8 core box to see it,
but less sure about that.
I had compiled with -O0, but without assertions. I tried now again with
-O3. It's been running for about 10 minutes now, and I haven't seen any
errors.
Since you can reproduce this, it would be good if you could debug this.
The error message where the alleged duplicate key actually had a
different value is a bit scary. Makes me wonder if it might be a bug
with exclusion constraints in general, or just with the patch.
- Heikki
On Wed, Feb 4, 2015 at 9:54 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I had compiled with -O0, but without assertions. I tried now again with -O3.
It's been running for about 10 minutes now, and I haven't seen any errors.
Did you run with an artificially high XID burn rate (i.e. did you also
apply Jeff's modifications to Postgres, and specify a high burn rate
using his custom GUC)? Maybe that was important.
--
Peter Geoghegan
On Tue, Feb 2, 2015 at 01:06 AM, Andres Freund <andres@2ndquadrant.com> wrote:
A first (not actually that quick :() look through the patches to see
what actually happened in the last months. I didn't keep up with the
thread.
So, let me get this out of the way: This is the first in-depth
technical review that this work has had in a long time. Thank you for
your help here.
Generally the split into the individual commits doesn't seem to make
much sense to me. The commits (except the first) aren't individually
committable and aren't even meaningful. Splitting off the
internal docs, tests and such actually just seems to make reviewing
harder because you miss context. Splitting it so that individual pieces
are committable and reviewable makes sense, but... I have no problem
doing the user docs later. If you split off RLS support, you need to
throw an error before it's implemented.
I mostly agree. Basically, I did not intend for all of the patches to
be individually committed. The mechanism by which EXCLUDED.*
expressions are added is somewhat novel, and deserves to be
independently *considered*. I'm trying to show how the parts fit
together more so than breaking things down in to smaller commits (as
you picked up on, 0001 is the exception - that is genuinely intended
to be committed early). Also, those commit messages give me the
opportunity to put those parts in their appropriate context vis-a-vis
our discussions. They refer to the Wiki, for example, or reasons why
pg_stat_statements shouldn't care about ExcludedExpr. Obviously the
final commit messages won't look that way.
0001:
* References INSERT with ON CONFLICT UPDATE, can thus not be committed
independently. I don't think those references really are needed.
* I'm not a fan of the increased code duplication in
ExecCheckRTEPerms(). Can you look into cleaning that up?
* Also the comments inside the ACL_INSERT case still reference UPDATE.
Other than that I think we can just go ahead and commit this ahead of
time. Mentioning ON CONFLICT UPDATE (OCU henceforth) in the commit
message only.
Cool. Attached revision makes those changes.
0007:
* "AMs" alone isn't particularly unique.
* Without the context of the discussion "unprincipled deadlocks" aren't
well defined.
* Too many "" words.
* Waiting "too long" isn't defined. Neither is why that'd imply
unprincipled deadlocks. Somewhat understandable with the context of
the discussion, but surely not a couple years down the road.
* As is I don't find the README entry super helpful. It should state
what the reason for doing this is clearly, start at the higher level,
and then move to the details.
* Misses details about the speculative heavyweight locking of tuples.
Fair points. I'll work through that feedback.
Actually, I think we should memorialize that "unprincipled deadlocks"
should be avoided in some more general way, since they are after all a
general problem that we've seen elsewhere. I'm not sure about how to
go about doing that, though.
0002:
* Tentatively I'd say that killspeculative should be done via a separate
function instead of heap_delete()
Really? I guess if that were to happen, it would entail refactoring
heap_delete() to call a static function, which was also called by a
new kill_speculative() function that does this. Otherwise, you'd have
far too much duplication.
* I think we should, as you ponder in a comment, do the OCU specific
stuff lazily and/or in a separate function from BuildIndexInfo(). That
function is already quite visible in profiles, and the additional work
isn't entirely trivial.
Okay.
* I doubt logical decoding works with the patch as it stands.
I thought so. Perhaps you could suggest a better use of the available
XLOG_HEAP_* bits. I knew I needed to consider that more carefully
(hence the XXX comment), but didn't get around to it.
* The added ereport (i.e. user facing) error message in
ExecInsertIndexTuples won't help a user at all.
So, this:
/* Skip this index-update if the predicate isn't satisfied */
if (!ExecQual(predicate, econtext, false))
+ {
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ ereport(ERROR,
+ (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ errmsg("partial arbiter unique index has predicate that does not cover tuple proposed for insertion"),
+ errdetail("ON CONFLICT inference clause implies that the tuple proposed for insertion actually be covered by partial predicate for index \"%s\".",
+ RelationGetRelationName(indexRelation)),
+ errhint("ON CONFLICT inference clause must infer a unique index that covers the final tuple, after BEFORE ROW INSERT triggers fire."),
+ errtableconstraint(heapRelation,
+ RelationGetRelationName(indexRelation))));
continue;
+ }
Yeah, that isn't a great error message. This happens here because you
are using a partial unique index (and so you must have had an
inference specification with a "WHERE" to get here). However, what you
actually went to insert would not be covered by this partial unique
index, and so couldn't ever take the alternative path, and so is
likely not thought out. Maybe it would be better to silently always
let the INSERT succeed as an INSERT. *That* actually wasn't really
discussed - this is all my idea.
* Personally I don't care one iota for comments like "Get information
from the result relation info structure.". Yes, one of these already
exists, but ...
Okay.
* If an arbiter index is passed to ExecCheckIndexConstraints(), can't we
abort the loop after checking it? Also, do we really have to iterate
over indexes for that case? How about moving the loop contents to a
separate function and using that separately for the arbiter cases?
Well, the failure to do that implies very few extra cycles, but sure.
I'll add a new reason to break at the end, when
check_exclusion_or_unique_constraint() is called in respect of a
particular (inferred) arbiter unique index.
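To make the early exit concrete, here is a minimal sketch using stand-in types rather than the executor's real structures. IndexStub, check_one_index() and check_index_constraints() are hypothetical names invented for illustration, and the assumption that non-arbiter indexes are skipped entirely when an arbiter was inferred is mine, not something the patch text above confirms:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical stand-in for one open index on the result relation. */
typedef struct IndexStub
{
    Oid  indexrelid;      /* OID of the index */
    bool would_conflict;  /* pretend outcome of the uniqueness probe */
} IndexStub;

/* Hypothetical stand-in for check_exclusion_or_unique_constraint(). */
static bool
check_one_index(const IndexStub *idx)
{
    return !idx->would_conflict;
}

/*
 * Returns true when no conflict is found.  *nchecked reports how many
 * indexes were actually probed, so the effect of the early exit is visible.
 */
static bool
check_index_constraints(const IndexStub *indexes, size_t nindexes,
                        Oid arbiter, size_t *nchecked)
{
    *nchecked = 0;
    for (size_t i = 0; i < nindexes; i++)
    {
        bool is_arbiter = (arbiter != InvalidOid &&
                           indexes[i].indexrelid == arbiter);

        /* Assumption: with an inferred arbiter, other indexes are skipped. */
        if (arbiter != InvalidOid && !is_arbiter)
            continue;

        (*nchecked)++;
        if (!check_one_index(&indexes[i]))
            return false;

        /* The new reason to break: the arbiter has now been checked. */
        if (is_arbiter)
            break;
    }
    return true;
}
```

With an arbiter supplied, exactly one index is probed no matter where it sits in the list; without one, every index is probed until the first conflict.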
* Don't like the comment above check_exclusion_or_unique_constraint()'s
much. Makes too much of a special case of OCU
I guess I should just refer to speculative insertion.
* ItemPointerIsValid
What about it?
* ExecCheckHeapTupleVisible's comment "It is not acceptable to proceed "
sounds like you're talking with a child or so ;)
Fair point. I should say "It would not be consistent with the
guarantees of the higher isolation levels..."
* ExecCheckHeapTupleVisible()'s errhint() sounds like an
argument/excuse (actually like a code comment). That's not going to
help a user at all.
Really? I thought it might be less than intuitive that higher
isolation levels cannot decide to do nothing on the basis of something
not in their MVCC snapshot. But come to think of it, yeah, that
errhint() isn't adding much over the main error message.
* I find the modified control flow in ExecInsert() pretty darn ugly. I
think this needs to be cleaned up. The speculative case should imo be
a separate function or something.
Hmm. I'm not quite sold on that. Basically, if we did that, we'd still
have a function that was more or less a strict superset of
ExecInsert(). What have we saved?
What I do agree with is that ExecInsert() should be refactored to make
the common case (a vanilla insert) look like the common case, whereas
the uncommon case (an upsert) should have that dealt with specially.
There is room for improvement. Is that a fair compromise?
* /*
* This may occur when an instantaneously invisible tuple is blamed
* as a conflict because multiple rows are inserted with the same
* constrained values.
How can this happen? We don't insert multiple rows with the same
command id?
This is a cardinality violation [1]. It can definitely happen - just
try the examples you see on the Wiki. This is possible because I
modified heap_lock_tuple() to return HeapTupleInvisible (and not just
complain directly when HeapTupleSatisfiesUpdate() returns
"HeapTupleInvisible"). It's also possible because we're using a
DirtySnapshot at various points. This is sort of like how ExecUpdate()
handles a return value of "HeapTupleSelfUpdated" from heap_update().
Not quite though, because 1) I prefer to throw an error (rather than
silently not UPDATE that slot), and 2) we're not dealing with MVCC
semantics, so the return values are different in both cases. The
*nature* of the situation handled is similar between UPSERTs (in
ExecLockUpdatedTuple()) and vanilla UPDATEs (in ExecUpdate()), though.
Does that make sense?
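As a rough illustration of the situation, here is a toy model in which one command proposes two rows with the same constrained value. It deliberately ignores MVCC, DirtySnapshot and heap_lock_tuple() altogether; ToyTable, toy_upsert() and the written_by_cid field are hypothetical names, and comparing command ids is only a stand-in for "the conflicting tuple was created by this same command and is therefore not visible to it":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum UpsertResult
{
    UPSERT_INSERTED,
    UPSERT_UPDATED,
    UPSERT_CARDINALITY_VIOLATION
} UpsertResult;

/* Toy row: a unique key, a payload, and the command that last wrote it. */
typedef struct ToyRow
{
    int key;
    int value;
    int written_by_cid;
} ToyRow;

#define TOY_CAPACITY 16

typedef struct ToyTable
{
    ToyRow rows[TOY_CAPACITY];
    size_t nrows;
} ToyTable;

static UpsertResult
toy_upsert(ToyTable *tab, int key, int value, int cid)
{
    for (size_t i = 0; i < tab->nrows; i++)
    {
        if (tab->rows[i].key != key)
            continue;

        /*
         * The conflicting row was written by this very command: raise an
         * error instead of silently skipping or updating it a second time.
         */
        if (tab->rows[i].written_by_cid == cid)
            return UPSERT_CARDINALITY_VIOLATION;

        tab->rows[i].value = value;
        tab->rows[i].written_by_cid = cid;
        return UPSERT_UPDATED;
    }

    assert(tab->nrows < TOY_CAPACITY);
    tab->rows[tab->nrows].key = key;
    tab->rows[tab->nrows].value = value;
    tab->rows[tab->nrows].written_by_cid = cid;
    tab->nrows++;
    return UPSERT_INSERTED;
}
```

Two proposals for the same key from different commands give an insert followed by an update; two proposals for the same key from the same command give an insert followed by the error.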
* ExecLockUpdatedTuple() has (too?) many comments, but little overview
of what/why it is doing what it does on a higher level.
Fair point. Seems like material for a better worked out executor README.
* plan_speculative_use_index: "Use the planner to decide speculative
insertion arbiter index" - Huh? " rel is the target to undergo ON
CONFLICT UPDATE/IGNORE." - Which rel?
Sorry, that's an obsolete comment (the function signature changed). It
should refer to the target of the Query being planned.
* formulations as "fundamental nexus" are hard to understand imo.
I'm trying to suggest that INSERT ... ON CONFLICT UPDATE is not quite
two separate top-level commands, and yet is also not a new, distinct
type of top-level command. This is mostly a high level design decision
that maximizes code reuse.
* Perhaps it has previously been discussed but I'm not convinced by the
reasoning for not looking at opclasses in infer_unique_index(). This
seems like it'd prohibit ever having e.g. case insensitive opclasses -
something surely worthwhile.
I don't think anyone gave that idea the thumbs-up. However, I really
don't see the problem. Sure, we could have case insensitive opclasses
in the future, and you may want to make a unique index using one. But
then it becomes a matter of whatever unique indexes are available. The
limitation is only that you cannot explicitly indicate that you want a
certain opclass. It comes down to whatever unique indexes happen to be
available, since of course taking the alternative path is arbitrated
by a would-be unique violation. It's a bit odd that we're leaving it
up to the available indexes to decide on semantics like that, but the
problem is so narrow and the solution so involved that I'd argue it's
acceptable.
* Doesn't infer_unique_index() have to look for indisvalid? This isn't
going to work well with an invalid (not to speak of a !ready) index.
It does (check IndexIsValid()). I think the mistake I made here was
not checking IndexIsReady(), since that is an additional concern above
what the similar get_relation_info() function must consider.
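Roughly, the fix amounts to requiring both properties before an index may be inferred. The sketch below uses a hypothetical IndexFlags struct whose fields are named after the pg_index columns; usable_as_arbiter() and first_usable_arbiter() are made-up helpers, and the real IndexIsValid()/IndexIsReady() macros may test more than the single flag shown here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of a pg_index row. */
typedef struct IndexFlags
{
    unsigned int indexrelid;
    bool         indisunique;
    bool         indisvalid;  /* index may be used by queries */
    bool         indisready;  /* index is being maintained by inserts */
} IndexFlags;

/* An arbiter must be unique, valid, and ready (the last check was missing). */
static bool
usable_as_arbiter(const IndexFlags *idx)
{
    return idx->indisunique && idx->indisvalid && idx->indisready;
}

/* Return the first usable candidate, or NULL when none qualifies. */
static const IndexFlags *
first_usable_arbiter(const IndexFlags *candidates, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (usable_as_arbiter(&candidates[i]))
            return &candidates[i];
    }
    return NULL;
}
```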
* Is ->relation in the UpdateStmt generated in transformInsertStmt ever
used for anything? If so, it'd possibly generate some possible
nastiness due to repeated name lookups. Looks like it'll be used in
transformUpdateStmt
What, you mean security issues, for example? I have a hard time seeing
how that could work in practice, given that the one and only target
RTE is marked with the appropriate updatedCols originating from
transformUpdateStmt(). Still, it is a concern generally - better safe
than sorry. I was thinking of plugging it by ensuring that the
Relations matched, but that might not be good enough. Maybe it would
be better to bite the bullet and have transformUpdateStmt() use the
same Relation directly, which is something I hoped to avoid (better to
have transformUpdateStmt() know as little as possible about this, I'd
say).
* What's the reason for the !pstate->p_parent? Also why the parens?
pstate->p_is_speculative = (pstate->parentParseState &&
(!pstate->p_parent_cte &&
pstate->parentParseState->p_is_insert &&
pstate->parentParseState->p_is_speculative));
You mean the "!pstate->p_parent_cte"? That's there because you can get
queries to segfault if this logic doesn't consider that a
data-modifying CTE can have an UPDATE that appears within a CTE
referenced from an INSERT. :-)
* Why did you need to make %nonassoc DISTINCT and ON nonassoc in the grammar?
To prevent a shift/reduce conflict, I changed the associativity.
Without this, here are the details of State 700, which has the
conflict (from gram.output):
"""""
State 700
1465 opt_distinct: DISTINCT .
1466 | DISTINCT . ON '(' expr_list ')'
ON shift, and go to state 1094
ON [reduce using rule 1465 (opt_distinct)]
$default reduce using rule 1465 (opt_distinct)
"""""
* The whole speculative insert logic isn't really well documented. Why,
for example, do we actually need the token? And why are there no
issues with overflow? And where is it documented that a 0 means
there's no token? ...
Fair enough. Presumably it's okay that overflow theoretically could
occur, because a race is all but impossible. The token represents a
particular attempt by some backend at inserting a tuple, that needs to
be waited on specifically only if it is their active attempt (and the
xact is still running). Otherwise, you get unprincipled deadlocks.
Even if by some incredible set of circumstances it wraps around, worst
case scenario you get an unprincipled deadlock, which is hardly the end
of the world given the immense number of insertions required, and the
immense unlikelihood that things would work out that way - it'd be
basically impossible.
I'll document the "0" thing.
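For what it's worth, the token idea can be sketched like this. All names here (TokenState, next_speculative_token(), must_wait_for(), NO_SPECULATIVE_TOKEN) are hypothetical, and skipping zero on wraparound is my own assumption for the sketch, not necessarily what the patch does:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero is reserved to mean "no speculative insertion in progress". */
#define NO_SPECULATIVE_TOKEN 0u

/* Hypothetical per-backend state: the token of the latest attempt. */
typedef struct TokenState
{
    uint32_t last_token;
} TokenState;

/* Hand out a fresh token for a new insertion attempt, never returning 0. */
static uint32_t
next_speculative_token(TokenState *st)
{
    st->last_token++;  /* unsigned arithmetic wraps, which is well defined */
    if (st->last_token == NO_SPECULATIVE_TOKEN)
        st->last_token = 1;
    return st->last_token;
}

/*
 * A waiter that saw token seen_token on a tuple needs to block only if that
 * is still the inserter's active attempt and its xact is still running.
 */
static bool
must_wait_for(uint32_t seen_token, uint32_t active_token, bool xact_running)
{
    return xact_running &&
           seen_token != NO_SPECULATIVE_TOKEN &&
           seen_token == active_token;
}
```

The point of the comparison in must_wait_for() is that a waiter never blocks on an attempt the inserter has already abandoned, which is what avoids the unprincipled deadlocks.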
* Isn't "SpecType" an awfully generic (and nondescriptive) name?
OK. That'll be changed.
* /* XXX: Make sure that re-use of bits is safe here */ - no, not
unless you change existing checks.
I think that this is a restatement of your remarks on logical decoding. No?
* /*
* Immediately VACUUM "super-deleted" tuples
*/
if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))
return HEAPTUPLE_DEAD;
Is that branch really needed? Shouldn't it just be happening as a
consequence of the already existing code? Same in SatisfiesMVCC. If
you actually needed that block, it'd need to be done in SatisfiesSelf
as well, no? You have a comment about a possible loop - but that seems
wrong to me, implying that HEAP_XMIN_COMMITTED was set invalidly.
Indeed, this code is kind of odd. While I think the omission within
SatisfiesSelf() may be problematic too, if you really want to know why
this code is needed, comment it out and run Jeff's stress-test. It will
reliably break.
This code:
"""""
if (HeapTupleHeaderXminInvalid(tuple))
return HEAPTUPLE_DEAD;
"""""
and this code:
"""""
/*
* Immediately VACUUM "super-deleted" tuples
*/
if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))
return HEAPTUPLE_DEAD;
"""""
are not equivalent (nor is the latter a superset of the former). Maybe
they should be, but they're not. What's more, until this patch heap tuple
header raw xmin was never able to change, and I don't think there was any
reason for it to be InvalidTransactionId. See my new comments within
EvalPlanQualFetch() remarking on how it's now possible for that to
change (before, the comment claimed that it wasn't possible).
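To spell out why the two tests are independent, here is a self-contained sketch with a toy tuple header. ToyTupleHeader and both helper functions are hypothetical; the infomask bit values are what I believe htup_details.h uses, but treat them as illustrative. The first helper looks only at hint bits, the second only at the stored xmin:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hint bits (values assumed to match htup_details.h). */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200

/* Toy header: just the raw xmin and the infomask. */
typedef struct ToyTupleHeader
{
    TransactionId raw_xmin;
    uint16_t      infomask;
} ToyTupleHeader;

/* Hint-bit test, in the spirit of HeapTupleHeaderXminInvalid(). */
static bool
xmin_hinted_invalid(const ToyTupleHeader *tup)
{
    return (tup->infomask & (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID))
           == HEAP_XMIN_INVALID;
}

/* Raw-xmin test, in the spirit of the "super-deleted" check. */
static bool
raw_xmin_is_invalid(const ToyTupleHeader *tup)
{
    return tup->raw_xmin == InvalidTransactionId;
}
```

A tuple whose raw xmin was overwritten with InvalidTransactionId but that carries no hint bits passes only the second test, while an ordinary tuple from an aborted inserter with HEAP_XMIN_INVALID set passes only the first. Neither condition implies the other.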
Ok, I can't focus at all any further at this point. But there's enough
comments here that some even might make sense ;)
Most do. :-)
Thanks.
[1]: https://wiki.postgresql.org/wiki/UPSERT#.22Cardinality_violation.22_errors_in_detail
--
Peter Geoghegan
Attachments:
0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch (text/x-patch; charset=US-ASCII)
From ce390514f6ac94fdb30f5930a658c92a6987e371 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 26 Aug 2014 21:28:40 -0700
Subject: [PATCH 1/8] Make UPDATE privileges distinct from INSERT privileges in
RTEs
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held. This worked because
ExecCheckRTEPerms() was called with an ACL_INSERT or ACL_UPDATE
requiredPerms, and based on that it was evident which type of
optimizable statement was under consideration. Since historically no
type of optimizable statement could directly INSERT and UPDATE at the
same time, there was no ambiguity as to which privileges were required.
This largely mechanical commit is required infrastructure for the
INSERT...ON CONFLICT UPDATE feature, which introduces an optimizable
statement that may be subject to both INSERT and UPDATE permissions
enforcement. Tests follow in a later commit.
sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken.
---
contrib/sepgsql/dml.c | 31 ++++++---
src/backend/commands/copy.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/trigger.c | 22 +++---
src/backend/executor/execMain.c | 110 +++++++++++++++++++-----------
src/backend/nodes/copyfuncs.c | 3 +-
src/backend/nodes/equalfuncs.c | 3 +-
src/backend/nodes/outfuncs.c | 3 +-
src/backend/nodes/readfuncs.c | 3 +-
src/backend/optimizer/plan/setrefs.c | 6 +-
src/backend/optimizer/prep/prepsecurity.c | 6 +-
src/backend/optimizer/prep/prepunion.c | 8 ++-
src/backend/parser/analyze.c | 4 +-
src/backend/parser/parse_relation.c | 21 ++++--
src/backend/rewrite/rewriteHandler.c | 52 ++++++++------
src/include/nodes/parsenodes.h | 14 ++--
16 files changed, 176 insertions(+), 114 deletions(-)
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 36c6a37..4a71753 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -145,7 +145,8 @@ fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
static bool
check_relation_privileges(Oid relOid,
Bitmapset *selected,
- Bitmapset *modified,
+ Bitmapset *inserted,
+ Bitmapset *updated,
uint32 required,
bool abort_on_violation)
{
@@ -231,8 +232,9 @@ check_relation_privileges(Oid relOid,
* Check permissions on the columns
*/
selected = fixup_whole_row_references(relOid, selected);
- modified = fixup_whole_row_references(relOid, modified);
- columns = bms_union(selected, modified);
+ inserted = fixup_whole_row_references(relOid, inserted);
+ updated = fixup_whole_row_references(relOid, updated);
+ columns = bms_union(selected, bms_union(inserted, updated));
while ((index = bms_first_member(columns)) >= 0)
{
@@ -241,13 +243,16 @@ check_relation_privileges(Oid relOid,
if (bms_is_member(index, selected))
column_perms |= SEPG_DB_COLUMN__SELECT;
- if (bms_is_member(index, modified))
+ if (bms_is_member(index, inserted))
{
- if (required & SEPG_DB_TABLE__UPDATE)
- column_perms |= SEPG_DB_COLUMN__UPDATE;
if (required & SEPG_DB_TABLE__INSERT)
column_perms |= SEPG_DB_COLUMN__INSERT;
}
+ if (bms_is_member(index, updated))
+ {
+ if (required & SEPG_DB_TABLE__UPDATE)
+ column_perms |= SEPG_DB_COLUMN__UPDATE;
+ }
if (column_perms == 0)
continue;
@@ -304,7 +309,7 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
required |= SEPG_DB_TABLE__INSERT;
if (rte->requiredPerms & ACL_UPDATE)
{
- if (!bms_is_empty(rte->modifiedCols))
+ if (!bms_is_empty(rte->updatedCols))
required |= SEPG_DB_TABLE__UPDATE;
else
required |= SEPG_DB_TABLE__LOCK;
@@ -333,7 +338,8 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
{
Oid tableOid = lfirst_oid(li);
Bitmapset *selectedCols;
- Bitmapset *modifiedCols;
+ Bitmapset *insertedCols;
+ Bitmapset *updatedCols;
/*
* child table has different attribute numbers, so we need to fix
@@ -341,15 +347,18 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
*/
selectedCols = fixup_inherited_columns(rte->relid, tableOid,
rte->selectedCols);
- modifiedCols = fixup_inherited_columns(rte->relid, tableOid,
- rte->modifiedCols);
+ insertedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->insertedCols);
+ updatedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->updatedCols);
/*
* check permissions on individual tables
*/
if (!check_relation_privileges(tableOid,
selectedCols,
- modifiedCols,
+ insertedCols,
+ updatedCols,
required, abort_on_violation))
return false;
}
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 92ff632..d2996fb 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -847,7 +847,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
FirstLowInvalidHeapAttributeNumber;
if (is_from)
- rte->modifiedCols = bms_add_member(rte->modifiedCols, attno);
+ rte->insertedCols = bms_add_member(rte->insertedCols, attno);
else
rte->selectedCols = bms_add_member(rte->selectedCols, attno);
}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index c961429..bf2235d 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -433,7 +433,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
rte->requiredPerms = ACL_INSERT;
for (attnum = 1; attnum <= intoRelationDesc->rd_att->natts; attnum++)
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attnum - FirstLowInvalidHeapAttributeNumber);
ExecCheckRTPerms(list_make1(rte), true);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 5c1c1be..7defe80 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -71,8 +71,8 @@ static int MyTriggerDepth = 0;
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
/* Local function prototypes */
static void ConvertTriggerToFK(CreateTrigStmt *stmt, Oid funcoid);
@@ -2343,7 +2343,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TriggerDesc *trigdesc;
int i;
TriggerData LocTriggerData;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
trigdesc = relinfo->ri_TrigDesc;
@@ -2352,7 +2352,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (!trigdesc->trig_update_before_statement)
return;
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
LocTriggerData.type = T_TriggerData;
LocTriggerData.tg_event = TRIGGER_EVENT_UPDATE |
@@ -2373,7 +2373,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, NULL, NULL))
+ updatedCols, NULL, NULL))
continue;
LocTriggerData.tg_trigger = trigger;
@@ -2398,7 +2398,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (trigdesc && trigdesc->trig_update_after_statement)
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
false, NULL, NULL, NIL,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
}
TupleTableSlot *
@@ -2416,7 +2416,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
HeapTuple oldtuple;
TupleTableSlot *newSlot;
int i;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
Bitmapset *keyCols;
LockTupleMode lockmode;
@@ -2425,10 +2425,10 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
* been modified, then we can use a weaker lock, allowing for better
* concurrency.
*/
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
keyCols = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc,
INDEX_ATTR_BITMAP_KEY);
- if (bms_overlap(keyCols, modifiedCols))
+ if (bms_overlap(keyCols, updatedCols))
lockmode = LockTupleExclusive;
else
lockmode = LockTupleNoKeyExclusive;
@@ -2482,7 +2482,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, trigtuple, newtuple))
+ updatedCols, trigtuple, newtuple))
continue;
LocTriggerData.tg_trigtuple = trigtuple;
@@ -2552,7 +2552,7 @@ ExecARUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
true, trigtuple, newtuple, recheckIndexes,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 20b3188..f20f1e8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -82,6 +82,9 @@ static void ExecutePlan(EState *estate, PlanState *planstate,
ScanDirection direction,
DestReceiver *dest);
static bool ExecCheckRTEPerms(RangeTblEntry *rte);
+static bool ExecCheckRTEPermsModified(Oid relOid, Oid userid,
+ Bitmapset *modifiedCols,
+ AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static char *ExecBuildSlotValueDescription(Oid reloid,
TupleTableSlot *slot,
@@ -97,8 +100,10 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
+#define GetInsertedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->insertedCols)
/* end of local decls */
@@ -559,7 +564,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
AclMode remainingPerms;
Oid relOid;
Oid userid;
- int col;
/*
* Only plain-relation RTEs need to be checked here. Function RTEs are
@@ -597,6 +601,8 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
remainingPerms = requiredPerms & ~relPerms;
if (remainingPerms != 0)
{
+ int col = -1;
+
/*
* If we lack any permissions that exist only as relation permissions,
* we can fail straight away.
@@ -625,7 +631,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
return false;
}
- col = -1;
while ((col = bms_next_member(rte->selectedCols, col)) >= 0)
{
/* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
@@ -648,43 +653,63 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
}
/*
- * Basically the same for the mod columns, with either INSERT or
- * UPDATE privilege as specified by remainingPerms.
+ * Basically the same for the mod columns, for both INSERT and UPDATE
+ * privilege as specified by remainingPerms.
*/
- remainingPerms &= ~ACL_SELECT;
- if (remainingPerms != 0)
- {
- /*
- * When the query doesn't explicitly change any columns, allow the
- * query if we have permission on any column of the rel. This is
- * to handle SELECT FOR UPDATE as well as possible corner cases in
- * INSERT and UPDATE.
- */
- if (bms_is_empty(rte->modifiedCols))
- {
- if (pg_attribute_aclcheck_all(relOid, userid, remainingPerms,
- ACLMASK_ANY) != ACLCHECK_OK)
- return false;
- }
+ if (remainingPerms & ACL_INSERT && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->insertedCols,
+ ACL_INSERT))
+ return false;
- col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
- {
- /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
- AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+ if (remainingPerms & ACL_UPDATE && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->updatedCols,
+ ACL_UPDATE))
+ return false;
+ }
+ return true;
+}
- if (attno == InvalidAttrNumber)
- {
- /* whole-row reference can't happen here */
- elog(ERROR, "whole-row update is not implemented");
- }
- else
- {
- if (pg_attribute_aclcheck(relOid, attno, userid,
- remainingPerms) != ACLCHECK_OK)
- return false;
- }
- }
+/*
+ * ExecCheckRTEPermsModified
+ * Check INSERT or UPDATE access permissions for a single RTE (these
+ * are processed uniformly).
+ */
+static bool
+ExecCheckRTEPermsModified(Oid relOid, Oid userid, Bitmapset *modifiedCols,
+ AclMode requiredPerms)
+{
+ int col = -1;
+
+ /*
+ * When the query doesn't explicitly update any columns, allow the
+ * query if we have permission on any column of the rel. This is
+ * to handle SELECT FOR UPDATE as well as possible corner cases in
+ * UPDATE.
+ */
+ if (bms_is_empty(modifiedCols))
+ {
+ if (pg_attribute_aclcheck_all(relOid, userid, requiredPerms,
+ ACLMASK_ANY) != ACLCHECK_OK)
+ return false;
+ }
+
+ while ((col = bms_next_member(modifiedCols, col)) >= 0)
+ {
+ /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
+ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+
+ if (attno == InvalidAttrNumber)
+ {
+ /* whole-row reference can't happen here */
+ elog(ERROR, "whole-row update is not implemented");
+ }
+ else
+ {
+ if (pg_attribute_aclcheck(relOid, attno, userid,
+ requiredPerms) != ACLCHECK_OK)
+ return false;
}
}
return true;
@@ -1623,7 +1648,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1649,7 +1675,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1708,7 +1735,8 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f1a24f5..00ffe4a 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2028,7 +2028,8 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(requiredPerms);
COPY_SCALAR_FIELD(checkAsUser);
COPY_BITMAPSET_FIELD(selectedCols);
- COPY_BITMAPSET_FIELD(modifiedCols);
+ COPY_BITMAPSET_FIELD(insertedCols);
+ COPY_BITMAPSET_FIELD(updatedCols);
COPY_NODE_FIELD(securityQuals);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6e8b308..79035b2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2345,7 +2345,8 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(requiredPerms);
COMPARE_SCALAR_FIELD(checkAsUser);
COMPARE_BITMAPSET_FIELD(selectedCols);
- COMPARE_BITMAPSET_FIELD(modifiedCols);
+ COMPARE_BITMAPSET_FIELD(insertedCols);
+ COMPARE_BITMAPSET_FIELD(updatedCols);
COMPARE_NODE_FIELD(securityQuals);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index dd1278b..b4a2667 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2456,7 +2456,8 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
WRITE_UINT_FIELD(requiredPerms);
WRITE_OID_FIELD(checkAsUser);
WRITE_BITMAPSET_FIELD(selectedCols);
- WRITE_BITMAPSET_FIELD(modifiedCols);
+ WRITE_BITMAPSET_FIELD(insertedCols);
+ WRITE_BITMAPSET_FIELD(updatedCols);
WRITE_NODE_FIELD(securityQuals);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ae24d05..dbc162a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1253,7 +1253,8 @@ _readRangeTblEntry(void)
READ_UINT_FIELD(requiredPerms);
READ_OID_FIELD(checkAsUser);
READ_BITMAPSET_FIELD(selectedCols);
- READ_BITMAPSET_FIELD(modifiedCols);
+ READ_BITMAPSET_FIELD(insertedCols);
+ READ_BITMAPSET_FIELD(updatedCols);
READ_NODE_FIELD(securityQuals);
READ_DONE();
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7703946..5d865b0 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -368,9 +368,9 @@ flatten_rtes_walker(Node *node, PlannerGlobal *glob)
*
* In the flat rangetable, we zero out substructure pointers that are not
* needed by the executor; this reduces the storage space and copying cost
- * for cached plans. We keep only the alias and eref Alias fields, which
- * are needed by EXPLAIN, and the selectedCols and modifiedCols bitmaps,
- * which are needed for executor-startup permissions checking and for
+ * for cached plans. We keep only the alias and eref Alias fields, which are
+ * needed by EXPLAIN, and the selectedCols, insertedCols and updatedCols
+ * bitmaps, which are needed for executor-startup permissions checking and for
* trigger event checking.
*/
static void
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index af3ee61..f86e792 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -115,7 +115,8 @@ expand_security_quals(PlannerInfo *root, List *tlist)
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the original relation
@@ -213,7 +214,8 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Now deal with any PlanRowMark on this RTE by requesting a lock
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 05f601e..1e28363 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1367,14 +1367,16 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
* if this is the parent table, leave copyObject's result alone.
*
* Note: we need to do this even though the executor won't run any
- * permissions checks on the child RTE. The modifiedCols bitmap may
- * be examined for trigger-firing purposes.
+ * permissions checks on the child RTE. The insertedCols/updatedCols
+ * bitmaps may be examined for trigger-firing purposes.
*/
if (childOID != parentOID)
{
childrte->selectedCols = translate_col_privs(rte->selectedCols,
appinfo->translated_vars);
- childrte->modifiedCols = translate_col_privs(rte->modifiedCols,
+ childrte->insertedCols = translate_col_privs(rte->insertedCols,
+ appinfo->translated_vars);
+ childrte->updatedCols = translate_col_privs(rte->updatedCols,
appinfo->translated_vars);
}
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..df89065 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -733,7 +733,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
false);
qry->targetList = lappend(qry->targetList, tle);
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attr_num - FirstLowInvalidHeapAttributeNumber);
icols = lnext(icols);
@@ -2002,7 +2002,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
origTarget->location);
/* Mark the target column as requiring update permissions */
- target_rte->modifiedCols = bms_add_member(target_rte->modifiedCols,
+ target_rte->updatedCols = bms_add_member(target_rte->updatedCols,
attrno - FirstLowInvalidHeapAttributeNumber);
origTargetList = lnext(origTargetList);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 8d4f79f..d2820d8 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1052,7 +1052,8 @@ addRangeTableEntry(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1105,7 +1106,8 @@ addRangeTableEntryForRelation(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1183,7 +1185,8 @@ addRangeTableEntryForSubquery(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1437,7 +1440,8 @@ addRangeTableEntryForFunction(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1509,7 +1513,8 @@ addRangeTableEntryForValues(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1577,7 +1582,8 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1677,7 +1683,8 @@ addRangeTableEntryForCTE(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index b8e6e7a..fab2948 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1403,7 +1403,8 @@ ApplyRetrieveRule(Query *parsetree,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the view should remain as
@@ -1466,12 +1467,14 @@ ApplyRetrieveRule(Query *parsetree,
subrte->requiredPerms = rte->requiredPerms;
subrte->checkAsUser = rte->checkAsUser;
subrte->selectedCols = rte->selectedCols;
- subrte->modifiedCols = rte->modifiedCols;
+ subrte->insertedCols = rte->insertedCols;
+ subrte->updatedCols = rte->updatedCols;
rte->requiredPerms = 0; /* no permission check on subquery itself */
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* If FOR [KEY] UPDATE/SHARE of view, mark all the contained tables as
@@ -2584,9 +2587,9 @@ rewriteTargetView(Query *parsetree, Relation view)
/*
* For INSERT/UPDATE the modified columns must all be updatable. Note that
* we get the modified columns from the query's targetlist, not from the
- * result RTE's modifiedCols set, since rewriteTargetListIU may have added
- * additional targetlist entries for view defaults, and these must also be
- * updatable.
+ * result RTE's insertedCols and/or updatedCols set, since
+ * rewriteTargetListIU may have added additional targetlist entries for
+ * view defaults, and these must also be updatable.
*/
if (parsetree->commandType != CMD_DELETE)
{
@@ -2723,26 +2726,31 @@ rewriteTargetView(Query *parsetree, Relation view)
*
* Initially, new_rte contains selectedCols permission check bits for all
* base-rel columns referenced by the view, but since the view is a SELECT
- * query its modifiedCols is empty. We set modifiedCols to include all
- * the columns the outer query is trying to modify, adjusting the column
- * numbers as needed. But we leave selectedCols as-is, so the view owner
- * must have read permission for all columns used in the view definition,
- * even if some of them are not read by the outer query. We could try to
- * limit selectedCols to only columns used in the transformed query, but
- * that does not correspond to what happens in ordinary SELECT usage of a
- * view: all referenced columns must have read permission, even if
- * optimization finds that some of them can be discarded during query
- * transformation. The flattening we're doing here is an optional
- * optimization, too. (If you are unpersuaded and want to change this,
- * note that applying adjust_view_column_set to view_rte->selectedCols is
- * clearly *not* the right answer, since that neglects base-rel columns
- * used in the view's WHERE quals.)
+ * query its insertedCols/updatedCols is empty. We set insertedCols and
+ * updatedCols to include all the columns the outer query is trying to
+ * modify, adjusting the column numbers as needed. But we leave
+ * selectedCols as-is, so the view owner must have read permission for all
+ * columns used in the view definition, even if some of them are not read
+ * by the outer query. We could try to limit selectedCols to only columns
+ * used in the transformed query, but that does not correspond to what
+ * happens in ordinary SELECT usage of a view: all referenced columns must
+ * have read permission, even if optimization finds that some of them can
+ * be discarded during query transformation. The flattening we're doing
+ * here is an optional optimization, too. (If you are unpersuaded and want
+ * to change this, note that applying adjust_view_column_set to
+ * view_rte->selectedCols is clearly *not* the right answer, since that
+ * neglects base-rel columns used in the view's WHERE quals.)
*
* This step needs the modified view targetlist, so we have to do things
* in this order.
*/
- Assert(bms_is_empty(new_rte->modifiedCols));
- new_rte->modifiedCols = adjust_view_column_set(view_rte->modifiedCols,
+ Assert(bms_is_empty(new_rte->insertedCols) &&
+ bms_is_empty(new_rte->updatedCols));
+
+ new_rte->insertedCols = adjust_view_column_set(view_rte->insertedCols,
+ view_targetlist);
+
+ new_rte->updatedCols = adjust_view_column_set(view_rte->updatedCols,
view_targetlist);
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..86d1c07 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -717,11 +717,12 @@ typedef struct XmlSerialize
* For SELECT/INSERT/UPDATE permissions, if the user doesn't have
* table-wide permissions then it is sufficient to have the permissions
* on all columns identified in selectedCols (for SELECT) and/or
- * modifiedCols (for INSERT/UPDATE; we can tell which from the query type).
- * selectedCols and modifiedCols are bitmapsets, which cannot have negative
- * integer members, so we subtract FirstLowInvalidHeapAttributeNumber from
- * column numbers before storing them in these fields. A whole-row Var
- * reference is represented by setting the bit for InvalidAttrNumber.
+ * insertedCols and/or updatedCols (INSERT with ON CONFLICT UPDATE may
+ * have all 3). selectedCols, insertedCols and updatedCols are
+ * bitmapsets, which cannot have negative integer members, so we subtract
+ * FirstLowInvalidHeapAttributeNumber from column numbers before storing
+ * them in these fields. A whole-row Var reference is represented by
+ * setting the bit for InvalidAttrNumber.
*--------------------
*/
typedef enum RTEKind
@@ -816,7 +817,8 @@ typedef struct RangeTblEntry
AclMode requiredPerms; /* bitmask of required access permissions */
Oid checkAsUser; /* if valid, check access as this role */
Bitmapset *selectedCols; /* columns needing SELECT permission */
- Bitmapset *modifiedCols; /* columns needing INSERT/UPDATE permission */
+ Bitmapset *insertedCols; /* columns needing INSERT permission */
+ Bitmapset *updatedCols; /* columns needing UPDATE permission */
List *securityQuals; /* any security barrier quals to apply */
} RangeTblEntry;
--
1.9.1
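The parsenodes.h comment in the patch above describes how column numbers are shifted by FirstLowInvalidHeapAttributeNumber before they are stored, and how one RTE can now carry three separate column sets. What follows is only a minimal sketch of that encoding, not the PostgreSQL implementation: ColSet is a hypothetical single-word stand-in for Bitmapset, DemoRTE is a hypothetical cut-down RangeTblEntry, and the offset value of -8 is assumed for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value, for illustration only; the real constant is defined
 * in PostgreSQL's own headers. */
#define DEMO_FIRST_LOW_INVALID_ATTNO (-8)

/* Hypothetical stand-in for Bitmapset: a single 64-bit word. */
typedef uint64_t ColSet;

/* Hypothetical, cut-down range table entry with the three column sets. */
typedef struct DemoRTE
{
	ColSet		selectedCols;	/* columns needing SELECT permission */
	ColSet		insertedCols;	/* columns needing INSERT permission */
	ColSet		updatedCols;	/* columns needing UPDATE permission */
} DemoRTE;

/* Store attno after shifting it so that the stored member is never negative. */
static ColSet
colset_add(ColSet set, int attno)
{
	int			bit = attno - DEMO_FIRST_LOW_INVALID_ATTNO;

	assert(bit >= 0 && bit < 64);
	return set | (UINT64_C(1) << bit);
}

/* Test membership using the same shift. */
static bool
colset_has(ColSet set, int attno)
{
	int			bit = attno - DEMO_FIRST_LOW_INVALID_ATTNO;

	if (bit < 0 || bit >= 64)
		return false;
	return ((set >> bit) & 1) != 0;
}

/*
 * Build the sets for a hypothetical upsert that inserts columns 1 and 2,
 * and on conflict updates column 2 using its existing value.
 */
static DemoRTE
demo_upsert_rte(void)
{
	DemoRTE		rte = {0, 0, 0};

	rte.insertedCols = colset_add(rte.insertedCols, 1);
	rte.insertedCols = colset_add(rte.insertedCols, 2);
	rte.updatedCols = colset_add(rte.updatedCols, 2);
	rte.selectedCols = colset_add(rte.selectedCols, 2);
	return rte;
}
```

With a single modifiedCols set, column 1 in this example could not be told apart from column 2 once both INSERT and UPDATE are required; with separate sets, column 1 needs only INSERT permission.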
On Wed, Feb 4, 2015 at 10:03 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Wed, Feb 4, 2015 at 9:54 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I had compiled with -O0, but without assertions. I tried now again with -O3.
It's been running for about 10 minutes now, and I haven't seen any errors.
Did you run with an artificially high XID burn rate (i.e. did you also
apply Jeff's modifications to Postgres, and specify a high burn rate
using his custom GUC)? Maybe that was important.
Excuse me: I now see that you specifically indicated that you did. But
looks like your XID burn rate was quite a lot higher than mine
(assuming that you were consistent in using "jj_xid=10000", although
I'm not asserting that that was significant).
I attach a log of output from an example session where exclusion
constraints are shown to break (plus the corresponding server log,
plus /proc/cpuinfo on the off chance that that's significant). As you
can see from the fact that the span of time recorded in the log is only a
couple of minutes, this is really easy for me to
recreate... sometimes. I could not recreate the problem with only 4
clients (on this 8 core server) after a few dozen attempts, and then I
couldn't recreate the issue at all, so clearly those details matter.
It might have something to do with CPU scaling, which I've found can
significantly affect outcomes for things like this (looks like my
hosting provider changed settings in the system BIOS recently, such
that I cannot set the CPU governor to "performance").
Perhaps you could take a crack at recreating this, Jeff?
Thanks
--
Peter Geoghegan
On 29 January 2015 at 23:38, Peter Geoghegan <pg@heroku.com> wrote:
On Sat, Jan 17, 2015 at 6:48 PM, Peter Geoghegan <pg@heroku.com> wrote:
I continued with this since posting V2.0.
Attached version (V2.1) fixes bit-rot caused by the recent changes by
Stephen ("Fix column-privilege leak in error-message paths"). More
precisely, it is rebased on top of today's 17792b commit.
Patch 0002 no longer applies due to a conflict in
src/backend/executor/execUtils.c.
Thom
On Wed, Feb 4, 2015 at 04:49:46PM -0800, Peter Geoghegan wrote:
On Tue, Feb 2, 2015 at 01:06 AM, Andres Freund <andres@2ndquadrant.com> wrote:
A first (not actually that quick :() look through the patches to see
what actually happened in the last months. I didn't keep up with the
thread.
So, let me get this out of the way: This is the first in-depth
technical review that this work has had in a long time. Thank you for
your help here.
I looked at all the patches too. The patch is only 9k lines, not huge.
Other than the locking part, the biggest part of this patch is adjusting
things so that an INSERT can change into an UPDATE. The code that
handles SELECT/INSERT/UPDATE/DELETE is already complex, and this makes
it even more so. I have no idea how we can be sure we have hit every
single case, but I am also unclear how we will _ever_ know we have hit
them all.
We know people want this feature, and this patch seems to be our best
bet for getting it. If we push this off to 9.6, I am not sure what that
buys us.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 6, 2015 at 1:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
Other than the locking part, the biggest part of this patch is adjusting
things so that an INSERT can change into an UPDATE.
Thanks for taking a look at it. That's somewhat cleaned up in the
attached patchseries - V2.2. This has been rebased to repair the minor
bit-rot pointed out by Thom.
Highlights:
* Better parser representation.
The auxiliary UPDATE never uses its own RangeVar (in fact,
UpdateStmt.relation is never set). This eliminates any possibility of
repeated name lookup problems, addressing Andres' concern. But it also
results in better code. The auxiliary UPDATE is not modified by the
parent INSERT at all - rather, the UPDATE knows to fetch its target
Relation from the parsestate parent INSERT. There is no need to
artificially cut off the parent within the auxiliary UPDATE to make
sure the independent RTE is not visible (during parse analysis, prior
to merging the two, as happened in earlier revisions) - that was a
kludge that I'm glad to be rid of. There is no merging of distinct
INSERT and UPDATE Relations/RTEs because there is only ever one
Relation/RTE to begin with. Previously, the merging merged RTE
selectedCols and updatedCols into the parent INSERT (for column-level
privileges, for example). I'm also a lot less cute about determining
whether an UPDATE is an auxiliary UPDATE from within the parser, which
was also a concern raised by Andres.
About 90% of the special case code previously in transformInsertStmt()
is now in setTargetTable(). This is a significant improvement all
around, since the code is now more consistent with existing parse
analysis code - setTargetTable() is naturally where the auxiliary
UPDATE figures out details on its target, and builds an EXCLUDED RTE
and adds it to the namespace as a special case (just like for regular
UPDATE targets, which similarly get added to the namespace +
joinlist).
All of this implies a slight behavioral change (which is documented):
The TARGET.* alias is now visible everywhere. So you see it within
every node of EXPLAIN output, and if you want to qualify a RETURNING
column, the TARGET.* alias must be used (not the original table name).
I think that this is an improvement too, although it is arguably a
slight behavioral change to INSERTs in general (can't think why anyone
would particularly want to qualify with an alias in INSERT's
RETURNING, though). Note that the EXCLUDED.* pseudo-alias is still
only visible within the UPDATE's targetlist and WHERE clause. I think
it would be a bad idea to make the EXCLUDED.* tuples visible from
RETURNING [1].
* Cleaner ExecInsert() control flow. Andres rightly complained that
the existing control flow was convoluted. I believe he will find this
revision a lot clearer, although I have not gone so far as creating
something like an ExecUpsert().
* Better documentation. The executor README has been overhauled to
describe the flow of things from a higher level. The procarray changes
are better documented by comments, too.
* Special work previously within BuildIndexInfo() that is needed for
unique indexes for the UPSERT case only is now done only in the UPSERT
case. There is now no added overhead in BuildIndexInfo() for existing
cases.
* Worked on feedback on various points of style raised by Andres (e.g.
an errhint() was removed).
* Better explanation of the re-use of XLOG_HEAP* flag bits. I believe
that it's fine to reuse the "(1<<7)" bit, given that each distinct use
of the bit can only appear in distinct record types (that is, the bit
is used by xl_heap_multi_insert, and now xl_heap_delete). Those two
uses should be mutually exclusive. It's not as if we've had to be
economical with the use of heap flag XLog record bits before now, so
the best approach here isn't obvious. For now, I see no problem with
this reuse.
* SnapshotSelf (that is, HeapTupleSatisfiesSelf()) has additions
analogous to previous additions to the HeapTupleSatisfiesVacuum() and
HeapTupleSatisfiesMVCC() visibility routines. I still don't think that
the changes to tqual.c are completely satisfactory, but as long as
they're directly necessary (which they evidently are - Jeff's
stress-testing tool shows that) then I should at least make the
appropriate changes everywhere. We should definitely focus on why
they're necessary, and consider if we can do better.
* There was some squashing of commits, since Andres felt that they
weren't all useful as separate commits. I've still split out the RTE
permissions commit, as well as the RLS commit (plus the documentation
and test commits, FWIW). I hope that this will make it easier to
review parts of the patch, without being generally excessive.
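The flag-bit reuse discussed in the highlights above can be pictured with a small sketch. This is not the patch's WAL code: the record type enum, the flag macro names and the decoder are all hypothetical, and they only illustrate why one bit value can carry two meanings safely when each meaning is confined to its own record type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record types; each has its own flags byte. */
typedef enum DemoRecordType
{
	DEMO_REC_MULTI_INSERT,
	DEMO_REC_DELETE
} DemoRecordType;

/* The same bit value, given a different meaning per record type. */
#define DEMO_MULTI_INSERT_FLAG	(1 << 7)	/* hypothetical multi-insert flag */
#define DEMO_DELETE_FLAG		(1 << 7)	/* hypothetical delete flag */

typedef enum DemoMeaning
{
	DEMO_BIT_CLEAR,
	DEMO_MULTI_INSERT_MEANING,
	DEMO_DELETE_MEANING
} DemoMeaning;

/*
 * Interpret bit 7 of a record's flags.  The record type always selects the
 * interpretation, so the two uses of the bit can never be confused.
 */
static DemoMeaning
demo_decode_bit7(DemoRecordType type, uint8_t flags)
{
	assert(type == DEMO_REC_MULTI_INSERT || type == DEMO_REC_DELETE);

	switch (type)
	{
		case DEMO_REC_MULTI_INSERT:
			return (flags & DEMO_MULTI_INSERT_FLAG) ?
				DEMO_MULTI_INSERT_MEANING : DEMO_BIT_CLEAR;
		case DEMO_REC_DELETE:
			return (flags & DEMO_DELETE_FLAG) ?
				DEMO_DELETE_MEANING : DEMO_BIT_CLEAR;
	}
	return DEMO_BIT_CLEAR;
}
```

The cost of this arrangement is that the bit has no meaning without the record type next to it, which is part of why the best approach is described above as not obvious.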
When documentation and tests are left out, the entire patch series comes to:
68 files changed, 2958 insertions(+), 297 deletions(-)
which is not too big.
Thanks
[1]: /messages/by-id/CAM3SWZTcpy9rroLM3TkfuU4HDLrEtuGzxLptGn2vLhVAFwQCVA@mail.gmail.com
--
Peter Geoghegan
Attachments:
0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch (text/x-patch; charset=US-ASCII)
From b353b143e2d1a2b9379b8d75b93586b603ec50df Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 26 Aug 2014 21:28:40 -0700
Subject: [PATCH 1/6] Make UPDATE privileges distinct from INSERT privileges in
RTEs
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held. This worked because
ExecCheckRTEPerms() was called with an ACL_INSERT or ACL_UPDATE
requiredPerms, and based on that it was evident which type of
optimizable statement was under consideration. Since historically no
type of optimizable statement could directly INSERT and UPDATE at the
same time, there was no ambiguity as to which privileges were required.
This largely mechanical commit is required infrastructure for the
INSERT...ON CONFLICT UPDATE feature, which introduces an optimizable
statement that may be subject to both INSERT and UPDATE permissions
enforcement. Tests follow in a later commit.
sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken.
---
contrib/sepgsql/dml.c | 31 ++++++---
src/backend/commands/copy.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/trigger.c | 22 +++---
src/backend/executor/execMain.c | 110 +++++++++++++++++++-----------
src/backend/nodes/copyfuncs.c | 3 +-
src/backend/nodes/equalfuncs.c | 3 +-
src/backend/nodes/outfuncs.c | 3 +-
src/backend/nodes/readfuncs.c | 3 +-
src/backend/optimizer/plan/setrefs.c | 6 +-
src/backend/optimizer/prep/prepsecurity.c | 6 +-
src/backend/optimizer/prep/prepunion.c | 8 ++-
src/backend/parser/analyze.c | 4 +-
src/backend/parser/parse_relation.c | 21 ++++--
src/backend/rewrite/rewriteHandler.c | 52 ++++++++------
src/include/nodes/parsenodes.h | 14 ++--
16 files changed, 176 insertions(+), 114 deletions(-)
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 36c6a37..4a71753 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -145,7 +145,8 @@ fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
static bool
check_relation_privileges(Oid relOid,
Bitmapset *selected,
- Bitmapset *modified,
+ Bitmapset *inserted,
+ Bitmapset *updated,
uint32 required,
bool abort_on_violation)
{
@@ -231,8 +232,9 @@ check_relation_privileges(Oid relOid,
* Check permissions on the columns
*/
selected = fixup_whole_row_references(relOid, selected);
- modified = fixup_whole_row_references(relOid, modified);
- columns = bms_union(selected, modified);
+ inserted = fixup_whole_row_references(relOid, inserted);
+ updated = fixup_whole_row_references(relOid, updated);
+ columns = bms_union(selected, bms_union(inserted, updated));
while ((index = bms_first_member(columns)) >= 0)
{
@@ -241,13 +243,16 @@ check_relation_privileges(Oid relOid,
if (bms_is_member(index, selected))
column_perms |= SEPG_DB_COLUMN__SELECT;
- if (bms_is_member(index, modified))
+ if (bms_is_member(index, inserted))
{
- if (required & SEPG_DB_TABLE__UPDATE)
- column_perms |= SEPG_DB_COLUMN__UPDATE;
if (required & SEPG_DB_TABLE__INSERT)
column_perms |= SEPG_DB_COLUMN__INSERT;
}
+ if (bms_is_member(index, updated))
+ {
+ if (required & SEPG_DB_TABLE__UPDATE)
+ column_perms |= SEPG_DB_COLUMN__UPDATE;
+ }
if (column_perms == 0)
continue;
@@ -304,7 +309,7 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
required |= SEPG_DB_TABLE__INSERT;
if (rte->requiredPerms & ACL_UPDATE)
{
- if (!bms_is_empty(rte->modifiedCols))
+ if (!bms_is_empty(rte->updatedCols))
required |= SEPG_DB_TABLE__UPDATE;
else
required |= SEPG_DB_TABLE__LOCK;
@@ -333,7 +338,8 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
{
Oid tableOid = lfirst_oid(li);
Bitmapset *selectedCols;
- Bitmapset *modifiedCols;
+ Bitmapset *insertedCols;
+ Bitmapset *updatedCols;
/*
* child table has different attribute numbers, so we need to fix
@@ -341,15 +347,18 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
*/
selectedCols = fixup_inherited_columns(rte->relid, tableOid,
rte->selectedCols);
- modifiedCols = fixup_inherited_columns(rte->relid, tableOid,
- rte->modifiedCols);
+ insertedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->insertedCols);
+ updatedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->updatedCols);
/*
* check permissions on individual tables
*/
if (!check_relation_privileges(tableOid,
selectedCols,
- modifiedCols,
+ insertedCols,
+ updatedCols,
required, abort_on_violation))
return false;
}
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 92ff632..d2996fb 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -847,7 +847,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
FirstLowInvalidHeapAttributeNumber;
if (is_from)
- rte->modifiedCols = bms_add_member(rte->modifiedCols, attno);
+ rte->insertedCols = bms_add_member(rte->insertedCols, attno);
else
rte->selectedCols = bms_add_member(rte->selectedCols, attno);
}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index c961429..bf2235d 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -433,7 +433,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
rte->requiredPerms = ACL_INSERT;
for (attnum = 1; attnum <= intoRelationDesc->rd_att->natts; attnum++)
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attnum - FirstLowInvalidHeapAttributeNumber);
ExecCheckRTPerms(list_make1(rte), true);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 5c1c1be..7defe80 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -71,8 +71,8 @@ static int MyTriggerDepth = 0;
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
/* Local function prototypes */
static void ConvertTriggerToFK(CreateTrigStmt *stmt, Oid funcoid);
@@ -2343,7 +2343,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TriggerDesc *trigdesc;
int i;
TriggerData LocTriggerData;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
trigdesc = relinfo->ri_TrigDesc;
@@ -2352,7 +2352,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (!trigdesc->trig_update_before_statement)
return;
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
LocTriggerData.type = T_TriggerData;
LocTriggerData.tg_event = TRIGGER_EVENT_UPDATE |
@@ -2373,7 +2373,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, NULL, NULL))
+ updatedCols, NULL, NULL))
continue;
LocTriggerData.tg_trigger = trigger;
@@ -2398,7 +2398,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (trigdesc && trigdesc->trig_update_after_statement)
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
false, NULL, NULL, NIL,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
}
TupleTableSlot *
@@ -2416,7 +2416,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
HeapTuple oldtuple;
TupleTableSlot *newSlot;
int i;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
Bitmapset *keyCols;
LockTupleMode lockmode;
@@ -2425,10 +2425,10 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
* been modified, then we can use a weaker lock, allowing for better
* concurrency.
*/
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
keyCols = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc,
INDEX_ATTR_BITMAP_KEY);
- if (bms_overlap(keyCols, modifiedCols))
+ if (bms_overlap(keyCols, updatedCols))
lockmode = LockTupleExclusive;
else
lockmode = LockTupleNoKeyExclusive;
@@ -2482,7 +2482,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, trigtuple, newtuple))
+ updatedCols, trigtuple, newtuple))
continue;
LocTriggerData.tg_trigtuple = trigtuple;
@@ -2552,7 +2552,7 @@ ExecARUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
true, trigtuple, newtuple, recheckIndexes,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 33b172b..dbcebb7 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -82,6 +82,9 @@ static void ExecutePlan(EState *estate, PlanState *planstate,
ScanDirection direction,
DestReceiver *dest);
static bool ExecCheckRTEPerms(RangeTblEntry *rte);
+static bool ExecCheckRTEPermsModified(Oid relOid, Oid userid,
+ Bitmapset *modifiedCols,
+ AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static char *ExecBuildSlotValueDescription(Oid reloid,
TupleTableSlot *slot,
@@ -97,8 +100,10 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
+#define GetInsertedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->insertedCols)
/* end of local decls */
@@ -559,7 +564,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
AclMode remainingPerms;
Oid relOid;
Oid userid;
- int col;
/*
* Only plain-relation RTEs need to be checked here. Function RTEs are
@@ -597,6 +601,8 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
remainingPerms = requiredPerms & ~relPerms;
if (remainingPerms != 0)
{
+ int col = -1;
+
/*
* If we lack any permissions that exist only as relation permissions,
* we can fail straight away.
@@ -625,7 +631,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
return false;
}
- col = -1;
while ((col = bms_next_member(rte->selectedCols, col)) >= 0)
{
/* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
@@ -648,43 +653,63 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
}
/*
- * Basically the same for the mod columns, with either INSERT or
- * UPDATE privilege as specified by remainingPerms.
+ * Basically the same for the mod columns, for both INSERT and UPDATE
+ * privilege as specified by remainingPerms.
*/
- remainingPerms &= ~ACL_SELECT;
- if (remainingPerms != 0)
- {
- /*
- * When the query doesn't explicitly change any columns, allow the
- * query if we have permission on any column of the rel. This is
- * to handle SELECT FOR UPDATE as well as possible corner cases in
- * INSERT and UPDATE.
- */
- if (bms_is_empty(rte->modifiedCols))
- {
- if (pg_attribute_aclcheck_all(relOid, userid, remainingPerms,
- ACLMASK_ANY) != ACLCHECK_OK)
- return false;
- }
+ if (remainingPerms & ACL_INSERT && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->insertedCols,
+ ACL_INSERT))
+ return false;
- col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
- {
- /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
- AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+ if (remainingPerms & ACL_UPDATE && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->updatedCols,
+ ACL_UPDATE))
+ return false;
+ }
+ return true;
+}
- if (attno == InvalidAttrNumber)
- {
- /* whole-row reference can't happen here */
- elog(ERROR, "whole-row update is not implemented");
- }
- else
- {
- if (pg_attribute_aclcheck(relOid, attno, userid,
- remainingPerms) != ACLCHECK_OK)
- return false;
- }
- }
+/*
+ * ExecCheckRTEPermsModified
+ * Check INSERT or UPDATE access permissions for a single RTE (these
+ * are processed uniformly).
+ */
+static bool
+ExecCheckRTEPermsModified(Oid relOid, Oid userid, Bitmapset *modifiedCols,
+ AclMode requiredPerms)
+{
+ int col = -1;
+
+ /*
+ * When the query doesn't explicitly update any columns, allow the
+ * query if we have permission on any column of the rel. This is
+ * to handle SELECT FOR UPDATE as well as possible corner cases in
+ * UPDATE.
+ */
+ if (bms_is_empty(modifiedCols))
+ {
+ if (pg_attribute_aclcheck_all(relOid, userid, requiredPerms,
+ ACLMASK_ANY) != ACLCHECK_OK)
+ return false;
+ }
+
+ while ((col = bms_next_member(modifiedCols, col)) >= 0)
+ {
+ /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
+ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+
+ if (attno == InvalidAttrNumber)
+ {
+ /* whole-row reference can't happen here */
+ elog(ERROR, "whole-row update is not implemented");
+ }
+ else
+ {
+ if (pg_attribute_aclcheck(relOid, attno, userid,
+ requiredPerms) != ACLCHECK_OK)
+ return false;
}
}
return true;
@@ -1623,7 +1648,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1649,7 +1675,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1708,7 +1735,8 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index f1a24f5..00ffe4a 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2028,7 +2028,8 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(requiredPerms);
COPY_SCALAR_FIELD(checkAsUser);
COPY_BITMAPSET_FIELD(selectedCols);
- COPY_BITMAPSET_FIELD(modifiedCols);
+ COPY_BITMAPSET_FIELD(insertedCols);
+ COPY_BITMAPSET_FIELD(updatedCols);
COPY_NODE_FIELD(securityQuals);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6e8b308..79035b2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2345,7 +2345,8 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(requiredPerms);
COMPARE_SCALAR_FIELD(checkAsUser);
COMPARE_BITMAPSET_FIELD(selectedCols);
- COMPARE_BITMAPSET_FIELD(modifiedCols);
+ COMPARE_BITMAPSET_FIELD(insertedCols);
+ COMPARE_BITMAPSET_FIELD(updatedCols);
COMPARE_NODE_FIELD(securityQuals);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index dd1278b..b4a2667 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2456,7 +2456,8 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
WRITE_UINT_FIELD(requiredPerms);
WRITE_OID_FIELD(checkAsUser);
WRITE_BITMAPSET_FIELD(selectedCols);
- WRITE_BITMAPSET_FIELD(modifiedCols);
+ WRITE_BITMAPSET_FIELD(insertedCols);
+ WRITE_BITMAPSET_FIELD(updatedCols);
WRITE_NODE_FIELD(securityQuals);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ae24d05..dbc162a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1253,7 +1253,8 @@ _readRangeTblEntry(void)
READ_UINT_FIELD(requiredPerms);
READ_OID_FIELD(checkAsUser);
READ_BITMAPSET_FIELD(selectedCols);
- READ_BITMAPSET_FIELD(modifiedCols);
+ READ_BITMAPSET_FIELD(insertedCols);
+ READ_BITMAPSET_FIELD(updatedCols);
READ_NODE_FIELD(securityQuals);
READ_DONE();
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7703946..5d865b0 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -368,9 +368,9 @@ flatten_rtes_walker(Node *node, PlannerGlobal *glob)
*
* In the flat rangetable, we zero out substructure pointers that are not
* needed by the executor; this reduces the storage space and copying cost
- * for cached plans. We keep only the alias and eref Alias fields, which
- * are needed by EXPLAIN, and the selectedCols and modifiedCols bitmaps,
- * which are needed for executor-startup permissions checking and for
+ * for cached plans. We keep only the alias and eref Alias fields, which are
+ * needed by EXPLAIN, and the selectedCols, insertedCols and updatedCols
+ * bitmaps, which are needed for executor-startup permissions checking and for
* trigger event checking.
*/
static void
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index af3ee61..f86e792 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -115,7 +115,8 @@ expand_security_quals(PlannerInfo *root, List *tlist)
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the original relation
@@ -213,7 +214,8 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Now deal with any PlanRowMark on this RTE by requesting a lock
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 05f601e..1e28363 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1367,14 +1367,16 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
* if this is the parent table, leave copyObject's result alone.
*
* Note: we need to do this even though the executor won't run any
- * permissions checks on the child RTE. The modifiedCols bitmap may
- * be examined for trigger-firing purposes.
+ * permissions checks on the child RTE. The insertedCols/updatedCols
+ * bitmaps may be examined for trigger-firing purposes.
*/
if (childOID != parentOID)
{
childrte->selectedCols = translate_col_privs(rte->selectedCols,
appinfo->translated_vars);
- childrte->modifiedCols = translate_col_privs(rte->modifiedCols,
+ childrte->insertedCols = translate_col_privs(rte->insertedCols,
+ appinfo->translated_vars);
+ childrte->updatedCols = translate_col_privs(rte->updatedCols,
appinfo->translated_vars);
}
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..df89065 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -733,7 +733,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
false);
qry->targetList = lappend(qry->targetList, tle);
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attr_num - FirstLowInvalidHeapAttributeNumber);
icols = lnext(icols);
@@ -2002,7 +2002,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
origTarget->location);
/* Mark the target column as requiring update permissions */
- target_rte->modifiedCols = bms_add_member(target_rte->modifiedCols,
+ target_rte->updatedCols = bms_add_member(target_rte->updatedCols,
attrno - FirstLowInvalidHeapAttributeNumber);
origTargetList = lnext(origTargetList);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 8d4f79f..d2820d8 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1052,7 +1052,8 @@ addRangeTableEntry(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1105,7 +1106,8 @@ addRangeTableEntryForRelation(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1183,7 +1185,8 @@ addRangeTableEntryForSubquery(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1437,7 +1440,8 @@ addRangeTableEntryForFunction(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1509,7 +1513,8 @@ addRangeTableEntryForValues(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1577,7 +1582,8 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1677,7 +1683,8 @@ addRangeTableEntryForCTE(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index b8e6e7a..fab2948 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1403,7 +1403,8 @@ ApplyRetrieveRule(Query *parsetree,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the view should remain as
@@ -1466,12 +1467,14 @@ ApplyRetrieveRule(Query *parsetree,
subrte->requiredPerms = rte->requiredPerms;
subrte->checkAsUser = rte->checkAsUser;
subrte->selectedCols = rte->selectedCols;
- subrte->modifiedCols = rte->modifiedCols;
+ subrte->insertedCols = rte->insertedCols;
+ subrte->updatedCols = rte->updatedCols;
rte->requiredPerms = 0; /* no permission check on subquery itself */
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* If FOR [KEY] UPDATE/SHARE of view, mark all the contained tables as
@@ -2584,9 +2587,9 @@ rewriteTargetView(Query *parsetree, Relation view)
/*
* For INSERT/UPDATE the modified columns must all be updatable. Note that
* we get the modified columns from the query's targetlist, not from the
- * result RTE's modifiedCols set, since rewriteTargetListIU may have added
- * additional targetlist entries for view defaults, and these must also be
- * updatable.
+ * result RTE's insertedCols and/or updatedCols set, since
+ * rewriteTargetListIU may have added additional targetlist entries for
+ * view defaults, and these must also be updatable.
*/
if (parsetree->commandType != CMD_DELETE)
{
@@ -2723,26 +2726,31 @@ rewriteTargetView(Query *parsetree, Relation view)
*
* Initially, new_rte contains selectedCols permission check bits for all
* base-rel columns referenced by the view, but since the view is a SELECT
- * query its modifiedCols is empty. We set modifiedCols to include all
- * the columns the outer query is trying to modify, adjusting the column
- * numbers as needed. But we leave selectedCols as-is, so the view owner
- * must have read permission for all columns used in the view definition,
- * even if some of them are not read by the outer query. We could try to
- * limit selectedCols to only columns used in the transformed query, but
- * that does not correspond to what happens in ordinary SELECT usage of a
- * view: all referenced columns must have read permission, even if
- * optimization finds that some of them can be discarded during query
- * transformation. The flattening we're doing here is an optional
- * optimization, too. (If you are unpersuaded and want to change this,
- * note that applying adjust_view_column_set to view_rte->selectedCols is
- * clearly *not* the right answer, since that neglects base-rel columns
- * used in the view's WHERE quals.)
+ * query its insertedCols/updatedCols is empty. We set insertedCols and
+ * updatedCols to include all the columns the outer query is trying to
+ * modify, adjusting the column numbers as needed. But we leave
+ * selectedCols as-is, so the view owner must have read permission for all
+ * columns used in the view definition, even if some of them are not read
+ * by the outer query. We could try to limit selectedCols to only columns
+ * used in the transformed query, but that does not correspond to what
+ * happens in ordinary SELECT usage of a view: all referenced columns must
+ * have read permission, even if optimization finds that some of them can
+ * be discarded during query transformation. The flattening we're doing
+ * here is an optional optimization, too. (If you are unpersuaded and want
+ * to change this, note that applying adjust_view_column_set to
+ * view_rte->selectedCols is clearly *not* the right answer, since that
+ * neglects base-rel columns used in the view's WHERE quals.)
*
* This step needs the modified view targetlist, so we have to do things
* in this order.
*/
- Assert(bms_is_empty(new_rte->modifiedCols));
- new_rte->modifiedCols = adjust_view_column_set(view_rte->modifiedCols,
+ Assert(bms_is_empty(new_rte->insertedCols) &&
+ bms_is_empty(new_rte->updatedCols));
+
+ new_rte->insertedCols = adjust_view_column_set(view_rte->insertedCols,
+ view_targetlist);
+
+ new_rte->updatedCols = adjust_view_column_set(view_rte->updatedCols,
view_targetlist);
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..86d1c07 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -717,11 +717,12 @@ typedef struct XmlSerialize
* For SELECT/INSERT/UPDATE permissions, if the user doesn't have
* table-wide permissions then it is sufficient to have the permissions
* on all columns identified in selectedCols (for SELECT) and/or
- * modifiedCols (for INSERT/UPDATE; we can tell which from the query type).
- * selectedCols and modifiedCols are bitmapsets, which cannot have negative
- * integer members, so we subtract FirstLowInvalidHeapAttributeNumber from
- * column numbers before storing them in these fields. A whole-row Var
- * reference is represented by setting the bit for InvalidAttrNumber.
+ * insertedCols and/or updatedCols (INSERT with ON CONFLICT UPDATE may
+ * have all 3). selectedCols, insertedCols and updatedCols are
+ * bitmapsets, which cannot have negative integer members, so we subtract
+ * FirstLowInvalidHeapAttributeNumber from column numbers before storing
+ * them in these fields. A whole-row Var reference is represented by
+ * setting the bit for InvalidAttrNumber.
*--------------------
*/
typedef enum RTEKind
@@ -816,7 +817,8 @@ typedef struct RangeTblEntry
AclMode requiredPerms; /* bitmask of required access permissions */
Oid checkAsUser; /* if valid, check access as this role */
Bitmapset *selectedCols; /* columns needing SELECT permission */
- Bitmapset *modifiedCols; /* columns needing INSERT/UPDATE permission */
+ Bitmapset *insertedCols; /* columns needing INSERT permission */
+ Bitmapset *updatedCols; /* columns needing UPDATE permission */
List *securityQuals; /* any security barrier quals to apply */
} RangeTblEntry;
--
1.9.1
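
The first patch, which ends above, splits modifiedCols into insertedCols and updatedCols and takes their union wherever the old combined set was needed (for example in ExecConstraints). A minimal sketch of that bookkeeping, assuming a 64-bit word in place of a real Bitmapset: ColSet, colset_add(), colset_has(), colset_union() and FIRST_LOW_INVALID_ATTNO are hypothetical names for illustration, and the offset value is only an assumed stand-in for FirstLowInvalidHeapAttributeNumber.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed stand-in for FirstLowInvalidHeapAttributeNumber.  The value is
 * illustrative only; the real constant is defined by PostgreSQL itself.
 */
#define FIRST_LOW_INVALID_ATTNO (-8)

/* Hypothetical fixed-size column set; a real Bitmapset grows on demand. */
typedef uint64_t ColSet;

/* Map an attribute number (which may be <= 0) to a non-negative bit. */
static int
colset_bit(int attno)
{
	int			bit = attno - FIRST_LOW_INVALID_ATTNO;

	assert(bit >= 0 && bit < 64);
	return bit;
}

/* Return the set with attno added. */
static ColSet
colset_add(ColSet set, int attno)
{
	return set | (UINT64_C(1) << colset_bit(attno));
}

/* Is attno a member of the set? */
static bool
colset_has(ColSet set, int attno)
{
	return ((set >> colset_bit(attno)) & 1) != 0;
}

/* Union of two sets, as used to rebuild the old "modified" column set. */
static ColSet
colset_union(ColSet a, ColSet b)
{
	return a | b;
}
```

In this model an INSERT ... ON CONFLICT UPDATE would populate one ColSet per permission kind, and error-reporting code that wants "every column the statement touched" would call colset_union() on the inserted and updated sets. Attribute number 0 (InvalidAttrNumber, the whole-row marker) and negative system attribute numbers are representable because of the offset.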
Attachment: 0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 81f838d81f15e9b8e91cf51efacb2164a20bf55d Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:01:32 -0700
Subject: [PATCH 2/6] Support INSERT ... ON CONFLICT {UPDATE | IGNORE}
This non-standard INSERT clause allows DML statement authors to specify
that when any tuple proposed for insertion duplicates an existing tuple
in terms of a value or set of values constrained by a unique index, an
alternative path may be taken. The statement may either IGNORE the
tuple being inserted without raising an error, or UPDATE the existing
tuple whose value is duplicated by a value within one single tuple
proposed for insertion.
The implementation loops until either an insert or an UPDATE/IGNORE
occurs. No existing tuple may be affected more than once per INSERT.
This is implemented using a new infrastructure called "speculative
insertion". (The approach to "Value locking" presented here follows
design #2, as described on the value locking Postgres Wiki page).
Alternatively, we may go to UPDATE, using the EvalPlanQual() mechanism
to execute a special auxiliary plan.
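
The retry loop described above can be sketched in plain C. This is a toy single-process model under stated assumptions, not code from the patch: Table, Row, find_conflict(), upsert() and the rival_* hook are hypothetical names, the "unique index" is a linear scan, and a concurrent session is simulated by a hook that fires once between the pre-check and the speculative insert. Locking, EvalPlanQual() and the once-per-statement rule are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ROWS 16

typedef enum { ON_CONFLICT_IGNORE, ON_CONFLICT_UPDATE } ConflictAction;
typedef enum { DID_INSERT, DID_UPDATE, DID_IGNORE } UpsertResult;

typedef struct { int key; int value; bool live; } Row;
typedef struct { Row rows[MAX_ROWS]; int nrows; } Table;

/* Return the slot of a live row with this key, or -1 if there is none. */
static int
find_conflict(const Table *t, int key)
{
	for (int i = 0; i < t->nrows; i++)
		if (t->rows[i].live && t->rows[i].key == key)
			return i;
	return -1;
}

/* Hypothetical stand-in for a concurrent session; fires at most once. */
static bool rival_pending = false;
static int	rival_key;
static int	rival_value;

static void
run_rival(Table *t)
{
	if (rival_pending && t->nrows < MAX_ROWS)
	{
		t->rows[t->nrows++] = (Row) {rival_key, rival_value, true};
		rival_pending = false;
	}
}

/* Loop until an insert or an UPDATE/IGNORE occurs; count the attempts. */
static UpsertResult
upsert(Table *t, int key, int value, ConflictAction action, int *attempts)
{
	for (;;)
	{
		int			slot;
		int			mine;

		(*attempts)++;

		/* Pre-check: an existing conflict sends us down the alternative path. */
		slot = find_conflict(t, key);
		if (slot >= 0)
		{
			if (action == ON_CONFLICT_IGNORE)
				return DID_IGNORE;
			t->rows[slot].value = value;
			return DID_UPDATE;
		}

		/* Window in which another session may insert the same key. */
		run_rival(t);

		/* Speculative insertion: the row exists but is not yet live. */
		assert(t->nrows < MAX_ROWS);
		mine = t->nrows++;
		t->rows[mine] = (Row) {key, value, false};

		/* Recheck: with no conflict, the insertion is confirmed. */
		if (find_conflict(t, key) < 0)
		{
			t->rows[mine].live = true;
			return DID_INSERT;
		}

		/* Conflict appeared: remove our own row (always the last) and retry. */
		t->nrows--;
	}
}
```

Calling upsert() on an empty Table inserts on the first attempt. Setting rival_pending (with rival_key equal to the key being upserted) before the call forces one failed speculative insertion, after which the second pass through the loop takes the UPDATE or IGNORE path against the rival's row.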
At READ COMMITTED isolation level, it is permitted to UPDATE a tuple even where
no version is visible to the command's MVCC snapshot. Similarly, any
query predicate associated with the UPDATE portion of the new statement
need only satisfy an already locked, conclusively committed and visible
conflict tuple. When the predicate isn't satisfied, the tuple is still
locked, which implies that at READ COMMITTED, a tuple may be locked
without any version being visible to the command's MVCC snapshot.
Users specify a single unique index to take the alternative path on,
which is inferred from a set of user-supplied column names (or
expressions). This is mandatory for the ON CONFLICT UPDATE variant,
which should address concerns about spuriously taking an incorrect
alternative ON CONFLICT path (i.e. the wrong unique index is used for
arbitration of whether or not to take the alternative path) due to there
being more than one would-be unique violation. Previous revisions of
the patch didn't mandate this. However, we may still IGNORE based on
the first would-be unique violation detected, on the assumption that it
doesn't particularly matter where it originated from for that variant
(iff the user didn't make a point of indicating his or her intent).
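
As a rough illustration of inferring a single unique index from user-supplied column names, here is a minimal C sketch. It assumes a simple rule (the index must be unique and cover exactly the named columns, in any order, and an ambiguous match is treated as failure); the patch's actual inference also handles expressions and may differ in detail. IndexInfoSketch, same_column_set() and infer_arbiter_index() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_INDEX_COLS 4

/* Hypothetical, simplified description of one index on the target table. */
typedef struct
{
	const char *name;
	bool		is_unique;
	int			ncols;
	const char *cols[MAX_INDEX_COLS];
} IndexInfoSketch;

/* Does the list contain this column name? */
static bool
contains(const char *const *set, int n, const char *col)
{
	for (int i = 0; i < n; i++)
		if (strcmp(set[i], col) == 0)
			return true;
	return false;
}

/* Same set of names, ignoring order; assumes no duplicates in either list. */
static bool
same_column_set(const char *const *a, int na,
				const char *const *b, int nb)
{
	if (na != nb)
		return false;
	for (int i = 0; i < na; i++)
		if (!contains(b, nb, a[i]))
			return false;
	return true;
}

/*
 * Return the position of the one unique index whose columns match the
 * user-supplied names, or -1 if none matches or more than one matches.
 */
static int
infer_arbiter_index(const IndexInfoSketch *indexes, int nindexes,
					const char *const *cols, int ncols)
{
	int			found = -1;

	assert(indexes != NULL && cols != NULL);

	for (int i = 0; i < nindexes; i++)
	{
		if (!indexes[i].is_unique)
			continue;
		if (!same_column_set(indexes[i].cols, indexes[i].ncols, cols, ncols))
			continue;
		if (found >= 0)
			return -1;			/* ambiguous: sketch treats this as failure */
		found = i;
	}
	return found;
}
```

A caller would pass the table's index descriptions together with the column names written in the ON CONFLICT clause, and raise an error for the UPDATE variant when the result is -1. Non-unique indexes are never candidates, which mirrors the requirement that arbitration be based on a unique index.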
The auxiliary ModifyTable plan used by the UPDATE portion of the new
statement is not formally a subplan of its parent INSERT ModifyTable
plan. Rather, it's an independently planned subquery, whose execution
is tightly driven by its parent. Special auxiliary state pertaining to
the auxiliary UPDATE is tracked by its parent through all stages of
query execution.
The implementation imposes some restrictions on child auxiliary UPDATE
plans, which make the plans comport with their parent to the extent
required during the executor stage. One user-visible consequence of
this is that the special auxiliary UPDATE query cannot have subselects
within its targetlist or WHERE clause. UPDATEs may not reference any
other table, and UPDATE FROM is disallowed. INSERT's RETURNING clause
projects tuples successfully inserted and updated. An INSERT with an ON
CONFLICT UPDATE clause processes all slots that are ultimately affected,
regardless of whether or not the alternative ON CONFLICT UPDATE path was
taken. However, if an ON CONFLICT UPDATE's WHERE clause is not
satisfied in respect of some slot/tuple, the post-update tuple is not
projected (although the row is still locked, just as before).
Note that pg_stat_statements does not fingerprint ExcludedExpr, because
it cannot appear in the post-parse-analysis, pre-rewrite Query tree.
(pg_stat_statements does not fingerprint every primnode anyway, mostly
because some are only expected in utility statements). Other existing
Node handling sites that don't expect to see primnodes that appear only
after rewriting (ExcludedExpr may be in its own subcategory here in that
it is the only such non-utility related Node) do not have an
ExcludedExpr case added either.
---
contrib/pg_stat_statements/pg_stat_statements.c | 5 +
contrib/postgres_fdw/deparse.c | 7 +-
contrib/postgres_fdw/postgres_fdw.c | 16 +-
contrib/postgres_fdw/postgres_fdw.h | 2 +-
src/backend/access/heap/heapam.c | 97 ++++-
src/backend/access/nbtree/nbtinsert.c | 32 +-
src/backend/catalog/index.c | 59 ++-
src/backend/catalog/indexing.c | 2 +-
src/backend/commands/constraint.c | 7 +-
src/backend/commands/copy.c | 7 +-
src/backend/commands/explain.c | 87 ++++-
src/backend/executor/execMain.c | 14 +-
src/backend/executor/execQual.c | 54 +++
src/backend/executor/execUtils.c | 257 +++++++++++--
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeModifyTable.c | 464 +++++++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 55 +++
src/backend/nodes/equalfuncs.c | 43 +++
src/backend/nodes/nodeFuncs.c | 74 ++++
src/backend/nodes/outfuncs.c | 18 +
src/backend/nodes/readfuncs.c | 19 +
src/backend/optimizer/path/indxpath.c | 57 +++
src/backend/optimizer/path/tidpath.c | 8 +-
src/backend/optimizer/plan/createplan.c | 16 +-
src/backend/optimizer/plan/planner.c | 53 +++
src/backend/optimizer/plan/setrefs.c | 31 +-
src/backend/optimizer/plan/subselect.c | 6 +
src/backend/optimizer/util/plancat.c | 222 +++++++++++-
src/backend/parser/analyze.c | 86 ++++-
src/backend/parser/gram.y | 75 +++-
src/backend/parser/parse_clause.c | 258 +++++++++++--
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_node.c | 8 +-
src/backend/rewrite/rewriteHandler.c | 127 ++++++-
src/backend/storage/ipc/procarray.c | 109 ++++++
src/backend/storage/lmgr/lmgr.c | 68 ++++
src/backend/tcop/pquery.c | 16 +-
src/backend/utils/adt/lockfuncs.c | 1 +
src/backend/utils/adt/ruleutils.c | 39 ++
src/backend/utils/time/tqual.c | 52 +++
src/bin/psql/common.c | 5 +-
src/include/access/heapam.h | 3 +-
src/include/access/heapam_xlog.h | 2 +
src/include/catalog/index.h | 2 +
src/include/executor/executor.h | 21 +-
src/include/nodes/execnodes.h | 19 +
src/include/nodes/nodes.h | 18 +
src/include/nodes/parsenodes.h | 40 +-
src/include/nodes/plannodes.h | 3 +
src/include/nodes/primnodes.h | 47 +++
src/include/optimizer/paths.h | 1 +
src/include/optimizer/plancat.h | 2 +
src/include/optimizer/planmain.h | 3 +-
src/include/parser/kwlist.h | 2 +
src/include/parser/parse_clause.h | 2 +
src/include/parser/parse_node.h | 1 +
src/include/storage/lmgr.h | 5 +
src/include/storage/lock.h | 10 +
src/include/storage/proc.h | 13 +
src/include/storage/procarray.h | 7 +
src/include/utils/snapshot.h | 11 +
61 files changed, 2624 insertions(+), 159 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 95616b3..414ec83 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2198,6 +2198,11 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
JumbleRangeTable(jstate, query->rtable);
JumbleExpr(jstate, (Node *) query->jointree);
JumbleExpr(jstate, (Node *) query->targetList);
+ APP_JUMB(query->specClause);
+ JumbleExpr(jstate, (Node *) query->arbiterExpr);
+ JumbleExpr(jstate, query->arbiterWhere);
+ if (query->onConflict)
+ JumbleQuery(jstate, (Query *) query->onConflict);
JumbleExpr(jstate, (Node *) query->returningList);
JumbleExpr(jstate, (Node *) query->groupClause);
JumbleExpr(jstate, query->havingQual);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 59cb053..ca51586 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -847,8 +847,8 @@ appendWhereClause(StringInfo buf,
void
deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
- List **retrieved_attrs)
+ List *targetAttrs, bool ignore,
+ List *returningList, List **retrieved_attrs)
{
AttrNumber pindex;
bool first;
@@ -892,6 +892,9 @@ deparseInsertSql(StringInfo buf, PlannerInfo *root,
else
appendStringInfoString(buf, " DEFAULT VALUES");
+ if (ignore)
+ appendStringInfoString(buf, " ON CONFLICT IGNORE");
+
deparseReturningList(buf, root, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_insert_after_row,
returningList, retrieved_attrs);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..1539899 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1167,6 +1167,7 @@ postgresPlanForeignModify(PlannerInfo *root,
List *targetAttrs = NIL;
List *returningList = NIL;
List *retrieved_attrs = NIL;
+ bool ignore = false;
initStringInfo(&sql);
@@ -1201,7 +1202,7 @@ postgresPlanForeignModify(PlannerInfo *root,
int col;
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -1218,6 +1219,17 @@ postgresPlanForeignModify(PlannerInfo *root,
if (plan->returningLists)
returningList = (List *) list_nth(plan->returningLists, subplan_index);
+ if (root->parse->arbiterExpr)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT unique index inference")));
+ else if (plan->spec == SPEC_INSERT)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT UPDATE")));
+ else if (plan->spec == SPEC_IGNORE)
+ ignore = true;
+
/*
* Construct the SQL command string.
*/
@@ -1225,7 +1237,7 @@ postgresPlanForeignModify(PlannerInfo *root,
{
case CMD_INSERT:
deparseInsertSql(&sql, root, resultRelation, rel,
- targetAttrs, returningList,
+ targetAttrs, ignore, returningList,
&retrieved_attrs);
break;
case CMD_UPDATE:
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..3763a57 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -60,7 +60,7 @@ extern void appendWhereClause(StringInfo buf,
List **params);
extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
+ List *targetAttrs, bool ignore, List *returningList,
List **retrieved_attrs);
extern void deparseUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 46060bc..0aa3e57 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
*
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
* Note that these options will be applied when inserting into the heap's
* TOAST table, too, if the tuple requires any out-of-line data.
*
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
END_CRIT_SECTION();
+ /*
+ * Let others know that we speculatively inserted this tuple, before
+ * releasing the buffer lock.
+ */
+ if (options & HEAP_INSERT_SPECULATIVE)
+ SetSpeculativeInsertionTid(relation->rd_node, &heaptup->t_self);
+
UnlockReleaseBuffer(buffer);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we "super-delete" a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id. Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
*/
HTSU_Result
heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd)
+ HeapUpdateFailureData *hufd, bool killspeculative)
{
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ if (!killspeculative)
+ {
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ }
+ else
+ {
+ if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+ tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+ elog(ERROR, "attempted to super-delete a tuple from other CID");
+ result = HeapTupleMayBeUpdated;
+ }
+
if (result == HeapTupleInvisible)
{
@@ -2823,12 +2850,15 @@ l1:
* using our own TransactionId below, since some other backend could
* incorporate our XID into a MultiXact immediately afterwards.)
*/
- MultiXactIdSetOldestMember();
+ if (!killspeculative)
+ {
+ MultiXactIdSetOldestMember();
- compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
- tp.t_data->t_infomask, tp.t_data->t_infomask2,
- xid, LockTupleExclusive, true,
- &new_xmax, &new_infomask, &new_infomask2);
+ compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
+ tp.t_data->t_infomask, tp.t_data->t_infomask2,
+ xid, LockTupleExclusive, true,
+ &new_xmax, &new_infomask, &new_infomask2);
+ }
START_CRIT_SECTION();
@@ -2855,8 +2885,23 @@ l1:
tp.t_data->t_infomask |= new_infomask;
tp.t_data->t_infomask2 |= new_infomask2;
HeapTupleHeaderClearHotUpdated(tp.t_data);
- HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
- HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ /*
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
+ * instead of setting xmax, to make the tuple clearly invisible to
+ * everyone. In particular, we want HeapTupleSatisfiesDirty() to regard
+ * the tuple as dead, so that another backend inserting a duplicate key
+ * value won't unnecessarily wait for our transaction to finish.
+ */
+ if (!killspeculative)
+ {
+ HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
+ HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ }
+ else
+ {
+ HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ }
+
/* Make sure there is no forward chain link in t_ctid */
tp.t_data->t_ctid = tp.t_self;
@@ -2872,7 +2917,11 @@ l1:
if (RelationIsAccessibleInLogicalDecoding(relation))
log_heap_new_cid(relation, &tp);
- xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
+ xlrec.flags = 0;
+ if (all_visible_cleared)
+ xlrec.flags |= XLOG_HEAP_ALL_VISIBLE_CLEARED;
+ if (killspeculative)
+ xlrec.flags |= XLOG_HEAP_KILLED_SPECULATIVE_TUPLE;
xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask,
tp.t_data->t_infomask2);
xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
@@ -2977,7 +3026,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
result = heap_delete(relation, tid,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd, false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -4070,14 +4119,16 @@ get_mxact_status_for_lock(LockTupleMode mode, bool is_update)
*
* Function result may be:
* HeapTupleMayBeUpdated: lock was successfully acquired
+ * HeapTupleInvisible: lock failed because tuple instantaneously invisible
* HeapTupleSelfUpdated: lock failed because tuple updated by self
* HeapTupleUpdated: lock failed because tuple updated by other xact
* HeapTupleWouldBlock: lock couldn't be acquired and wait_policy is skip
*
- * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
- * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
- * (the last only for HeapTupleSelfUpdated, since we
- * cannot obtain cmax from a combocid generated by another transaction).
+ * In the failure cases other than HeapTupleInvisible, the routine fills
+ * *hufd with the tuple's t_ctid, t_xmax (resolving a possible MultiXact,
+ * if necessary), and t_cmax (the last only for HeapTupleSelfUpdated,
+ * since we cannot obtain cmax from a combocid generated by another
+ * transaction).
* See comments for struct HeapUpdateFailureData for additional info.
*
* See README.tuplock for a thorough explanation of this mechanism.
@@ -4115,8 +4166,15 @@ l3:
if (result == HeapTupleInvisible)
{
- UnlockReleaseBuffer(*buffer);
- elog(ERROR, "attempted to lock invisible tuple");
+ LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+
+ /*
+ * This is possible, but only when locking a tuple for speculative
+ * insertion. We return this value here rather than throwing an error
+ * in order to give that case the opportunity to throw a more specific
+ * error.
+ */
+ return HeapTupleInvisible;
}
else if (result == HeapTupleBeingUpdated)
{
@@ -7326,7 +7384,10 @@ heap_xlog_delete(XLogReaderState *record)
HeapTupleHeaderClearHotUpdated(htup);
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ else
+ HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 932c6f7..1a4e18d 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -51,7 +51,8 @@ static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
Relation heapRel, Buffer buf, OffsetNumber offset,
ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique);
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken);
static void _bt_findinsertloc(Relation rel,
Buffer *bufptr,
OffsetNumber *offsetptr,
@@ -159,17 +160,27 @@ top:
*/
if (checkUnique != UNIQUE_CHECK_NO)
{
- TransactionId xwait;
+ TransactionId xwait;
+ uint32 speculativeToken;
offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey,
- checkUnique, &is_unique);
+ checkUnique, &is_unique, &speculativeToken);
if (TransactionIdIsValid(xwait))
{
/* Have to wait for the other guy ... */
_bt_relbuf(rel, buf);
- XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+ /*
+ * If it's a speculative insertion, wait for it to finish (i.e.
+ * to go ahead with the insertion, or kill the tuple). Otherwise
+ * wait for the transaction to finish as usual.
+ */
+ if (speculativeToken)
+ SpeculativeInsertionWait(xwait, speculativeToken);
+ else
+ XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+
/* start over... */
_bt_freestack(stack);
goto top;
@@ -211,9 +222,12 @@ top:
* also point to end-of-page, which means that the first tuple to check
* is the first tuple on the next page.
*
- * Returns InvalidTransactionId if there is no conflict, else an xact ID
- * we must wait for to see if it commits a conflicting tuple. If an actual
- * conflict is detected, no return --- just ereport().
+ * Returns InvalidTransactionId if there is no conflict, else an xact ID we
+ * must wait for to see if it commits a conflicting tuple. If an actual
+ * conflict is detected, no return --- just ereport(). If an xact ID is
+ * returned, and the conflicting tuple still has a speculative insertion in
+ * progress, *speculativeToken is set to non-zero, and the caller can wait for
+ * the verdict on the insertion using SpeculativeInsertionWait().
*
* However, if checkUnique == UNIQUE_CHECK_PARTIAL, we always return
* InvalidTransactionId because we don't want to wait. In this case we
@@ -223,7 +237,8 @@ top:
static TransactionId
_bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
Buffer buf, OffsetNumber offset, ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique)
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int natts = rel->rd_rel->relnatts;
@@ -340,6 +355,7 @@ _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
if (nbuf != InvalidBuffer)
_bt_relbuf(rel, nbuf);
/* Tell _bt_doinsert to wait... */
+ *speculativeToken = SnapshotDirty.speculativeToken;
return xwait;
}
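For readers following the nbtinsert.c hunks above, the wait decision that _bt_doinsert() now makes can be sketched on its own. This is an illustration only, not code from the patch: WaitKind and choose_wait() are hypothetical names, and TransactionId / InvalidTransactionId are redefined locally so the fragment compiles outside the PostgreSQL tree.

```c
#include <stdint.h>

/* Local stand-ins (assumptions) so the sketch builds without PostgreSQL headers. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical: which kind of wait the inserter should perform. */
typedef enum
{
	WAIT_NONE,			/* no conflicting xact was reported */
	WAIT_SPECULATIVE,	/* wait for the verdict on a speculative insertion */
	WAIT_XACT			/* wait for the whole transaction to finish */
} WaitKind;

/*
 * Mirrors the branch added to _bt_doinsert(): a non-zero token means the
 * conflicting tuple is still a speculative insertion, so it is enough to wait
 * for that insertion to be confirmed or killed rather than for the xact to end.
 */
static WaitKind
choose_wait(TransactionId xwait, uint32_t speculativeToken)
{
	if (xwait == InvalidTransactionId)
		return WAIT_NONE;
	return (speculativeToken != 0) ? WAIT_SPECULATIVE : WAIT_XACT;
}
```

In the patch itself the two waiting cases correspond to SpeculativeInsertionWait() and XactLockTableWait() respectively, after which the insertion is retried from the top.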
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f85ed93..e986d7e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1662,6 +1662,10 @@ BuildIndexInfo(Relation index)
/* other info */
ii->ii_Unique = indexStruct->indisunique;
ii->ii_ReadyForInserts = IndexIsReady(indexStruct);
+ /* assume not doing speculative insertion for now */
+ ii->ii_UniqueOps = NULL;
+ ii->ii_UniqueProcs = NULL;
+ ii->ii_UniqueStrats = NULL;
/* initialize index-build state to default */
ii->ii_Concurrent = false;
@@ -1671,6 +1675,53 @@ BuildIndexInfo(Relation index)
}
/* ----------------
+ * AddUniqueSpeculative
+ * Append extra state to IndexInfo record
+ *
+ * For unique indexes, we usually don't want to add info to the IndexInfo for
+ * checking uniqueness, since the B-Tree AM handles that directly. However, in
+ * the case of speculative insertion, external support is required.
+ *
+ * Do this processing here rather than in BuildIndexInfo() to save the common
+ * non-speculative cases the overhead they'd otherwise incur.
+ * ----------------
+ */
+void
+AddUniqueSpeculative(Relation index, IndexInfo *ii)
+{
+ int ncols = index->rd_rel->relnatts;
+ int i;
+
+ /*
+ * fetch info for checking unique indexes
+ */
+ Assert(ii->ii_Unique);
+
+ if (index->rd_rel->relam != BTREE_AM_OID)
+ elog(ERROR, "unexpected non-btree speculative unique index");
+
+ ii->ii_UniqueOps = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueProcs = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueStrats = (uint16 *) palloc(sizeof(uint16) * ncols);
+
+ /*
+ * The strategy number is always btree equality. For each index column,
+ * look up the matching equality operator in the column's operator family,
+ * and then the function that implements that operator.
+ */
+ for (i = 0; i < ncols; i++)
+ {
+ ii->ii_UniqueStrats[i] = BTEqualStrategyNumber;
+ ii->ii_UniqueOps[i] =
+ get_opfamily_member(index->rd_opfamily[i],
+ index->rd_opcintype[i],
+ index->rd_opcintype[i],
+ ii->ii_UniqueStrats[i]);
+ ii->ii_UniqueProcs[i] = get_opcode(ii->ii_UniqueOps[i]);
+ }
+}
+
+/* ----------------
* FormIndexDatum
* Construct values[] and isnull[] arrays for a new index tuple.
*
@@ -2606,10 +2657,10 @@ IndexCheckExclusion(Relation heapRelation,
/*
* Check that this tuple has no conflicts.
*/
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- &(heapTuple->t_self), values, isnull,
- estate, true, false);
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &(heapTuple->t_self),
+ values, isnull, estate, true,
+ false, true, NULL);
}
heap_endscan(scan);
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index fe123ad..0231084 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -46,7 +46,7 @@ CatalogOpenIndexes(Relation heapRel)
resultRelInfo->ri_RelationDesc = heapRel;
resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, false);
return resultRelInfo;
}
diff --git a/src/backend/commands/constraint.c b/src/backend/commands/constraint.c
index 561d8fa..d5ab12f 100644
--- a/src/backend/commands/constraint.c
+++ b/src/backend/commands/constraint.c
@@ -170,9 +170,10 @@ unique_key_recheck(PG_FUNCTION_ARGS)
* For exclusion constraints we just do the normal check, but now it's
* okay to throw error.
*/
- check_exclusion_constraint(trigdata->tg_relation, indexRel, indexInfo,
- &(new_row->t_self), values, isnull,
- estate, false, false);
+ check_exclusion_or_unique_constraint(trigdata->tg_relation, indexRel,
+ indexInfo, &(new_row->t_self),
+ values, isnull, estate, false,
+ false, true, NULL);
}
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index d2996fb..2d45eb3 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2283,7 +2283,7 @@ CopyFrom(CopyState cstate)
1, /* dummy rangetable index */
0);
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, false);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -2438,7 +2438,8 @@ CopyFrom(CopyState cstate)
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false,
+ InvalidOid);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, tuple,
@@ -2552,7 +2553,7 @@ CopyFromInsertBatch(CopyState cstate, EState *estate, CommandId mycid,
ExecStoreTuple(bufferedTuples[i], myslot, InvalidBuffer, false);
recheckIndexes =
ExecInsertIndexTuples(myslot, &(bufferedTuples[i]->t_self),
- estate);
+ estate, false, InvalidOid);
ExecARInsertTriggers(estate, resultRelInfo,
bufferedTuples[i],
recheckIndexes);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7cfc9bb..e6a8d8e 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -103,7 +103,8 @@ static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
static void ExplainScanTarget(Scan *plan, ExplainState *es);
static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
-static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es);
+static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors);
static void ExplainMemberNodes(List *plans, PlanState **planstates,
List *ancestors, ExplainState *es);
static void ExplainSubPlans(List *plans, List *ancestors,
@@ -763,6 +764,9 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
ExplainPreScanMemberNodes(((ModifyTable *) plan)->plans,
((ModifyTableState *) planstate)->mt_plans,
rels_used);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainPreScanNode(((ModifyTableState *) planstate)->onConflict,
+ rels_used);
break;
case T_Append:
ExplainPreScanMemberNodes(((Append *) plan)->appendplans,
@@ -864,6 +868,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
const char *custom_name = NULL;
int save_indent = es->indent;
bool haschildren;
+ bool suppresschildren = false;
+ ModifyTable *mtplan;
switch (nodeTag(plan))
{
@@ -872,13 +878,33 @@ ExplainNode(PlanState *planstate, List *ancestors,
break;
case T_ModifyTable:
sname = "ModifyTable";
- switch (((ModifyTable *) plan)->operation)
+ mtplan = (ModifyTable *) plan;
+ switch (mtplan->operation)
{
case CMD_INSERT:
pname = operation = "Insert";
break;
case CMD_UPDATE:
- pname = operation = "Update";
+ if (mtplan->spec == SPEC_NONE)
+ {
+ pname = operation = "Update";
+ }
+ else
+ {
+ Assert(mtplan->spec == SPEC_UPDATE);
+
+ pname = operation = "Conflict Update";
+
+ /*
+ * Do not display child sequential scan/result node.
+ * Quals from child will be directly attributed to
+ * ModifyTable node, since we prefer to avoid
+ * displaying scan node to users, as it is merely an
+ * implementation detail; it is never executed in the
+ * conventional way.
+ */
+ suppresschildren = true;
+ }
break;
case CMD_DELETE:
pname = operation = "Delete";
@@ -1458,7 +1484,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate, es);
break;
case T_ModifyTable:
- show_modifytable_info((ModifyTableState *) planstate, es);
+ show_modifytable_info((ModifyTableState *) planstate, es,
+ ancestors);
break;
case T_Hash:
show_hash_info((HashState *) planstate, es);
@@ -1586,7 +1613,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate->subPlan;
if (haschildren)
{
- ExplainOpenGroup("Plans", "Plans", false, es);
+ if (!suppresschildren)
+ ExplainOpenGroup("Plans", "Plans", false, es);
/* Pass current PlanState as head of ancestors list for children */
ancestors = lcons(planstate, ancestors);
}
@@ -1609,9 +1637,13 @@ ExplainNode(PlanState *planstate, List *ancestors,
switch (nodeTag(plan))
{
case T_ModifyTable:
- ExplainMemberNodes(((ModifyTable *) plan)->plans,
- ((ModifyTableState *) planstate)->mt_plans,
- ancestors, es);
+ if (((ModifyTable *) plan)->spec != SPEC_UPDATE)
+ ExplainMemberNodes(((ModifyTable *) plan)->plans,
+ ((ModifyTableState *) planstate)->mt_plans,
+ ancestors, es);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainNode(((ModifyTableState *) planstate)->onConflict,
+ ancestors, "Member", NULL, es);
break;
case T_Append:
ExplainMemberNodes(((Append *) plan)->appendplans,
@@ -1649,7 +1681,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (haschildren)
{
ancestors = list_delete_first(ancestors);
- ExplainCloseGroup("Plans", "Plans", false, es);
+
+ if (!suppresschildren)
+ ExplainCloseGroup("Plans", "Plans", false, es);
}
/* in text format, undo whatever indentation we added */
@@ -2202,6 +2236,15 @@ ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
rti = linitial_int(plan->resultRelations);
ExplainTargetRel((Plan *) plan, rti, es);
+
+ if (plan->arbiterIndex != InvalidOid)
+ {
+ char *indexname = get_rel_name(plan->arbiterIndex);
+
+ /* nothing to do for text format explains */
+ if (es->format != EXPLAIN_FORMAT_TEXT && indexname != NULL)
+ ExplainPropertyText("Arbiter Index", indexname, es);
+ }
}
/*
@@ -2237,6 +2280,12 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
if (es->verbose)
namespace = get_namespace_name(get_rel_namespace(rte->relid));
objecttag = "Relation Name";
+
+ /*
+ * The ON CONFLICT "TARGET" alias is not shown as the alias of the
+ * auxiliary ModifyTable node, because the target resultRelation is
+ * shared between the parent and auxiliary queries.
+ */
break;
case T_FunctionScan:
{
@@ -2315,7 +2364,8 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
* Show extra information for a ModifyTable node
*/
static void
-show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
+show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors)
{
FdwRoutine *fdwroutine = mtstate->resultRelInfo->ri_FdwRoutine;
@@ -2337,6 +2387,23 @@ show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
0,
es);
}
+ else if (mtstate->spec == SPEC_UPDATE)
+ {
+ PlanState *ps = (*mtstate->mt_plans);
+
+ /*
+ * Seqscan node is always used, unless optimizer determined that
+ * predicate precludes ever updating, in which case a simple Result
+ * node is possible
+ */
+ Assert(IsA(ps->plan, SeqScan) || IsA(ps->plan, Result));
+
+ /* Attribute child scan node's qual to ModifyTable node */
+ show_scan_qual(ps->plan->qual, "Filter", ps, ancestors, es);
+
+ if (ps->plan->qual)
+ show_instrumentation_count("Rows Removed by Filter", 1, ps, es);
+ }
}
/*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbcebb7..3d7761d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2122,7 +2122,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* the latest version of the row was deleted, so we need do
* nothing. (Should be safe to examine xmin without getting
* buffer's content lock, since xmin never changes in an existing
- * tuple.)
+ * non-promise tuple, and there is no reason to lock a promise
+ * tuple until it is clear that it has been fulfilled.)
*/
if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple.t_data),
priorXmax))
@@ -2203,11 +2204,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* case, so as to avoid the "Halloween problem" of
* repeated update attempts. In the latter case it might
* be sensible to fetch the updated tuple instead, but
- * doing so would require changing heap_lock_tuple as well
- * as heap_update and heap_delete to not complain about
- * updating "invisible" tuples, which seems pretty scary.
- * So for now, treat the tuple as deleted and do not
- * process.
+ * doing so would require changing heap_update and
+ * heap_delete to not complain about updating "invisible"
+ * tuples, which seems pretty scary (heap_lock_tuple will
+ * not complain, but few callers expect HeapTupleInvisible,
+ * and we're not one of them). So for now, treat the tuple
+ * as deleted and do not process.
*/
ReleaseBuffer(buffer);
return NULL;
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 0e7400f..57d726e 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -182,6 +182,9 @@ static Datum ExecEvalArrayCoerceExpr(ArrayCoerceExprState *astate,
bool *isNull, ExprDoneCond *isDone);
static Datum ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
bool *isNull, ExprDoneCond *isDone);
+static Datum ExecEvalExcluded(ExcludedExprState *excludedExpr,
+ ExprContext *econtext, bool *isNull,
+ ExprDoneCond *isDone);
/* ----------------------------------------------------------------
@@ -4338,6 +4341,33 @@ ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
return 0; /* keep compiler quiet */
}
+/* ----------------------------------------------------------------
+ * ExecEvalExcluded
+ * ----------------------------------------------------------------
+ */
+static Datum
+ExecEvalExcluded(ExcludedExprState *excludedExpr, ExprContext *econtext,
+ bool *isNull, ExprDoneCond *isDone)
+{
+ /*
+ * ExcludedExpr is essentially an expression that adapts its single Var
+ * argument to refer to the expression context inner slot's tuple, which is
+ * reserved for the purpose of referencing EXCLUDED.* tuples within ON
+ * CONFLICT UPDATE auxiliary queries' EPQ expression context (ON CONFLICT
+ * UPDATE makes special use of the EvalPlanQual() mechanism to update).
+ *
+ * nodeModifyTable.c assigns its own table slot in the auxiliary queries'
+ * EPQ expression state (originating in the parent INSERT node) on the
+ * assumption that it may only be used by ExcludedExpr, and on the
+ * assumption that the inner slot is not otherwise useful. This occurs in
+ * advance of the expression evaluation for UPDATE (which calls here are
+ * part of) once per slot proposed for insertion, and works because of
+ * restrictions on the structure of ON CONFLICT UPDATE auxiliary queries.
+ *
+ * Just evaluate nested Var.
+ */
+ return ExecEvalScalarVar(excludedExpr->arg, econtext, isNull, isDone);
+}
/*
* ExecEvalExprSwitchContext
@@ -5065,6 +5095,30 @@ ExecInitExpr(Expr *node, PlanState *parent)
state = (ExprState *) makeNode(ExprState);
state->evalfunc = ExecEvalCurrentOfExpr;
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExprState *cstate = makeNode(ExcludedExprState);
+ Var *contained = (Var *) excludedexpr->arg;
+
+ /*
+ * varno forced to INNER_VAR -- see remarks within
+ * ExecLockUpdateTuple().
+ *
+ * We rely on the assumption that the only place that
+ * ExcludedExpr may appear is where EXCLUDED Var references
+ * originally appeared after parse analysis. The rewriter
+ * replaces these with ExcludedExpr that reference the
+ * corresponding Var within the ON CONFLICT UPDATE target RTE.
+ */
+ Assert(IsA(contained, Var));
+
+ contained->varno = INNER_VAR;
+ cstate->arg = ExecInitExpr((Expr *) contained, parent);
+ state = (ExprState *) cstate;
+ state->evalfunc = (ExprStateEvalFunc) ExecEvalExcluded;
+ }
+ break;
case T_TargetEntry:
{
TargetEntry *tle = (TargetEntry *) node;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 022041b..cb8e4f6 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -885,7 +885,7 @@ ExecCloseScanRelation(Relation scanrel)
* ----------------------------------------------------------------
*/
void
-ExecOpenIndices(ResultRelInfo *resultRelInfo)
+ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
{
Relation resultRelation = resultRelInfo->ri_RelationDesc;
List *indexoidlist;
@@ -938,6 +938,13 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo)
/* extract index key information from the index's pg_index info */
ii = BuildIndexInfo(indexDesc);
+ /*
+ * Iff the indexes are to be used for speculative insertion, add extra
+ * information required by unique index entries
+ */
+ if (speculative && ii->ii_Unique)
+ AddUniqueSpeculative(indexDesc, ii);
+
relationDescs[i] = indexDesc;
indexInfoArray[i] = ii;
i++;
@@ -990,7 +997,8 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
*
* This returns a list of index OIDs for any unique or exclusion
* constraints that are deferred and that had
- * potential (unconfirmed) conflicts.
+ * potential (unconfirmed) conflicts. (If noDupErr is true, the
+ * same is done for non-deferred constraints.)
*
* CAUTION: this must not be called for a HOT update.
* We can't defend against that here for lack of info.
@@ -1000,7 +1008,9 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
List *
ExecInsertIndexTuples(TupleTableSlot *slot,
ItemPointer tupleid,
- EState *estate)
+ EState *estate,
+ bool noDupErr,
+ Oid arbiterIdx)
{
List *result = NIL;
ResultRelInfo *resultRelInfo;
@@ -1070,7 +1080,17 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
/* Skip this index-update if the predicate isn't satisfied */
if (!ExecQual(predicate, econtext, false))
+ {
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ ereport(ERROR,
+ (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ errmsg("partial arbiter unique index has predicate that does not cover tuple proposed for insertion"),
+ errdetail("ON CONFLICT inference clause implies that the tuple proposed for insertion must be covered by predicate for partial index \"%s\".",
+ RelationGetRelationName(indexRelation)),
+ errtableconstraint(heapRelation,
+ RelationGetRelationName(indexRelation))));
continue;
+ }
}
/*
@@ -1092,9 +1112,16 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* For a deferrable unique index, we tell the index AM to just detect
* possible non-uniqueness, and we add the index OID to the result
* list if further checking is needed.
+ *
+ * For a speculative insertion (ON CONFLICT UPDATE/IGNORE), just detect
+ * possible non-uniqueness, and tell the caller if it failed.
*/
if (!indexRelation->rd_index->indisunique)
checkUnique = UNIQUE_CHECK_NO;
+ else if (noDupErr && arbiterIdx == InvalidOid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
+ else if (noDupErr && arbiterIdx == indexRelation->rd_index->indexrelid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
else if (indexRelation->rd_index->indimmediate)
checkUnique = UNIQUE_CHECK_YES;
else
@@ -1112,8 +1139,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* If the index has an associated exclusion constraint, check that.
* This is simpler than the process for uniqueness checks since we
* always insert first and then check. If the constraint is deferred,
- * we check now anyway, but don't throw error on violation; instead
- * we'll queue a recheck event.
+ * we check now anyway, but don't throw error on violation or wait for
+ * a conclusive outcome from a concurrent insertion; instead we'll
+ * queue a recheck event. Similarly, noDupErr callers (speculative
+ * inserters) will recheck later, and wait for a conclusive outcome
+ * then.
*
* An index for an exclusion constraint can't also be UNIQUE (not an
* essential property, we just don't allow it in the grammar), so no
@@ -1121,13 +1151,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*/
if (indexInfo->ii_ExclusionOps != NULL)
{
- bool errorOK = !indexRelation->rd_index->indimmediate;
+ bool violationOK = (!indexRelation->rd_index->indimmediate ||
+ noDupErr);
satisfiesConstraint =
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- tupleid, values, isnull,
- estate, false, errorOK);
+ check_exclusion_or_unique_constraint(heapRelation,
+ indexRelation, indexInfo,
+ tupleid, values, isnull,
+ estate, false,
+ violationOK, false, NULL);
}
if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1135,7 +1167,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
!satisfiesConstraint)
{
/*
- * The tuple potentially violates the uniqueness or exclusion
+ * The tuple potentially violates the unique index or exclusion
* constraint, so make a note of the index so that we can re-check
* it later.
*/
@@ -1146,18 +1178,154 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
return result;
}
+/* ----------------------------------------------------------------
+ * ExecCheckIndexConstraints
+ *
+ * This routine checks if a tuple violates any unique or
+ * exclusion constraints. If no conflict, returns true.
+ * Otherwise returns false, and the TID of the conflicting
+ * tuple is returned in *conflictTid.
+ *
+ * Note that this doesn't lock the values in any way, so it's
+ * possible that a conflicting tuple is inserted immediately
+ * after this returns, and a later insert with the same values
+ * still conflicts. But this can be used for a pre-check before
+ * insertion.
+ * ----------------------------------------------------------------
+ */
+bool
+ExecCheckIndexConstraints(TupleTableSlot *slot,
+ EState *estate, ItemPointer conflictTid,
+ Oid arbiterIdx)
+{
+ ResultRelInfo *resultRelInfo;
+ int i;
+ int numIndices;
+ RelationPtr relationDescs;
+ Relation heapRelation;
+ IndexInfo **indexInfoArray;
+ ExprContext *econtext;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ ItemPointerData invalidItemPtr;
+ bool checkedIndex = false;
+
+ ItemPointerSetInvalid(conflictTid);
+ ItemPointerSetInvalid(&invalidItemPtr);
+
+ /*
+ * Get information from the result relation info structure.
+ */
+ resultRelInfo = estate->es_result_relation_info;
+ numIndices = resultRelInfo->ri_NumIndices;
+ relationDescs = resultRelInfo->ri_IndexRelationDescs;
+ indexInfoArray = resultRelInfo->ri_IndexRelationInfo;
+ heapRelation = resultRelInfo->ri_RelationDesc;
+
+ /*
+ * We will use the EState's per-tuple context for evaluating predicates
+ * and index expressions (creating it if it's not already there).
+ */
+ econtext = GetPerTupleExprContext(estate);
+
+ /* Arrange for econtext's scan tuple to be the tuple under test */
+ econtext->ecxt_scantuple = slot;
+
+ /*
+ * For each index, form the index tuple and check whether it satisfies
+ * the constraint. Nothing is inserted here.
+ */
+ for (i = 0; i < numIndices; i++)
+ {
+ Relation indexRelation = relationDescs[i];
+ IndexInfo *indexInfo;
+ bool satisfiesConstraint;
+
+ if (indexRelation == NULL)
+ continue;
+
+ indexInfo = indexInfoArray[i];
+
+ if (!indexInfo->ii_Unique && !indexInfo->ii_ExclusionOps)
+ continue;
+
+ /* If the index is marked as read-only, ignore it */
+ if (!indexInfo->ii_ReadyForInserts)
+ continue;
+
+ /* When specific arbiter index requested, only examine it */
+ if (arbiterIdx != InvalidOid &&
+ arbiterIdx != indexRelation->rd_index->indexrelid)
+ continue;
+
+ checkedIndex = true;
+
+ /* Check for partial index */
+ if (indexInfo->ii_Predicate != NIL)
+ {
+ List *predicate;
+
+ /*
+ * If predicate state not set up yet, create it (in the estate's
+ * per-query context)
+ */
+ predicate = indexInfo->ii_PredicateState;
+ if (predicate == NIL)
+ {
+ predicate = (List *)
+ ExecPrepareExpr((Expr *) indexInfo->ii_Predicate,
+ estate);
+ indexInfo->ii_PredicateState = predicate;
+ }
+
+ /* Skip this index if the predicate isn't satisfied */
+ if (!ExecQual(predicate, econtext, false))
+ continue;
+ }
+
+ /*
+ * FormIndexDatum fills in its values and isnull parameters with the
+ * appropriate values for the column(s) of the index.
+ */
+ FormIndexDatum(indexInfo,
+ slot,
+ estate,
+ values,
+ isnull);
+
+ satisfiesConstraint =
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &invalidItemPtr,
+ values, isnull, estate, false,
+ true, true, conflictTid);
+ if (!satisfiesConstraint)
+ return false;
+
+ /* If this was a user-specified arbiter index, we're done */
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ break;
+ }
+
+ if (arbiterIdx != InvalidOid && !checkedIndex)
+ elog(ERROR, "unexpected failure to find arbiter unique index");
+
+ return true;
+}
+
/*
- * Check for violation of an exclusion constraint
+ * Check for violation of an exclusion or unique constraint
*
* heap: the table containing the new tuple
* index: the index supporting the exclusion constraint
* indexInfo: info about the index, including the exclusion properties
- * tupleid: heap TID of the new tuple we have just inserted
+ * tupleid: heap TID of the new tuple we have just inserted (invalid if we
+ * haven't inserted a new tuple yet)
* values, isnull: the *index* column values computed for the new tuple
* estate: an EState we can do evaluation in
* newIndex: if true, we are trying to build a new index (this affects
* only the wording of error messages)
- * errorOK: if true, don't throw error for violation
+ * violationOK: if true, don't throw error for violation
+ * wait: if true, wait for conflicting transaction to finish, even if
+ * violationOK
+ * conflictTid: if not-NULL, the TID of conflicting tuple is returned here.
*
* Returns true if OK, false if actual or potential violation
*
@@ -1167,16 +1335,25 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* is convenient for deferred exclusion checks; we need not bother queuing
* a deferred event if there is definitely no conflict at insertion time.
*
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
* is impossible.
+ *
+ * Note: The indexam is normally responsible for checking unique constraints,
+ * so this normally only needs to be used for exclusion constraints. But this
+ * function is also called when doing a "pre-check" for conflicts, for the
+ * benefit of speculative insertion. Caller may request that conflict TID be
+ * set, to take further steps.
*/
bool
-check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
- ItemPointer tupleid, Datum *values, bool *isnull,
- EState *estate, bool newIndex, bool errorOK)
+check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo, ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate, bool newIndex,
+ bool violationOK, bool wait,
+ ItemPointer conflictTid)
{
- Oid *constr_procs = indexInfo->ii_ExclusionProcs;
- uint16 *constr_strats = indexInfo->ii_ExclusionStrats;
+ Oid *constr_procs;
+ uint16 *constr_strats;
Oid *index_collations = index->rd_indcollation;
int index_natts = index->rd_index->indnatts;
IndexScanDesc index_scan;
@@ -1190,6 +1367,17 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
TupleTableSlot *existing_slot;
TupleTableSlot *save_scantuple;
+ if (indexInfo->ii_ExclusionOps)
+ {
+ constr_procs = indexInfo->ii_ExclusionProcs;
+ constr_strats = indexInfo->ii_ExclusionStrats;
+ }
+ else
+ {
+ constr_procs = indexInfo->ii_UniqueProcs;
+ constr_strats = indexInfo->ii_UniqueStrats;
+ }
+
/*
* If any of the input values are NULL, the constraint check is assumed to
* pass (i.e., we assume the operators are strict).
@@ -1254,7 +1442,8 @@ retry:
/*
* Ignore the entry for the tuple we're trying to check.
*/
- if (ItemPointerEquals(tupleid, &tup->t_self))
+ if (ItemPointerIsValid(tupleid) &&
+ ItemPointerEquals(tupleid, &tup->t_self))
{
if (found_self) /* should not happen */
elog(ERROR, "found self tuple multiple times in index \"%s\"",
@@ -1288,9 +1477,11 @@ retry:
* we're not supposed to raise error, just return the fact of the
* potential conflict without waiting to see if it's real.
*/
- if (errorOK)
+ if (violationOK && !wait)
{
conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
break;
}
@@ -1309,14 +1500,29 @@ retry:
{
ctid_wait = tup->t_data->t_ctid;
index_endscan(index_scan);
- XactLockTableWait(xwait, heap, &ctid_wait,
- XLTW_RecheckExclusionConstr);
+ if (DirtySnapshot.speculativeToken)
+ SpeculativeInsertionWait(DirtySnapshot.xmin,
+ DirtySnapshot.speculativeToken);
+ else if (violationOK)
+ XactLockTableWait(xwait, heap, &tup->t_self,
+ XLTW_RecheckExclusionConstr);
+ else
+ XactLockTableWait(xwait, heap, &ctid_wait,
+ XLTW_RecheckExclusionConstr);
goto retry;
}
/*
- * We have a definite conflict. Report it.
+ * We have a definite conflict. Return it to caller, or report it.
*/
+ if (violationOK)
+ {
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+ break;
+ }
+
error_new = BuildIndexValueDescription(index, values, isnull);
error_existing = BuildIndexValueDescription(index, existing_values,
existing_isnull);
@@ -1352,6 +1558,9 @@ retry:
* However, it is possible to define exclusion constraints for which that
* wouldn't be true --- for instance, if the operator is <>. So we no
* longer complain if found_self is still false.
+ *
+ * It would also not be true in the pre-check mode, when we haven't
+ * inserted a tuple yet.
*/
econtext->ecxt_scantuple = save_scantuple;
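As a reading aid for the execUtils.c changes above, the way ExecInsertIndexTuples() now picks a uniqueness-check mode for each index can be sketched as a pure function. This is a sketch under stated assumptions rather than patch code: choose_unique_check() is a hypothetical helper, Oid / InvalidOid are local stand-ins, and the enum lists only the three modes that branch chooses between.

```c
#include <stdbool.h>

/* Local stand-ins (assumptions) so the sketch builds without PostgreSQL headers. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

typedef enum
{
	UNIQUE_CHECK_NO,		/* not a unique index, no check */
	UNIQUE_CHECK_YES,		/* enforce uniqueness at insertion time */
	UNIQUE_CHECK_PARTIAL	/* only detect a possible conflict, don't error */
} IndexUniqueCheck;

/*
 * Hypothetical helper mirroring the if/else chain in ExecInsertIndexTuples():
 * a speculative inserter (noDupErr) gets a non-erroring check on every unique
 * index when no arbiter was named, or only on the named arbiter index.
 */
static IndexUniqueCheck
choose_unique_check(bool indisunique, bool indimmediate, Oid indexrelid,
					bool noDupErr, Oid arbiterIdx)
{
	if (!indisunique)
		return UNIQUE_CHECK_NO;
	if (noDupErr && (arbiterIdx == InvalidOid || arbiterIdx == indexrelid))
		return UNIQUE_CHECK_PARTIAL;
	if (indimmediate)
		return UNIQUE_CHECK_YES;
	return UNIQUE_CHECK_PARTIAL;	/* deferrable unique index */
}
```

One consequence worth noting: when an arbiter index is named, every other immediate unique index still gets UNIQUE_CHECK_YES, so a duplicate in a non-arbiter index raises the usual error instead of taking the ON CONFLICT path.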
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 48107d9..4699060 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -151,10 +151,11 @@ lnext:
* case, so as to avoid the "Halloween problem" of repeated
* update attempts. In the latter case it might be sensible
* to fetch the updated tuple instead, but doing so would
- * require changing heap_lock_tuple as well as heap_update and
- * heap_delete to not complain about updating "invisible"
- * tuples, which seems pretty scary. So for now, treat the
- * tuple as deleted and do not process.
+ * require changing heap_update and heap_delete to not complain
+ * about updating "invisible" tuples, which seems pretty scary
+ * (heap_lock_tuple will not complain, but few callers expect
+ * HeapTupleInvisible, and we're not one of them). So for now,
+ * treat the tuple as deleted and do not process.
*/
goto lnext;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index f96fb24..5411896 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -46,12 +46,23 @@
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tqual.h"
+static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
+ ModifyTableState *onConflict,
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning);
+
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
* target relation's rowtype
@@ -151,6 +162,36 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
return ExecProject(projectReturning, NULL);
}
+/*
+ * ExecCheckHeapTupleVisible -- verify heap tuple is visible
+ *
+ * It would not be consistent with guarantees of the higher isolation levels to
+ * proceed with avoiding insertion (taking speculative insertion's alternative
+ * IGNORE/UPDATE path) on the basis of another tuple that is not visible.
+ * Check for the need to raise a serialization failure, and do so as necessary.
+ */
+static void
+ExecCheckHeapTupleVisible(EState *estate,
+ ResultRelInfo *relinfo,
+ ItemPointer tid)
+{
+
+ Relation rel = relinfo->ri_RelationDesc;
+ Buffer buffer;
+ HeapTupleData tuple;
+
+ if (!IsolationUsesXactSnapshot())
+ return;
+
+ tuple.t_self = *tid;
+ if (!heap_fetch(rel, estate->es_snapshot, &tuple, &buffer, false, NULL))
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent insert or update dictating alternative ON CONFLICT path")));
+
+ ReleaseBuffer(buffer);
+}
+
/* ----------------------------------------------------------------
* ExecInsert
*
@@ -163,6 +204,9 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
static TupleTableSlot *
ExecInsert(TupleTableSlot *slot,
TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ Oid arbiterIndex,
+ SpecCmd spec,
EState *estate,
bool canSetTag)
{
@@ -246,6 +290,8 @@ ExecInsert(TupleTableSlot *slot,
}
else
{
+ ItemPointerData conflictTid;
+
/*
* Constraints might reference the tableoid column, so initialize
* t_tableOid before evaluating them.
@@ -259,20 +305,138 @@ ExecInsert(TupleTableSlot *slot,
ExecConstraints(resultRelInfo, slot, estate);
/*
+ * If we are performing speculative insertion, do a non-conclusive
+ * check for conflicts.
+ *
+ * Control returns here when there is 1) a row-locking conflict, or 2)
+ * an insertion conflict. See the executor README for a full
+ * discussion of speculative insertion.
+ */
+vlock:
+
+ /*
+ * XXX If we know or assume that there are few duplicates, it would be
+ * better to skip this, and just optimistically proceed with the
+ * insertion below.
+ */
+ if (spec != SPEC_NONE && resultRelInfo->ri_NumIndices > 0)
+ {
+ /*
+ * Check if it's required to proceed with the second phase
+ * ("insertion proper") of speculative insertion in respect of the
+ * slot. If insertion ultimately does not proceed, no firing of
+ * AFTER ROW INSERT triggers occurs.
+ *
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal. On the one hand,
+ * we cannot even consider uniqueness violations until these
+ * triggers have fired, since they may modify the proposed tuple.
+ * On the other hand, they can execute arbitrary user-defined code,
+ * which may perform operations entirely outside the system's
+ * ability to nullify.
+ */
+ if (!ExecCheckIndexConstraints(slot, estate, &conflictTid,
+ arbiterIndex))
+ {
+ TupleTableSlot *returning = NULL;
+
+ /*
+ * Lock and consider updating in the SPEC_INSERT case. For the
+ * SPEC_IGNORE case, it's still necessary to verify that the
+ * tuple is visible to the executor's MVCC snapshot.
+ */
+ if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
+ &conflictTid,
+ planSlot,
+ slot,
+ onConflict,
+ estate,
+ canSetTag,
+ &returning))
+ goto vlock;
+ else if (spec == SPEC_IGNORE)
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
+
+ /*
+ * RETURNING may have been processed already -- ExecUpdate()
+ * projects the RETURNING list for the target ResultRelInfo
+ * when that is required. Inserted and updated tuples are
+ * projected in the same way for ON CONFLICT UPDATE with
+ * RETURNING.
+ *
+ * Since there was no row-locking conflict, we're done.
+ */
+ return returning;
+ }
+
+ /*
+ * Before we start insertion proper, acquire our "promise tuple
+ * insertion lock". Others can use that (rather than an XID lock,
+ * which is appropriate only for non-promise tuples) to wait for us
+ * to decide if we're going to go ahead with the insertion.
+ */
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+ }
+
+ /*
* insert the tuple
*
* Note: heap_insert returns the tid (location) of the new tuple in
* the t_self field.
*/
newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, 0, NULL);
+ estate->es_output_cid,
+ spec != SPEC_NONE ? HEAP_INSERT_SPECULATIVE : 0,
+ NULL);
/*
* insert index entries for tuple
*/
if (resultRelInfo->ri_NumIndices > 0)
+ {
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, spec != SPEC_NONE,
+ arbiterIndex);
+
+ if (spec != SPEC_NONE)
+ {
+ HeapUpdateFailureData hufd;
+
+ /*
+ * Consider possible race: concurrent insertion conflicts with
+ * our speculative heap tuple. Must then "super-delete" the
+ * heap tuple and retry from the start.
+ *
+ * This is occasionally necessary so that "unprincipled
+ * deadlocks" are avoided; now that a conflict was found,
+ * other sessions should not wait on our speculative token, and
+ * they certainly shouldn't treat our speculatively-inserted
+ * heap tuple as an ordinary tuple, which would oblige them to
+ * wait on the outcome of our xact to UPDATE/DELETE. This makes
+ * heap tuples behave as conceptual "value locks" of short
+ * duration, distinct from ordinary tuples, for which other xacts
+ * must wait for the inserting xact to end in the event of a
+ * possible unique/exclusion violation (the violation that
+ * arbitrates taking the alternative UPDATE/IGNORE path).
+ */
+ if (recheckIndexes)
+ heap_delete(resultRelationDesc, &(tuple->t_self),
+ estate->es_output_cid, InvalidSnapshot, false,
+ &hufd, true);
+
+ Assert(hufd.cmax == estate->es_output_cid);
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+
+ if (recheckIndexes)
+ {
+ list_free(recheckIndexes);
+ goto vlock;
+ }
+
+ /* since there was no insertion conflict, we're done */
+ }
+ }
}
if (canSetTag)
@@ -399,7 +563,8 @@ ldelete:;
estate->es_output_cid,
estate->es_crosscheck_snapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd,
+ false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -768,7 +933,7 @@ lreplace:;
*/
if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false, InvalidOid);
}
if (canSetTag)
@@ -792,6 +957,236 @@ lreplace:;
return NULL;
}
+/* ----------------------------------------------------------------
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
+ *
+ * Returns true if we're done (with or without an update), or false
+ * if the executor must start from scratch.
+ * ----------------------------------------------------------------
+ */
+static bool
+ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
+ ModifyTableState *onConflict,
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning)
+{
+ Relation relation = resultRelInfo->ri_RelationDesc;
+ HeapTupleData tuple;
+ HeapTuple copyTuple = NULL;
+ HeapUpdateFailureData hufd;
+ HTSU_Result test;
+ Buffer buffer;
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * Lock tuple for update.
+ *
+ * Like EvalPlanQualFetch(), don't follow updates. There is no actual
+ * benefit to doing so, since as discussed below, a conflict invalidates
+ * our previous conclusion that the tuple is the conclusively committed
+ * conflicting tuple.
+ */
+ tuple.t_self = *conflictTid;
+ test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
+ LockTupleExclusive, LockWaitBlock, false, &buffer,
+ &hufd);
+
+ if (test == HeapTupleMayBeUpdated)
+ copyTuple = heap_copytuple(&tuple);
+
+ switch (test)
+ {
+ case HeapTupleInvisible:
+ /*
+ * This may occur when an instantaneously invisible tuple is blamed
+ * as a conflict because multiple rows are inserted with the same
+ * constrained values.
+ *
+ * We cannot proceed, because to do so would leave users open to
+ * the risk that the same row will be updated a second time in the
+ * same command; allowing more than one update of a single row
+ * within the same command would leave the update order
+ * undefined. It is the user's responsibility to resolve these
+ * self-duplicates before proposing a set of tuples for insertion,
+ * so raise an error rather than guess. These problems are why
+ * SQL-2003 similarly specifies that for SQL MERGE, an exception
+ * must be raised in the event of an attempt to update the same row twice.
+ *
+ * XXX It might be preferable to do something similar when a row is
+ * locked twice (and not updated twice) by the same speculative
+ * insertion, as if to take each lock acquisition as an indication
+ * of a discrete, unfulfilled intent to update (perhaps in some
+ * later command of the same xact). This does not seem feasible,
+ * though.
+ */
+ if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data)))
+ ereport(ERROR,
+ (errcode(ERRCODE_CARDINALITY_VIOLATION),
+ errmsg("ON CONFLICT UPDATE command could not lock/update self-inserted tuple"),
+ errhint("Ensure that no rows proposed for insertion within the same command have duplicate constrained values.")));
+
+ /* This shouldn't happen */
+ elog(ERROR, "attempted to lock invisible tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleSelfUpdated:
+ /*
+ * XXX In practice this is dead code, since BEFORE triggers fire
+ * prior to speculative insertion. Since a dirty snapshot is used
+ * to find possible conflict tuples, speculative insertion could
+ * not have seen the old/MVCC-current row version at all (even if
+ * it was only rendered old by this same command).
+ */
+ elog(ERROR, "unexpected self-updated tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleMayBeUpdated:
+ /*
+ * Success -- we're done, as tuple is locked. Verify that the
+ * tuple is known to be visible to our snapshot under conventional
+ * MVCC rules if the current isolation level mandates that. In
+ * READ COMMITTED mode, we can lock and update a tuple still in
+ * progress according to our snapshot, but higher isolation levels
+ * cannot avail of that, and must actively defend against doing so.
+ * We might get a serialization failure within ExecUpdate() anyway
+ * if this step was skipped, but this cannot be relied on, for
+ * example because the auxiliary WHERE clause happened to not be
+ * satisfied.
+ */
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &tuple.t_data->t_ctid);
+
+ /*
+ * This loosening of snapshot isolation for the benefit of READ
+ * COMMITTED speculative insertions is used consistently:
+ * speculative quals are only tested against already locked tuples.
+ * It would be rather inconsistent to UPDATE when no tuple version
+ * is MVCC-visible (which seems inevitable since we must *do
+ * something* there, and "READ COMMITTED serialization failures"
+ * are unappealing), while also avoiding updating here entirely on
+ * the basis of a non-conclusive tuple version (the version that
+ * happens to be visible to this command's MVCC snapshot, or a
+ * subsequent non-conclusive version).
+ *
+ * In other words: Only the final, conclusively locked tuple
+ * (which must have the same value in the relevant constrained
+ * attribute(s) as the value previously "value locked") matters.
+ */
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStartNode(onConflict->ps.instrument);
+
+ /*
+ * Conceptually, the parent ModifyTable is like a relation scan
+ * node that uses a dirty snapshot, returning rows which the
+ * auxiliary plan must operate on (if only to lock all such rows).
+ * EvalPlanQual() is involved in the evaluation of their UPDATE,
+ * regardless of whether or not the tuple is visible to the
+ * command's MVCC Snapshot.
+ */
+ EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
+
+ /*
+ * Save EPQ expression context. Auxiliary plan's scan node (which
+ * would have been just initialized by EvalPlanQualBegin() on the
+ * first time through here per query) cannot fail to provide one.
+ */
+ econtext = onConflict->mt_epqstate.planstate->ps_ExprContext;
+
+ /*
+ * UPDATE affects the same ResultRelation as INSERT in the context
+ * of ON CONFLICT UPDATE, so parent's target rti is used
+ */
+ EvalPlanQualSetTuple(&onConflict->mt_epqstate,
+ resultRelInfo->ri_RangeTableIndex, copyTuple);
+
+ /*
+ * Make available rejected tuple for referencing within UPDATE
+ * expression (that is, make available a slot with the rejected
+ * tuple, possibly already modified by BEFORE INSERT row triggers).
+ *
+ * This is for the benefit of any ExcludedExpr that may appear
+ * within UPDATE's targetlist or WHERE clause. The EXCLUDED tuple
+ * may be referenced as an ExcludedExpr, which exists purely for our
+ * benefit. The nested ExcludedExpr's Var will necessarily have an
+ * INNER_VAR varno on the assumption that the inner slot of the EPQ
+ * scan plan state's expression context will contain the EXCLUDED
+ * heaptuple slot (that is, on the assumption that during
+ * expression evaluation, the ecxt_innertuple will be assigned the
+ * insertSlot by this codepath, in advance of expression
+ * evaluation).
+ *
+ * See handling of ExcludedExpr within rewriteHandler.c and
+ * execQual.c.
+ */
+ econtext->ecxt_innertuple = insertSlot;
+
+ slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+
+ if (!TupIsNull(slot))
+ *returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
+ planSlot, &onConflict->mt_epqstate,
+ onConflict->ps.state, canSetTag);
+
+ ReleaseBuffer(buffer);
+
+ /*
+ * As when executing an UPDATE's ModifyTable node in the
+ * conventional manner, reset the per-output-tuple ExprContext
+ */
+ ResetPerTupleExprContext(onConflict->ps.state);
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStopNode(onConflict->ps.instrument, *returning ? 1 : 0);
+
+ return true;
+ case HeapTupleUpdated:
+ if (IsolationUsesXactSnapshot())
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent update")));
+
+ /*
+ * Tell caller to try again from the very start. We don't use the
+ * usual EvalPlanQual() looping pattern here, fundamentally because
+ * we don't have a useful qual to verify the next tuple with. Our
+ * "qual" is really any user-supplied qual AND the unique
+ * constraint "col OP value" implied by a speculative insertion
+ * conflict. However, because of the selective evaluation of the
+ * former "qual" (the interactions with MVCC and row locking), this
+ * is an over-simplification.
+ *
+ * We might devise a means of verifying, by way of binary equality
+ * in a similar manner to HOT codepaths, if any unique indexed
+ * columns changed, but this would only serve to ameliorate the
+ * fundamental problem. It might well not be good enough, because
+ * those columns could change too. It seems unlikely that working
+ * harder here is worthwhile.
+ *
+ * At this point, all bets are off -- it might actually turn out to
+ * be okay to proceed with insertion instead of locking now (the
+ * tuple we attempted to lock could have been deleted, for
+ * example). On the other hand, it might not be okay, but for an
+ * entirely different reason, with an entirely separate TID to
+ * blame and lock. This TID may not even be part of the same
+ * update chain.
+ */
+ ReleaseBuffer(buffer);
+ return false;
+ default:
+ elog(ERROR, "unrecognized heap_lock_tuple status: %u", test);
+ }
+
+ return false;
+}
+
/*
* Process BEFORE EACH STATEMENT triggers
@@ -803,6 +1198,9 @@ fireBSTriggers(ModifyTableState *node)
{
case CMD_INSERT:
ExecBSInsertTriggers(node->ps.state, node->resultRelInfo);
+ if (node->spec == SPEC_INSERT)
+ ExecBSUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
break;
case CMD_UPDATE:
ExecBSUpdateTriggers(node->ps.state, node->resultRelInfo);
@@ -825,6 +1223,9 @@ fireASTriggers(ModifyTableState *node)
switch (node->operation)
{
case CMD_INSERT:
+ if (node->spec == SPEC_INSERT)
+ ExecASUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
ExecASInsertTriggers(node->ps.state, node->resultRelInfo);
break;
case CMD_UPDATE:
@@ -852,6 +1253,8 @@ ExecModifyTable(ModifyTableState *node)
{
EState *estate = node->ps.state;
CmdType operation = node->operation;
+ ModifyTableState *onConflict = (ModifyTableState *) node->onConflict;
+ SpecCmd spec = node->spec;
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
PlanState *subplanstate;
@@ -1022,7 +1425,9 @@ ExecModifyTable(ModifyTableState *node)
switch (operation)
{
case CMD_INSERT:
- slot = ExecInsert(slot, planSlot, estate, node->canSetTag);
+ slot = ExecInsert(slot, planSlot, onConflict,
+ node->arbiterIndex, spec, estate,
+ node->canSetTag);
break;
case CMD_UPDATE:
slot = ExecUpdate(tupleid, oldtuple, slot, planSlot,
@@ -1070,6 +1475,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ModifyTableState *mtstate;
CmdType operation = node->operation;
+ Plan *onConflictPlan = node->onConflictPlan;
int nplans = list_length(node->plans);
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
@@ -1097,6 +1503,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->resultRelInfo = estate->es_result_relations + node->resultRelIndex;
mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
mtstate->mt_nplans = nplans;
+ mtstate->spec = node->spec;
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -1135,7 +1542,15 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
operation != CMD_DELETE &&
resultRelInfo->ri_IndexRelationDescs == NULL)
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, mtstate->spec != SPEC_NONE);
+
+ /*
+ * ON CONFLICT UPDATE variant must have unique index to arbitrate on
+ * taking alternative path
+ */
+ Assert(node->spec != SPEC_INSERT || node->arbiterIndex != InvalidOid);
+
+ mtstate->arbiterIndex = node->arbiterIndex;
/* Now init the plan for this result rel */
estate->es_result_relation_info = resultRelInfo;
@@ -1308,7 +1723,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
break;
case CMD_UPDATE:
case CMD_DELETE:
- junk_filter_needed = true;
+ junk_filter_needed = (node->spec == SPEC_NONE);
break;
default:
elog(ERROR, "unknown operation");
@@ -1373,6 +1788,30 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
}
/*
+ * Initialize auxiliary ModifyTable node for INSERT...ON CONFLICT UPDATE.
+ *
+ * The UPDATE portion of the query is essentially represented as auxiliary
+ * to INSERT state at all stages of query processing, with a representation
+ * at each stage that is analogous to a regular UPDATE.
+ */
+ if (onConflictPlan)
+ {
+ PlanState *pstate;
+
+ Assert(mtstate->spec == SPEC_INSERT);
+
+ /*
+ * Initialize auxiliary child plan.
+ *
+ * ExecModifyTable() is never called for auxiliary update
+ * ModifyTableState. Execution of the auxiliary plan is driven by its
+ * parent in an ad-hoc fashion.
+ */
+ pstate = ExecInitNode(onConflictPlan, estate, eflags);
+ mtstate->onConflict = pstate;
+ }
+
+ /*
* Set up a tuple table slot for use for trigger output tuples. In a plan
* containing multiple ModifyTable nodes, all can share one such slot, so
* we keep it in the estate.
@@ -1387,9 +1826,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* ModifyTable node too, but there's no need.) Note the use of lcons not
* lappend: we need later-initialized ModifyTable nodes to be shut down
* before earlier ones. This ensures that we don't throw away RETURNING
- * rows that need to be seen by a later CTE subplan.
+ * rows that need to be seen by a later CTE subplan. We do not want to
+ * append an auxiliary ON CONFLICT UPDATE node either, since it must have a
+ * parent SPEC_INSERT ModifyTable node that it is auxiliary to that
+ * directly drives execution of what is logically a single unified
+ * statement (*that* plan will be appended here, though). If it must
+ * project updated rows, that will only ever be done through the parent.
*/
- if (!mtstate->canSetTag)
+ if (!mtstate->canSetTag && mtstate->spec != SPEC_UPDATE)
estate->es_auxmodifytables = lcons(mtstate,
estate->es_auxmodifytables);
@@ -1442,6 +1886,8 @@ ExecEndModifyTable(ModifyTableState *node)
*/
for (i = 0; i < node->mt_nplans; i++)
ExecEndNode(node->mt_plans[i]);
+
+ ExecEndNode(node->onConflict);
}
void
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 00ffe4a..df611d2 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -178,6 +178,9 @@ _copyModifyTable(const ModifyTable *from)
COPY_NODE_FIELD(resultRelations);
COPY_SCALAR_FIELD(resultRelIndex);
COPY_NODE_FIELD(plans);
+ COPY_SCALAR_FIELD(spec);
+ COPY_SCALAR_FIELD(arbiterIndex);
+ COPY_NODE_FIELD(onConflictPlan);
COPY_NODE_FIELD(withCheckOptionLists);
COPY_NODE_FIELD(returningLists);
COPY_NODE_FIELD(fdwPrivLists);
@@ -1776,6 +1779,19 @@ _copyCurrentOfExpr(const CurrentOfExpr *from)
}
/*
+ * _copyExcludedExpr
+ */
+static ExcludedExpr *
+_copyExcludedExpr(const ExcludedExpr *from)
+{
+ ExcludedExpr *newnode = makeNode(ExcludedExpr);
+
+ COPY_NODE_FIELD(arg);
+
+ return newnode;
+}
+
+/*
* _copyTargetEntry
*/
static TargetEntry *
@@ -2120,6 +2136,31 @@ _copyWithClause(const WithClause *from)
return newnode;
}
+static InferClause *
+_copyInferClause(const InferClause *from)
+{
+ InferClause *newnode = makeNode(InferClause);
+
+ COPY_NODE_FIELD(indexElems);
+ COPY_NODE_FIELD(whereClause);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
+static ConflictClause *
+_copyConflictClause(const ConflictClause *from)
+{
+ ConflictClause *newnode = makeNode(ConflictClause);
+
+ COPY_SCALAR_FIELD(specclause);
+ COPY_NODE_FIELD(infer);
+ COPY_NODE_FIELD(updatequery);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
static CommonTableExpr *
_copyCommonTableExpr(const CommonTableExpr *from)
{
@@ -2525,6 +2566,10 @@ _copyQuery(const Query *from)
COPY_NODE_FIELD(jointree);
COPY_NODE_FIELD(targetList);
COPY_NODE_FIELD(withCheckOptions);
+ COPY_SCALAR_FIELD(specClause);
+ COPY_NODE_FIELD(arbiterExpr);
+ COPY_NODE_FIELD(arbiterWhere);
+ COPY_NODE_FIELD(onConflict);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(groupClause);
COPY_NODE_FIELD(havingQual);
@@ -2548,6 +2593,7 @@ _copyInsertStmt(const InsertStmt *from)
COPY_NODE_FIELD(relation);
COPY_NODE_FIELD(cols);
COPY_NODE_FIELD(selectStmt);
+ COPY_NODE_FIELD(confClause);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(withClause);
@@ -4254,6 +4300,9 @@ copyObject(const void *from)
case T_CurrentOfExpr:
retval = _copyCurrentOfExpr(from);
break;
+ case T_ExcludedExpr:
+ retval = _copyExcludedExpr(from);
+ break;
case T_TargetEntry:
retval = _copyTargetEntry(from);
break;
@@ -4721,6 +4770,12 @@ copyObject(const void *from)
case T_WithClause:
retval = _copyWithClause(from);
break;
+ case T_InferClause:
+ retval = _copyInferClause(from);
+ break;
+ case T_ConflictClause:
+ retval = _copyConflictClause(from);
+ break;
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 79035b2..24e58fa 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -681,6 +681,14 @@ _equalCurrentOfExpr(const CurrentOfExpr *a, const CurrentOfExpr *b)
}
static bool
+_equalExcludedExpr(const ExcludedExpr *a, const ExcludedExpr *b)
+{
+ COMPARE_NODE_FIELD(arg);
+
+ return true;
+}
+
+static bool
_equalTargetEntry(const TargetEntry *a, const TargetEntry *b)
{
COMPARE_NODE_FIELD(expr);
@@ -863,6 +871,10 @@ _equalQuery(const Query *a, const Query *b)
COMPARE_NODE_FIELD(jointree);
COMPARE_NODE_FIELD(targetList);
COMPARE_NODE_FIELD(withCheckOptions);
+ COMPARE_SCALAR_FIELD(specClause);
+ COMPARE_NODE_FIELD(arbiterExpr);
+ COMPARE_NODE_FIELD(arbiterWhere);
+ COMPARE_NODE_FIELD(onConflict);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(groupClause);
COMPARE_NODE_FIELD(havingQual);
@@ -884,6 +896,7 @@ _equalInsertStmt(const InsertStmt *a, const InsertStmt *b)
COMPARE_NODE_FIELD(relation);
COMPARE_NODE_FIELD(cols);
COMPARE_NODE_FIELD(selectStmt);
+ COMPARE_NODE_FIELD(confClause);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(withClause);
@@ -2426,6 +2439,27 @@ _equalWithClause(const WithClause *a, const WithClause *b)
}
static bool
+_equalInferClause(const InferClause *a, const InferClause *b)
+{
+ COMPARE_NODE_FIELD(indexElems);
+ COMPARE_NODE_FIELD(whereClause);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
+_equalConflictClause(const ConflictClause *a, const ConflictClause *b)
+{
+ COMPARE_SCALAR_FIELD(specclause);
+ COMPARE_NODE_FIELD(infer);
+ COMPARE_NODE_FIELD(updatequery);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
_equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
{
COMPARE_STRING_FIELD(ctename);
@@ -2694,6 +2728,9 @@ equal(const void *a, const void *b)
case T_CurrentOfExpr:
retval = _equalCurrentOfExpr(a, b);
break;
+ case T_ExcludedExpr:
+ retval = _equalExcludedExpr(a, b);
+ break;
case T_TargetEntry:
retval = _equalTargetEntry(a, b);
break;
@@ -3148,6 +3185,12 @@ equal(const void *a, const void *b)
case T_WithClause:
retval = _equalWithClause(a, b);
break;
+ case T_InferClause:
+ retval = _equalInferClause(a, b);
+ break;
+ case T_ConflictClause:
+ retval = _equalConflictClause(a, b);
+ break;
case T_CommonTableExpr:
retval = _equalCommonTableExpr(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 21dfda7..a9e1e13 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -235,6 +235,13 @@ exprType(const Node *expr)
case T_CurrentOfExpr:
type = BOOLOID;
break;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ type = exprType((Node *) n->arg);
+ }
+ break;
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -469,6 +476,12 @@ exprTypmod(const Node *expr)
return ((const CoerceToDomainValue *) expr)->typeMod;
case T_SetToDefault:
return ((const SetToDefault *) expr)->typeMod;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ return ((const Var *) n->arg)->vartypmod;
+ }
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
default:
@@ -894,6 +907,9 @@ exprCollation(const Node *expr)
case T_CurrentOfExpr:
coll = InvalidOid; /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ coll = exprCollation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -1089,6 +1105,12 @@ exprSetCollation(Node *expr, Oid collation)
case T_CurrentOfExpr:
Assert(!OidIsValid(collation)); /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ {
+ Var *v = (Var *) ((ExcludedExpr *) expr)->arg;
+ v->varcollid = collation;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
break;
@@ -1474,6 +1496,12 @@ exprLocation(const Node *expr)
case T_WithClause:
loc = ((const WithClause *) expr)->location;
break;
+ case T_InferClause:
+ loc = ((const InferClause *) expr)->location;
+ break;
+ case T_ConflictClause:
+ loc = ((const ConflictClause *) expr)->location;
+ break;
case T_CommonTableExpr:
loc = ((const CommonTableExpr *) expr)->location;
break;
@@ -1481,6 +1509,10 @@ exprLocation(const Node *expr)
/* just use argument's location */
loc = exprLocation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_ExcludedExpr:
+ /* just use nested expr's location */
+ loc = exprLocation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
@@ -1910,6 +1942,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_ExcludedExpr:
+ return walker(((ExcludedExpr *) node)->arg, context);
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -1958,6 +1992,12 @@ query_tree_walker(Query *query,
return true;
if (walker((Node *) query->withCheckOptions, context))
return true;
+ if (walker((Node *) query->arbiterExpr, context))
+ return true;
+ if (walker(query->arbiterWhere, context))
+ return true;
+ if (walker(query->onConflict, context))
+ return true;
if (walker((Node *) query->returningList, context))
return true;
if (walker((Node *) query->jointree, context))
@@ -2620,6 +2660,16 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExpr *newnode;
+
+ FLATCOPY(newnode, excludedexpr, ExcludedExpr);
+ MUTATE(newnode->arg, newnode->arg, Node *);
+ return (Node *) newnode;
+ }
+ break;
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -2699,6 +2749,9 @@ query_tree_mutator(Query *query,
MUTATE(query->targetList, query->targetList, List *);
MUTATE(query->withCheckOptions, query->withCheckOptions, List *);
+ MUTATE(query->arbiterExpr, query->arbiterExpr, List *);
+ MUTATE(query->arbiterWhere, query->arbiterWhere, Node *);
+ MUTATE(query->onConflict, query->onConflict, Node *);
MUTATE(query->returningList, query->returningList, List *);
MUTATE(query->jointree, query->jointree, FromExpr *);
MUTATE(query->setOperations, query->setOperations, Node *);
@@ -2968,6 +3021,8 @@ raw_expression_tree_walker(Node *node,
return true;
if (walker(stmt->selectStmt, context))
return true;
+ if (walker(stmt->confClause, context))
+ return true;
if (walker(stmt->returningList, context))
return true;
if (walker(stmt->withClause, context))
@@ -3207,6 +3262,27 @@ raw_expression_tree_walker(Node *node,
break;
case T_WithClause:
return walker(((WithClause *) node)->ctes, context);
+
+ case T_InferClause:
+ {
+ InferClause *stmt = (InferClause *) node;
+
+ if (walker(stmt->indexElems, context))
+ return true;
+ if (walker(stmt->whereClause, context))
+ return true;
+ }
+ break;
+ case T_ConflictClause:
+ {
+ ConflictClause *stmt = (ConflictClause *) node;
+
+ if (walker(stmt->infer, context))
+ return true;
+ if (walker(stmt->updatequery, context))
+ return true;
+ }
+ break;
case T_CommonTableExpr:
return walker(((CommonTableExpr *) node)->ctequery, context);
default:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index b4a2667..34e9163 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -330,6 +330,9 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_INT_FIELD(resultRelIndex);
WRITE_NODE_FIELD(plans);
+ WRITE_ENUM_FIELD(spec, SpecCmd);
+ WRITE_OID_FIELD(arbiterIndex);
+ WRITE_NODE_FIELD(onConflictPlan);
WRITE_NODE_FIELD(withCheckOptionLists);
WRITE_NODE_FIELD(returningLists);
WRITE_NODE_FIELD(fdwPrivLists);
@@ -1426,6 +1429,14 @@ _outCurrentOfExpr(StringInfo str, const CurrentOfExpr *node)
}
static void
+_outExcludedExpr(StringInfo str, const ExcludedExpr *node)
+{
+ WRITE_NODE_TYPE("EXCLUDED");
+
+ WRITE_NODE_FIELD(arg);
+}
+
+static void
_outTargetEntry(StringInfo str, const TargetEntry *node)
{
WRITE_NODE_TYPE("TARGETENTRY");
@@ -2301,6 +2312,10 @@ _outQuery(StringInfo str, const Query *node)
WRITE_NODE_FIELD(jointree);
WRITE_NODE_FIELD(targetList);
WRITE_NODE_FIELD(withCheckOptions);
+ WRITE_ENUM_FIELD(specClause, SpecCmd);
+ WRITE_NODE_FIELD(arbiterExpr);
+ WRITE_NODE_FIELD(arbiterWhere);
+ WRITE_NODE_FIELD(onConflict);
WRITE_NODE_FIELD(returningList);
WRITE_NODE_FIELD(groupClause);
WRITE_NODE_FIELD(havingQual);
@@ -3062,6 +3077,9 @@ _outNode(StringInfo str, const void *obj)
case T_CurrentOfExpr:
_outCurrentOfExpr(str, obj);
break;
+ case T_ExcludedExpr:
+ _outExcludedExpr(str, obj);
+ break;
case T_TargetEntry:
_outTargetEntry(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index dbc162a..48a7206 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -214,6 +214,10 @@ _readQuery(void)
READ_NODE_FIELD(jointree);
READ_NODE_FIELD(targetList);
READ_NODE_FIELD(withCheckOptions);
+ READ_ENUM_FIELD(specClause, SpecCmd);
+ READ_NODE_FIELD(arbiterExpr);
+ READ_NODE_FIELD(arbiterWhere);
+ READ_NODE_FIELD(onConflict);
READ_NODE_FIELD(returningList);
READ_NODE_FIELD(groupClause);
READ_NODE_FIELD(havingQual);
@@ -1128,6 +1132,19 @@ _readCurrentOfExpr(void)
}
/*
+ * _readExcludedExpr
+ */
+static ExcludedExpr *
+_readExcludedExpr(void)
+{
+ READ_LOCALS(ExcludedExpr);
+
+ READ_NODE_FIELD(arg);
+
+ READ_DONE();
+}
+
+/*
* _readTargetEntry
*/
static TargetEntry *
@@ -1392,6 +1409,8 @@ parseNodeString(void)
return_value = _readSetToDefault();
else if (MATCH("CURRENTOFEXPR", 13))
return_value = _readCurrentOfExpr();
+ else if (MATCH("EXCLUDED", 8))
+ return_value = _readExcludedExpr();
else if (MATCH("TARGETENTRY", 11))
return_value = _readTargetEntry();
else if (MATCH("RANGETBLREF", 11))
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index b86a3cd..6f75759 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4013,3 +4013,60 @@ string_to_const(const char *str, Oid datatype)
return makeConst(datatype, -1, collation, constlen,
conval, false, false);
}
+
+/*
+ * plan_speculative_use_index
+ * Use the planner to decide speculative insertion arbiter index
+ *
+ * Among indexes on target of INSERT ... ON CONFLICT UPDATE/IGNORE, decide
+ * which index to use to arbitrate taking the alternative path. This should be
+ * called infrequently in practice, because it's unusual for more than one index
+ * to be available that can satisfy a user-specified unique index inference
+ * specification.
+ *
+ * Note: caller had better already hold some type of lock on the table.
+ */
+Oid
+plan_speculative_use_index(PlannerInfo *root, List *indexList)
+{
+ IndexOptInfo *indexInfo;
+ RelOptInfo *rel;
+ IndexPath *cheapest;
+ IndexPath *indexScanPath;
+ ListCell *lc;
+
+ /* Set up RTE/RelOptInfo arrays if needed */
+ if (!root->simple_rel_array)
+ setup_simple_rel_arrays(root);
+
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, root->parse->resultRelation, RELOPT_BASEREL);
+
+ /* Locate cheapest IndexOptInfo for the target index */
+ cheapest = NULL;
+
+ foreach(lc, rel->indexlist)
+ {
+ indexInfo = (IndexOptInfo *) lfirst(lc);
+
+ if (!list_member_oid(indexList, indexInfo->indexoid))
+ continue;
+
+ /* Estimate the cost of index scan */
+ indexScanPath = create_index_path(root, indexInfo,
+ NIL, NIL, NIL, NIL, NIL,
+ ForwardScanDirection, false,
+ NULL, 1.0);
+
+ if (!cheapest || compare_fractional_path_costs(&cheapest->path,
+ &indexScanPath->path,
+ DEFAULT_RANGE_INEQ_SEL) > 0)
+ cheapest = indexScanPath;
+
+ }
+
+ if (cheapest)
+ return cheapest->indexinfo->indexoid;
+
+ return InvalidOid;
+}
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 1258961..263ff5f 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -255,13 +255,17 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
/*
* We don't support pushing join clauses into the quals of a tidscan, but
* it could still have required parameterization due to LATERAL refs in
- * its tlist.
+ * its tlist. To be tidy, we disallow TID scans as the unexecuted scan
+ * node of an ON CONFLICT UPDATE auxiliary query, even though there is no
+ * reason to think that would be harmful; the optimizer should always
+ * prefer a SeqScan or Result node (actually, we assert that it's one of
+ * those two in several places, so accepting TID scans would break those).
*/
required_outer = rel->lateral_relids;
tidquals = TidQualFromRestrictinfo(rel->baserestrictinfo, rel->relid);
- if (tidquals)
+ if (tidquals && root->parse->specClause != SPEC_UPDATE)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
required_outer));
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 655be81..648dbbc 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4811,7 +4811,8 @@ make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam)
+ List *rowMarks, Plan *onConflictPlan, SpecCmd spec,
+ int epqParam)
{
ModifyTable *node = makeNode(ModifyTable);
Plan *plan = &node->plan;
@@ -4860,6 +4861,9 @@ make_modifytable(PlannerInfo *root,
node->resultRelations = resultRelations;
node->resultRelIndex = -1; /* will be set correctly in setrefs.c */
node->plans = subplans;
+ node->spec = spec;
+ node->arbiterIndex = InvalidOid;
+ node->onConflictPlan = onConflictPlan;
node->withCheckOptionLists = withCheckOptionLists;
node->returningLists = returningLists;
node->rowMarks = rowMarks;
@@ -4912,6 +4916,16 @@ make_modifytable(PlannerInfo *root,
}
node->fdwPrivLists = fdw_private_list;
+ /*
+ * If a set of unique index inference expressions was provided (for
+ * INSERT...ON CONFLICT UPDATE/IGNORE), then infer appropriate
+ * unique index (or throw an error if none is available). It's
+ * possible that there will be a costing step in the event of
+ * having to choose between multiple alternatives.
+ */
+ if (root->parse->arbiterExpr)
+ node->arbiterIndex = infer_unique_index(root);
+
return node;
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9cbbcfb..47d11d2 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -612,7 +612,58 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
+
+ if (parse->onConflict)
+ {
+ Query *conflictQry = (Query*) parse->onConflict;
+ ModifyTable *parent = (ModifyTable *) plan;
+
+ /*
+ * An ON CONFLICT UPDATE query is a subquery of its parent
+ * INSERT ModifyTable, but isn't formally a subplan -- it's an
+ * "auxiliary" plan.
+ *
+ * During execution, the auxiliary plan state is used to
+ * execute the UPDATE query in an ad-hoc manner, driven by the
+ * parent. The executor will only ever execute the auxiliary
+ * plan through its parent. onConflictPlan is "auxiliary" to
+ * its parent in the sense that it's strictly encapsulated from
+ * other code (for example, the executor does not separately
+ * track it within estate as a plan that needs to have
+ * execution finished when it appears within a data-modifying
+ * CTE -- only the parent is specifically tracked for that
+ * purpose).
+ *
+ * There is a fundamental nexus between parent and auxiliary
+ * plans that makes a fully unified representation seem
+ * compelling (a "CMD_UPSERT" ModifyTable plan and Query).
+ * That would obviate the need to specially track auxiliary
+ * state across all stages of execution just for this case;
+ * the optimizer would then not have to generate a
+ * fully-formed, independent UPDATE subquery plan (with a
+ * scanstate only useful for EvalPlanQual() re-evaluation).
+ * However, it's convenient to plan each ModifyTable
+ * separately, as doing so maximizes code reuse. The
+ * alternative must be to introduce abstractions that (for
+ * example) allow a single "CMD_UPSERT" ModifyTable to have two
+ * distinct types of targetlist (that will need to be processed
+ * differently during parsing and rewriting anyway). The
+ * auxiliary UPDATE plan is a good trade-off between a
+ * fully-fledged "CMD_UPSERT" representation, and the opposite
+ * extreme of tracking two separate ModifyTable nodes, joined
+ * by a contrived join type, with (for example) odd properties
+ * around tuple visibility not well encapsulated. A contrived
+ * join based design would also necessitate teaching
+ * ModifyTable nodes to support rescan just for the benefit of
+ * ON CONFLICT UPDATE.
+ */
+ parent->onConflictPlan = subquery_planner(glob, conflictQry,
+ root, hasRecursion,
+ 0, NULL);
+ }
}
}
@@ -1056,6 +1107,8 @@ inheritance_planner(PlannerInfo *root)
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
}
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5d865b0..2f88694 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -779,9 +779,34 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
* global list.
*/
splan->resultRelIndex = list_length(root->glob->resultRelations);
- root->glob->resultRelations =
- list_concat(root->glob->resultRelations,
- list_copy(splan->resultRelations));
+
+ if (!splan->onConflictPlan)
+ {
+ /*
+ * Only actually append result relation for non-auxiliary
+ * ModifyTable plans
+ */
+ root->glob->resultRelations =
+ list_concat(root->glob->resultRelations,
+ list_copy(splan->resultRelations));
+ }
+ else
+ {
+ /*
+ * Decrement rtoffset, to compensate for dummy RTE left by
+ * EXCLUDED.* alias. Auxiliary plan will have same
+ * resultRelation from flattened range table as its parent.
+ */
+ rtoffset -= PRS2_OLD_VARNO;
+ splan->onConflictPlan = (Plan *) set_plan_refs(root,
+ (Plan *) splan->onConflictPlan,
+ rtoffset);
+ /*
+ * Set up the visible plan targetlist as being the same as
+ * the parent. Again, this is for the use of EXPLAIN only.
+ */
+ splan->onConflictPlan->targetlist = splan->plan.targetlist;
+ }
}
break;
case T_Append:
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 78fb6b1..f7a0523 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2345,6 +2345,12 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
valid_params,
scan_params));
}
+
+ /*
+ * No need to directly handle onConflictPlan here, since it
+ * cannot have params (due to parse analysis enforced
+ * restrictions prohibiting subqueries).
+ */
}
break;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index fb7db6d..3086ca3 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -31,6 +31,7 @@
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
+#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
@@ -125,10 +126,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/*
* Make list of indexes. Ignore indexes on system catalogs if told to.
- * Don't bother with indexes for an inheritance parent, either.
+ * Don't bother with indexes for an inheritance parent or speculative
+ * insertion UPDATE auxiliary queries, either.
*/
if (inhparent ||
- (IgnoreSystemIndexes && IsSystemRelation(relation)))
+ (IgnoreSystemIndexes && IsSystemRelation(relation)) ||
+ root->parse->specClause == SPEC_UPDATE)
hasindex = false;
else
hasindex = relation->rd_rel->relhasindex;
@@ -394,6 +397,221 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
}
/*
+ * infer_unique_index -
+ * Retrieves unique index to arbitrate speculative insertion.
+ *
+ * Uses user-supplied inference clause expressions and predicate to match a
+ * unique index from those defined and ready on the heap relation (target). An
+ * exact match is required on columns/expressions (although they can appear in
+ * any order). However, the predicate given by the user need only restrict
+ * insertion to a subset of some part of the table covered by some particular
+ * unique index (in particular, a partial unique index) in order to be
+ * inferred.
+ *
+ * The implementation does not consider which B-Tree operator class any
+ * particular available unique index uses. In particular, there is no system
+ * dependency on the default operator class for the purposes of inference.
+ * This should be okay, since by convention non-default opclasses only
+ * introduce alternative sort orders, not alternative notions of equality
+ * (there are only trivial known exceptions to this convention, where the
+ * "equals" operators of a type's opclasses do not match across opclasses;
+ * they exist precisely to discourage user code from using the divergent
+ * opclass). Even if we assume that a type could usefully have multiple
+ * alternative concepts of equality, surely the definition actually implied by
+ * the operator class of actually indexed attributes is pertinent. However,
+ * this is a bit of a wart, because strictly speaking there is leeway for a
+ * query to be interpreted in deference to available unique indexes, and
+ * indexes are traditionally only an implementation detail. It hardly seems
+ * worth it to waste cycles on this corner case, though.
+ *
+ * This logic somewhat mirrors get_relation_info(). This process is not
+ * deferred to a get_relation_info() call while planning because there may not
+ * be any such call. In the ON CONFLICT UPDATE case get_relation_info() will
+ * be called, for auxiliary query planning, but even then indexes won't be
+ * examined since they're not generally interesting to that case (building
+ * index paths is explicitly avoided for auxiliary query planning, in fact).
+ */
+Oid
+infer_unique_index(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+ Relation relation;
+ Oid relationObjectId;
+ Bitmapset *plainAttrs = NULL;
+ List *candidates = NIL;
+ ListCell *l;
+ List *indexList;
+
+ Assert(parse->specClause == SPEC_INSERT ||
+ parse->specClause == SPEC_IGNORE);
+
+ /*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+ relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+ relation = heap_open(relationObjectId, NoLock);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(l, parse->arbiterExpr)
+ {
+ Expr *elem;
+ Var *var;
+ int attno;
+
+ elem = (Expr *) lfirst(l);
+
+ /*
+ * Parse analysis of inference elements performs full parse analysis of
+ * Vars, even for non-expression indexes (in contrast with utility
+ * command related use of IndexElem). However, indexes are cataloged
+ * with simple attribute numbers for non-expression indexes.
+ * Therefore, we must build a compatible bms representation here.
+ */
+ if (!IsA(elem, Var))
+ continue;
+
+ var = (Var*) elem;
+ attno = var->varattno;
+
+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("system columns may not appear in unique index inference specification")));
+ else if (attno == 0)
+ elog(ERROR, "whole row unique index inference specifications are not valid");
+
+ plainAttrs = bms_add_member(plainAttrs, attno);
+ }
+
+ indexList = RelationGetIndexList(relation);
+
+ /*
+ * Using that representation, iterate through the list of indexes on the
+ * target relation to try and find a match
+ */
+ foreach(l, indexList)
+ {
+ Oid indexoid = lfirst_oid(l);
+ Relation idxRel;
+ Form_pg_index idxForm;
+ Bitmapset *indexedPlainAttrs = NULL;
+ List *idxExprs;
+ List *predExprs;
+ List *whereExplicit;
+ AttrNumber natt;
+ ListCell *e;
+
+ /*
+ * Extract info from the relation descriptor for the index. We know
+ * that this is a target, so take the lock type that the executor is
+ * known to ultimately require.
+ *
+ * Let executor complain about !indimmediate case directly.
+ */
+ idxRel = index_open(indexoid, RowExclusiveLock);
+ idxForm = idxRel->rd_index;
+
+ if (!idxForm->indisunique ||
+ !IndexIsValid(idxForm))
+ goto next;
+
+ /*
+ * If the index is valid, but cannot yet be used, ignore it. See
+ * src/backend/access/heap/README.HOT for discussion.
+ */
+ if (idxForm->indcheckxmin &&
+ !TransactionIdPrecedes(HeapTupleHeaderGetXmin(idxRel->rd_indextuple->t_data),
+ TransactionXmin))
+ goto next;
+
+ /* Check in detail if the clause attributes/expressions match */
+ for (natt = 0; natt < idxForm->indnatts; natt++)
+ {
+ int attno = idxRel->rd_index->indkey.values[natt];
+
+ if (attno < 0)
+ elog(ERROR, "system column in index");
+
+ if (attno != 0)
+ indexedPlainAttrs = bms_add_member(indexedPlainAttrs, attno);
+ }
+
+ /*
+ * Since expressions were made unique during parse analysis, it's
+ * evident that we cannot proceed with this index if the number of
+ * attributes (plain or expression) does not match exactly. This
+ * precludes support for unique indexes created with redundantly
+ * referenced columns (which are not forbidden by CREATE INDEX), but
+ * this seems inconsequential.
+ */
+ if (list_length(parse->arbiterExpr) != idxForm->indnatts)
+ goto next;
+
+ idxExprs = RelationGetIndexExpressions(idxRel);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(e, parse->arbiterExpr)
+ {
+ Expr *elem = (Expr *) lfirst(e);
+
+ /* Plain Vars were already separately accounted for */
+ if (IsA(elem, Var))
+ continue;
+
+ if (!list_member(idxExprs, elem))
+ goto next;
+ }
+
+ /* Non-expression attributes (if any) must match */
+ if (!bms_equal(indexedPlainAttrs, plainAttrs))
+ goto next;
+
+ /*
+ * The cataloged index definition's predicate need only be implied by any
+ * user-supplied ON CONFLICT unique index inference WHERE clause
+ */
+ predExprs = RelationGetIndexPredicate(idxRel);
+ whereExplicit = make_ands_implicit((Expr *) parse->arbiterWhere);
+
+ if (!predicate_implied_by(predExprs, whereExplicit))
+ goto next;
+
+ candidates = lappend_oid(candidates, idxForm->indexrelid);
+next:
+ index_close(idxRel, NoLock);
+ }
+
+ list_free(indexList);
+ heap_close(relation, NoLock);
+
+ /*
+ * In the common case where there is only a single candidate unique index,
+ * there is clearly no point in building index paths to determine which is
+ * cheapest.
+ */
+ if (list_length(candidates) == 1)
+ return linitial_oid(candidates);
+ else if (candidates == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT")));
+ else
+ /* Otherwise, deduce the least expensive unique index */
+ return plan_speculative_use_index(root, candidates);
+
+ return InvalidOid; /* keep compiler quiet */
+}
+
+/*
* estimate_rel_size - estimate # pages and # tuples in a table or index
*
* We also estimate the fraction of the pages that are marked all-visible in
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index df89065..6c194f9 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -387,6 +387,8 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -408,6 +410,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
{
Query *qry = makeNode(Query);
SelectStmt *selectStmt = (SelectStmt *) stmt->selectStmt;
+ SpecCmd spec = stmt->confClause ? stmt->confClause->specclause : SPEC_NONE;
List *exprList = NIL;
bool isGeneralSelect;
List *sub_rtable;
@@ -425,6 +428,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
qry->commandType = CMD_INSERT;
pstate->p_is_insert = true;
+ pstate->p_is_speculative = spec != SPEC_NONE;
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -472,11 +476,16 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
sub_namespace = NIL;
}
+ /* INSERT with an ON CONFLICT clause forces the "target" alias */
+ if (pstate->p_is_speculative)
+ stmt->relation->alias = makeAlias("target", NIL);
+
/*
* Must get write lock on INSERT target table before scanning SELECT, else
* we will grab the wrong kind of initial lock if the target table is also
* mentioned in the SELECT part. Note that the target table is not added
- * to the joinlist or namespace.
+ * to the joinlist or namespace. Note also that additional requiredPerms
+ * may be added to the target RTE iff there is an auxiliary UPDATE.
*/
qry->resultRelation = setTargetTable(pstate, stmt->relation,
false, false, ACL_INSERT);
@@ -741,12 +750,13 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
}
/*
- * If we have a RETURNING clause, we need to add the target relation to
- * the query namespace before processing it, so that Var references in
- * RETURNING will work. Also, remove any namespace entries added in a
- * sub-SELECT or VALUES list.
+ * If we have a RETURNING clause, or there are attributes used as the
+ * condition on which to take an alternative ON CONFLICT path, we need to
+ * add the target relation to the query namespace before processing it, so
+ * that Var references in RETURNING/the alternative path key will work.
+ * Also, remove any namespace entries added in a sub-SELECT or VALUES list.
*/
- if (stmt->returningList)
+ if (stmt->returningList || stmt->confClause)
{
pstate->p_namespace = NIL;
addRTEtoQuery(pstate, pstate->p_target_rangetblentry,
@@ -758,8 +768,49 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
-
+ qry->specClause = spec;
qry->hasSubLinks = pstate->p_hasSubLinks;
+ qry->onConflict = NULL;
+
+ if (stmt->confClause)
+ {
+ /*
+ * ON CONFLICT UPDATE requires special parse analysis of auxiliary
+ * update Query
+ */
+ if (stmt->confClause->updatequery)
+ {
+ ParseState *sub_pstate = make_parsestate(pstate);
+ Query *uqry;
+
+ /*
+ * The optimizer is not prepared to accept a subquery RTE for a
+ * non-CMD_SELECT Query. The CMD_UPDATE Query is tracked as
+ * special auxiliary state, while there is more or less analogous
+ * auxiliary state tracked in later stages of query execution.
+ *
+ * Only the parent's canSetTag is ever actually consulted, so there is
+ * no need to set that here.
+ */
+ uqry = transformStmt(sub_pstate, stmt->confClause->updatequery);
+ Assert(uqry->commandType == CMD_UPDATE &&
+ uqry->specClause == SPEC_UPDATE);
+
+ /* Save auxiliary query */
+ qry->onConflict = (Node *) uqry;
+
+ free_parsestate(sub_pstate);
+ }
+
+ /*
+ * Transform the inference specification (columns/expressions and WHERE
+ * clause). The planner later uses this to infer a unique index, which
+ * arbitrates whether to take the alternative ON CONFLICT path (i.e. whether
+ * to INSERT or UPDATE/IGNORE in respect of each slot proposed for insertion).
+ */
+ transformConflictClause(pstate, stmt->confClause, &qry->arbiterExpr,
+ &qry->arbiterWhere);
+ }
assign_query_collations(pstate, qry);
@@ -1006,6 +1057,8 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -1903,10 +1956,14 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
Node *qual;
ListCell *origTargetList;
ListCell *tl;
+ bool InhOption;
qry->commandType = CMD_UPDATE;
pstate->p_is_update = true;
+ /* for auxiliary UPDATEs, visit parent INSERT to set target table */
+ pstate->p_is_speculative = (stmt->relation == NULL);
+
/* process the WITH clause independently of all else */
if (stmt->withClause)
{
@@ -1915,8 +1972,20 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
}
+ if (!pstate->p_is_speculative)
+ {
+ InhOption = interpretInhOption(stmt->relation->inhOpt);
+ qry->specClause = SPEC_NONE;
+ }
+ else
+ {
+ /* auxiliary UPDATE does not accept ONLY */
+ InhOption = false;
+ qry->specClause = SPEC_UPDATE;
+ }
+
qry->resultRelation = setTargetTable(pstate, stmt->relation,
- interpretInhOption(stmt->relation->inhOpt),
+ InhOption,
true,
ACL_UPDATE);
@@ -1947,6 +2016,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 36dac29..f987432 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -215,6 +215,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RangeVar *range;
IntoClause *into;
WithClause *with;
+ InferClause *infer;
+ ConflictClause *conf;
A_Indices *aind;
ResTarget *target;
struct PrivTarget *privtarget;
@@ -415,6 +417,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <defelt> SeqOptElem
%type <istmt> insert_rest
+%type <infer> opt_conf_expr
+%type <conf> opt_on_conflict
%type <vsetstmt> generic_set set_rest set_rest_more generic_reset reset_rest
SetResetClause FunctionSetResetClause
@@ -513,6 +517,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> cte_list
%type <list> within_group_clause
+%type <node> UpdateInsertStmt
%type <node> filter_clause
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
@@ -551,8 +556,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
CACHE CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLLATION COLUMN COMMENT COMMENTS COMMIT
- COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
- CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
+ COMMITTED CONCURRENTLY CONFIGURATION CONFLICT CONNECTION CONSTRAINT
+ CONSTRAINTS CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
CROSS CSV CURRENT_P
CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
@@ -572,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+ IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -652,6 +657,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
+%nonassoc DISTINCT
+%nonassoc ON
%left POSTFIXOP /* dummy for postfix Op rules */
/*
* To support target_el without AS, we must give IDENT an explicit priority
@@ -9399,10 +9406,12 @@ DeallocateStmt: DEALLOCATE name
*****************************************************************************/
InsertStmt:
- opt_with_clause INSERT INTO qualified_name insert_rest returning_clause
+ opt_with_clause INSERT INTO qualified_name insert_rest
+ opt_on_conflict returning_clause
{
$5->relation = $4;
- $5->returningList = $6;
+ $5->confClause = $6;
+ $5->returningList = $7;
$5->withClause = $1;
$$ = (Node *) $5;
}
@@ -9447,6 +9456,44 @@ insert_column_item:
}
;
+opt_on_conflict:
+ ON CONFLICT opt_conf_expr UpdateInsertStmt
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_INSERT;
+ $$->infer = $3;
+ $$->updatequery = $4;
+ $$->location = @1;
+ }
+ |
+ ON CONFLICT opt_conf_expr IGNORE_P
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_IGNORE;
+ $$->infer = $3;
+ $$->updatequery = NULL;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
+opt_conf_expr:
+ '(' index_params where_clause ')'
+ {
+ $$ = makeNode(InferClause);
+ $$->indexElems = $2;
+ $$->whereClause = $3;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
returning_clause:
RETURNING target_list { $$ = $2; }
| /* EMPTY */ { $$ = NIL; }
@@ -9546,6 +9593,22 @@ UpdateStmt: opt_with_clause UPDATE relation_expr_opt_alias
}
;
+UpdateInsertStmt: UPDATE
+ SET set_clause_list
+ where_clause
+ {
+ UpdateStmt *n = makeNode(UpdateStmt);
+ /* NULL relation conveys auxiliary */
+ n->relation = NULL;
+ n->targetList = $3;
+ n->fromClause = NULL;
+ n->whereClause = $4;
+ n->returningList = NULL;
+ n->withClause = NULL;
+ $$ = (Node *)n;
+ }
+ ;
+
set_clause_list:
set_clause { $$ = $1; }
| set_clause_list ',' set_clause { $$ = list_concat($1,$3); }
@@ -13188,6 +13251,7 @@ unreserved_keyword:
| COMMIT
| COMMITTED
| CONFIGURATION
+ | CONFLICT
| CONNECTION
| CONSTRAINTS
| CONTENT_P
@@ -13247,6 +13311,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE_P
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 654dce6..0bc45d5 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -75,6 +75,8 @@ static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node,
List **tlist, ParseExprKind exprKind);
static int get_matching_location(int sortgroupref,
List *sortgrouprefs, List *exprs);
+static List *resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel);
static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle,
List *grouplist, List *targetlist, int location,
bool resolveUnknown);
@@ -145,7 +147,9 @@ transformFromClause(ParseState *pstate, List *frmList)
* We also open the target relation and acquire a write lock on it.
* This must be done before processing the FROM list, in case the target
* is also mentioned as a source relation --- we want to be sure to grab
- * the write lock before any read lock.
+ * the write lock before any read lock. Note that when called during
+ * the parse analysis of an auxiliary UPDATE query, relation may be
+ * NULL, and the details are acquired from the parent.
*
* If alsoSource is true, add the target to the query's joinlist and
* namespace. For INSERT, we don't want the target to be joined to;
@@ -172,19 +176,79 @@ setTargetTable(ParseState *pstate, RangeVar *relation,
/*
* Open target rel and grab suitable lock (which we will hold till end of
- * transaction).
+ * transaction), iff this is not an auxiliary ON CONFLICT UPDATE.
*
- * free_parsestate() will eventually do the corresponding heap_close(),
- * but *not* release the lock.
+ * free_parsestate() will eventually do the corresponding heap_close(), but
+ * *not* release the lock (again, iff this is not an auxiliary ON CONFLICT
+ * UPDATE).
*/
- pstate->p_target_relation = parserOpenTable(pstate, relation,
- RowExclusiveLock);
+ if (!pstate->p_is_speculative || pstate->p_is_insert)
+ {
+ pstate->p_target_relation = parserOpenTable(pstate, relation,
+ RowExclusiveLock);
+
+ /*
+ * Now build an RTE.
+ */
+ rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
+ relation->alias, inh, false);
+
+ /*
+ * Override addRangeTableEntry's default ACL_SELECT permissions
+ * check, and instead mark target table as requiring exactly the
+ * specified permissions.
+ *
+ * If we find an explicit reference to the rel later during parse
+ * analysis, we will add the ACL_SELECT bit back again; see
+ * markVarForSelectPriv and its callers.
+ */
+ rte->requiredPerms = requiredPerms;
+ }
+ else
+ {
+ RangeTblEntry *exclRte;
+
+ /* auxiliary UPDATE (of ON CONFLICT UPDATE) */
+ Assert(pstate->p_is_update);
+ /* target shared with parent */
+ pstate->p_target_relation =
+ pstate->parentParseState->p_target_relation;
+ rte = pstate->parentParseState->p_target_rangetblentry;
+
+ /*
+ * When called for auxiliary UPDATE, same target RTE is processed here
+ * for a second time. Just append requiredPerms. There is no need to
+ * override addRangeTableEntry's default ACL_SELECT permissions check
+ * now.
+ */
+ rte->requiredPerms |= requiredPerms;
+
+ /*
+ * Build EXCLUDED alias for target relation. This can be used to
+ * reference the tuple originally proposed for insertion from within
+ * the ON CONFLICT UPDATE auxiliary query. This is not visible in the
+ * parent INSERT.
+ *
+ * NOTE: 'EXCLUDED' will always have a varno equal to 1 (at least until
+ * rewriting, where the RTE is effectively discarded -- its Vars are
+ * replaced with a special-purpose primnode, ExcludedExpr).
+ */
+ exclRte = addRangeTableEntryForRelation(pstate,
+ pstate->p_target_relation,
+ makeAlias("excluded", NIL),
+ false, false);
+
+ /*
+ * Add EXCLUDED RTE to namespace. It does not matter that the RTE is
+ * not added to the Query joinlist, since its Vars are merely
+ * placeholders for ExcludedExpr.
+ */
+ addRTEtoQuery(pstate, exclRte, false, true, true);
+
+ /* Append parent/our target to Query rtable (should be last) */
+ pstate->p_rtable = lappend(pstate->p_rtable, rte);
+ }
- /*
- * Now build an RTE.
- */
- rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
- relation->alias, inh, false);
pstate->p_target_rangetblentry = rte;
/* assume new rte is at end */
@@ -192,17 +256,6 @@ setTargetTable(ParseState *pstate, RangeVar *relation,
Assert(rte == rt_fetch(rtindex, pstate->p_rtable));
/*
- * Override addRangeTableEntry's default ACL_SELECT permissions check, and
- * instead mark target table as requiring exactly the specified
- * permissions.
- *
- * If we find an explicit reference to the rel later during parse
- * analysis, we will add the ACL_SELECT bit back again; see
- * markVarForSelectPriv and its callers.
- */
- rte->requiredPerms = requiredPerms;
-
- /*
* If UPDATE/DELETE, add table to joinlist and namespace.
*
* Note: some callers know that they can find the new ParseNamespaceItem
@@ -2166,6 +2219,167 @@ get_matching_location(int sortgroupref, List *sortgrouprefs, List *exprs)
}
/*
+ * resolve_unique_index_expr
+ * Infer a unique index from a list of indexElems, for ON
+ * CONFLICT UPDATE/IGNORE
+ *
+ * Perform parse analysis of expressions and columns appearing within ON
+ * CONFLICT clause. During planning, the returned list of expressions is used
+ * to infer which unique index to use.
+ */
+static List *
+resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel)
+{
+ List *clauseexprs = NIL;
+ ListCell *l;
+
+ if (heapRel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" is not an ordinary table",
+ RelationGetRelationName(heapRel)),
+ errhint("Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ if (heapRel->rd_rel->relhassubclass)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" has inheritance children",
+ RelationGetRelationName(heapRel)),
+ errhint("Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ foreach(l, infer->indexElems)
+ {
+ IndexElem *ielem = (IndexElem *) lfirst(l);
+ Node *trans;
+
+ /*
+ * Raw grammar re-uses CREATE INDEX infrastructure for unique index
+ * inference clause, and so will accept opclasses by name and so on.
+ * Reject these here explicitly.
+ */
+ if (ielem->ordering != SORTBY_DEFAULT ||
+ ielem->nulls_ordering != SORTBY_NULLS_DEFAULT)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT does not accept ordering or NULLS FIRST/LAST specifications"),
+ errhint("These factors do not affect uniqueness of indexed datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->collation != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT collation specification is unnecessary"),
+ errhint("Collations do not affect uniqueness of collatable datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->opclass != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ON CONFLICT cannot accept non-default operator class specifications"),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (!ielem->expr)
+ {
+ /* Simple index attribute */
+ ColumnRef *n;
+
+ /*
+ * Grammar won't have built raw expression for us in event of plain
+ * column reference. Create one directly, and perform expression
+ * transformation, which seems better principled than simply
+ * propagating catalog-style simple attribute numbers. For
+ * example, it means the Var is marked for SELECT privileges, which
+ * speculative insertion requires. Planner expects this, and
+ * performs its own normalization for the purposes of matching
+ * against pg_index.
+ */
+ n = makeNode(ColumnRef);
+ n->fields = list_make1(makeString(ielem->name));
+ /* Location is approximately that of inference specification */
+ n->location = infer->location;
+ trans = (Node *) n;
+ }
+ else
+ {
+ /* Do parse transformation of the raw expression */
+ trans = (Node *) ielem->expr;
+ }
+
+ /*
+ * transformExpr() should have already rejected subqueries,
+ * aggregates, and window functions, based on the EXPR_KIND_ for an
+ * index expression. Expressions returning sets won't have been
+ * rejected, but don't bother doing so here; there should be no
+ * available expression unique index to match any such expression
+ * against anyway.
+ */
+ trans = transformExpr(pstate, trans, EXPR_KIND_INDEX_EXPRESSION);
+ /* Save in list of transformed expressions */
+ clauseexprs = list_append_unique(clauseexprs, trans);
+ }
+
+ return clauseexprs;
+}
+
+/*
+ * transformConflictClause -
+ * transform expressions of ON CONFLICT UPDATE/IGNORE.
+ *
+ * The transformed expressions are used to infer one unique index relation to
+ * serve as an ON CONFLICT arbiter. Partial unique indexes may be inferred
+ * using the WHERE clause of the inference specification.
+ */
+void
+transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere)
+{
+ InferClause *infer = confClause->infer;
+
+ if (confClause->specclause == SPEC_INSERT && !infer)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from"),
+ parser_errposition(pstate,
+ exprLocation((Node *) confClause))));
+
+ /* Raw grammar must ensure this invariant holds */
+ Assert(confClause->specclause != SPEC_INSERT ||
+ confClause->updatequery != NULL);
+
+ /*
+ * If there is no inference clause, the target might be an updatable view.
+ * Those are supported by ON CONFLICT IGNORE (without columns/expressions
+ * specified to infer a unique index from -- this is mandatory for the
+ * UPDATE variant). It might also be a relation with inheritance children,
+ * for which inference would fail.
+ */
+ if (infer)
+ {
+ *arbiterExpr = resolve_unique_index_expr(pstate, infer,
+ pstate->p_target_relation);
+
+ /* Handling inference WHERE clause (for partial unique index inference) */
+ if (infer->whereClause)
+ *arbiterWhere = transformExpr(pstate, infer->whereClause,
+ EXPR_KIND_INDEX_PREDICATE);
+ }
+
+ /*
+ * It's convenient to form a list of expressions based on the
+ * representation used by CREATE INDEX, since the same restrictions are
+ * appropriate (on subqueries and so on). However, from here on, the
+ * handling of those expressions is identical to ordinary optimizable
+ * statements. In particular, assign_query_collations() can be trusted to
+ * do the right thing with the post parse analysis query tree inference
+ * clause representation.
+ */
+}
+
+/*
* addTargetToSortList
* If the given targetlist entry isn't already in the SortGroupClause
* list, add it to the end of the list, using the given sort ordering
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f0f0488..d1583c7 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1497,7 +1497,8 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
/*
* Check to see if the sublink is in an invalid place within the query. We
* allow sublinks everywhere in SELECT/INSERT/UPDATE/DELETE, but generally
- * not in utility statements.
+ * not in utility statements. They're also disallowed within auxiliary ON
+ * CONFLICT UPDATE commands, which we check for here.
*/
err = NULL;
switch (pstate->p_expr_kind)
@@ -1564,6 +1565,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update)
+ err = _("cannot use subquery in ON CONFLICT UPDATE");
if (err)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 4130cbf..9a94fa4 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -84,7 +84,13 @@ free_parsestate(ParseState *pstate)
errmsg("target lists can have at most %d entries",
MaxTupleAttributeNumber)));
- if (pstate->p_target_relation != NULL)
+ /*
+ * Don't close target relation for auxiliary ON CONFLICT UPDATE, since it
+ * is managed by parent INSERT directly
+ */
+ if (pstate->p_target_relation != NULL &&
+ (!pstate->p_is_speculative ||
+ pstate->p_is_insert))
heap_close(pstate->p_target_relation, NoLock);
pfree(pstate);
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index fab2948..f37760b 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -43,6 +43,12 @@ typedef struct acquireLocksOnSubLinks_context
bool for_execute; /* AcquireRewriteLocks' forExecute param */
} acquireLocksOnSubLinks_context;
+typedef struct excluded_replace_context
+{
+ int varno; /* varno of EXCLUDED.* Vars */
+ int rvarno; /* replace varno */
+} excluded_replace_context;
+
static bool acquireLocksOnSubLinks(Node *node,
acquireLocksOnSubLinks_context *context);
static Query *rewriteRuleAction(Query *parsetree,
@@ -66,11 +72,15 @@ static void markQueryForLocking(Query *qry, Node *jtnode,
LockClauseStrength strength, LockWaitPolicy waitPolicy,
bool pushedDown);
static List *matchLocks(CmdType event, RuleLock *rulelocks,
- int varno, Query *parsetree);
+ int varno, Query *parsetree, bool *hasUpdate);
static Query *fireRIRrules(Query *parsetree, List *activeRIRs,
bool forUpdatePushedDown);
static bool view_has_instead_trigger(Relation view, CmdType event);
static Bitmapset *adjust_view_column_set(Bitmapset *cols, List *targetlist);
+static Node *excluded_replace_vars(Node *expr,
+ excluded_replace_context *context);
+static Node *excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context);
/*
@@ -1288,7 +1298,8 @@ static List *
matchLocks(CmdType event,
RuleLock *rulelocks,
int varno,
- Query *parsetree)
+ Query *parsetree,
+ bool *hasUpdate)
{
List *matching_locks = NIL;
int nlocks;
@@ -1309,6 +1320,9 @@ matchLocks(CmdType event,
{
RewriteRule *oneLock = rulelocks->rules[i];
+ if (oneLock->event == CMD_UPDATE)
+ *hasUpdate = true;
+
/*
* Suppress ON INSERT/UPDATE/DELETE rules that are disabled or
* configured to not fire during the current sessions replication
@@ -2961,6 +2975,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
CmdType event = parsetree->commandType;
bool instead = false;
bool returning = false;
+ bool updatableview = false;
Query *qual_product = NULL;
List *rewritten = NIL;
ListCell *lc1;
@@ -3043,6 +3058,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
Relation rt_entry_relation;
List *locks;
List *product_queries;
+ bool hasUpdate = false;
result_relation = parsetree->resultRelation;
Assert(result_relation != 0);
@@ -3094,6 +3110,49 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
/* Process just the main targetlist */
rewriteTargetListIU(parsetree, rt_entry_relation, NULL);
}
+
+ if (parsetree->specClause == SPEC_INSERT)
+ {
+ Query *qry;
+ excluded_replace_context context;
+
+ /*
+ * While user-defined rules will never be applied in the
+ * auxiliary update query, normalization of tlist is still
+ * required
+ */
+ qry = (Query *) parsetree->onConflict;
+ rewriteTargetListIU(qry, rt_entry_relation, NULL);
+
+ /*
+ * Replace OLD Vars (associated with the EXCLUDED.* alias) with
+ * first (and only) "real" relation RTE in rtable. This allows
+ * the implementation to treat EXCLUDED.* as an alias for the
+ * target relation, which is useful during parse analysis,
+ * while ultimately having those references rewritten as
+ * special ExcludedExpr references to the corresponding Var in
+ * the target RTE.
+ *
+ * This is necessary because while we want a join-like syntax
+ * for aesthetic reasons, the resemblance is superficial. In
+ * fact, execution of the ModifyTable node (and its direct
+ * child auxiliary query) manages tupleslot state directly, and
+ * is directly tasked with making available the appropriate
+ * tupleslot to the expression context.
+ *
+ * This is a kludge, but appears necessary, since the slot made
+ * available for referencing via ExcludedExpr is in fact the
+ * slot just excluded from insertion by speculative insertion
+ * (with the effects of BEFORE ROW INSERT triggers carried).
+ * An ad-hoc method for making the excluded tuple available
+ * within the auxiliary expression context is appropriate.
+ */
+ context.varno = PRS2_OLD_VARNO;
+ context.rvarno = PRS2_OLD_VARNO + 1;
+
+ parsetree->onConflict =
+ excluded_replace_vars(parsetree->onConflict, &context);
+ }
}
else if (event == CMD_UPDATE)
{
@@ -3111,7 +3170,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
* Collect and apply the appropriate rules.
*/
locks = matchLocks(event, rt_entry_relation->rd_rules,
- result_relation, parsetree);
+ result_relation, parsetree, &hasUpdate);
product_queries = fireRules(parsetree,
result_relation,
@@ -3160,6 +3219,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
instead = true;
returning = true;
+ updatableview = true;
}
/*
@@ -3240,6 +3300,18 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
}
}
+ /*
+ * Updatable views are supported on a limited basis by ON CONFLICT
+ * IGNORE (if there is no unique index inference required, speculative
+ * insertion proceeds).
+ */
+ if (parsetree->specClause != SPEC_NONE &&
+ (product_queries != NIL || hasUpdate) &&
+ !updatableview)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules")));
+
heap_close(rt_entry_relation, NoLock);
}
@@ -3402,3 +3474,52 @@ QueryRewrite(Query *parsetree)
return results;
}
+
+/*
+ * Apply pullup variable replacement throughout an expression tree
+ *
+ * Returns modified tree, with varno Vars renumbered to rvarno and wrapped.
+ */
+static Node *
+excluded_replace_vars(Node *expr, excluded_replace_context *context)
+{
+ /*
+ * Don't recurse into subqueries; they're forbidden in auxiliary ON
+ * CONFLICT query
+ */
+ return replace_rte_variables(expr,
+ context->varno, 0,
+ excluded_replace_vars_callback,
+ (void *) context,
+ NULL);
+}
+
+static Node *
+excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context)
+{
+ excluded_replace_context *rcon = (excluded_replace_context *) context->callback_arg;
+ ExcludedExpr *n = makeNode(ExcludedExpr);
+
+ /* Replace with an enclosing ExcludedExpr */
+ var->varno = rcon->rvarno;
+ n->arg = (Node *) var;
+
+ /*
+ * Would have to adjust varlevelsup if referenced item is from higher query
+ * (should not happen)
+ */
+ Assert(var->varlevelsup == 0);
+
+ if (var->varattno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference system column using EXCLUDED.* alias")));
+
+ if (var->varattno == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference whole-row using EXCLUDED.* alias")));
+
+ return (Node *) n;
+}
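
Reviewer note on the EXCLUDED.* rewrite above. What follows is only a toy sketch of the callback's rule, not code from the patch: a reference to an ordinary column is renumbered to the replacement varno and wrapped, while system columns (negative attribute number) and whole-row references (attribute number 0) are rejected. ToyVar, ToyExcluded, WrapStatus and toy_wrap_excluded are hypothetical names invented for illustration. Unlike the patch, the sketch validates before it mutates the Var, and it returns a status where the patch raises an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the parser's Var and the patch's ExcludedExpr. */
typedef struct ToyVar
{
    int varno;      /* range table index the Var refers to */
    int varattno;   /* > 0 user column, 0 whole row, < 0 system column */
} ToyVar;

typedef struct ToyExcluded
{
    ToyVar *arg;    /* the wrapped, renumbered Var */
} ToyExcluded;

typedef enum
{
    WRAP_OK,
    WRAP_SYSTEM_COLUMN,
    WRAP_WHOLE_ROW
} WrapStatus;

/*
 * Wrap one EXCLUDED.* reference. On success the Var is renumbered to
 * rvarno and stored in *out; on failure the Var is left untouched.
 */
WrapStatus
toy_wrap_excluded(ToyVar *var, int rvarno, ToyExcluded *out)
{
    assert(var != NULL && out != NULL);

    if (var->varattno < 0)
        return WRAP_SYSTEM_COLUMN;  /* e.g. a ctid-like system column */
    if (var->varattno == 0)
        return WRAP_WHOLE_ROW;      /* a whole-row reference */

    var->varno = rvarno;
    out->arg = var;
    return WRAP_OK;
}
```

A caller would pass the Var found under the EXCLUDED alias together with the target relation's range table index, and raise the corresponding error for either non-OK status.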
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index a1ebc72..a1c5bcb 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -421,6 +421,13 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
latestXid))
ShmemVariableCache->latestCompletedXid = latestXid;
+ /* Also clear any speculative insertion information */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+
LWLockRelease(ProcArrayLock);
}
else
@@ -438,6 +445,11 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
pgxact->delayChkpt = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
Assert(pgxact->nxids == 0);
Assert(pgxact->overflowed == false);
@@ -476,6 +488,13 @@ ProcArrayClearTransaction(PGPROC *proc)
/* Clear the subtransaction-XID cache too */
pgxact->nxids = 0;
pgxact->overflowed = false;
+
+ /* these should be clear, but just in case.. */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
}
/*
@@ -1110,6 +1129,96 @@ TransactionIdIsActive(TransactionId xid)
/*
+ * SetSpeculativeInsertionToken -- Set speculative token
+ *
+ * The backend local counter value is set, to allow waiters to differentiate
+ * individual speculative insertions.
+ */
+void
+SetSpeculativeInsertionToken(uint32 token)
+{
+ MyProc->specInsertToken = token;
+}
+
+/*
+ * SetSpeculativeInsertionTid -- Set relfilenode and TID of speculative tuple
+ */
+void
+SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel = relnode;
+ ItemPointerCopy(tid, &MyProc->specInsertTid);
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * ClearSpeculativeInsertionState -- Clear token and TID for ourselves
+ */
+void
+ClearSpeculativeInsertionState(void)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * SpeculativeInsertionIsInProgress -- Return the token of xid's in-progress
+ * speculative insertion of the given tuple, or 0 if there is none
+ */
+uint32
+SpeculativeInsertionIsInProgress(TransactionId xid, RelFileNode rel,
+ ItemPointer tid)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ uint32 result = 0;
+
+ if (TransactionIdPrecedes(xid, RecentXmin))
+ return result;
+
+ /*
+ * Get the top transaction id.
+ *
+ * XXX We could search the proc array first, like
+ * TransactionIdIsInProgress() does, but this isn't performance-critical.
+ */
+ xid = SubTransGetTopmostTransaction(xid);
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+
+ if (pgxact->xid == xid)
+ {
+ /*
+ * Found the backend. Is it doing a speculative insertion of the
+ * given tuple?
+ */
+ if (RelFileNodeEquals(proc->specInsertRel, rel) &&
+ ItemPointerEquals(tid, &proc->specInsertTid))
+ result = proc->specInsertToken;
+
+ break;
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+
+ return result;
+}
+
+
+/*
* GetOldestXmin -- returns oldest transaction that was running
* when any current transaction was started.
*
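
Reviewer note on the procarray.c changes above. The lookup in SpeculativeInsertionIsInProgress() can be pictured with the simplified model below. It is a sketch under stated assumptions, not the patch's code: ToyProc and toy_speculative_token are hypothetical names, the relation is collapsed to a single integer instead of a RelFileNode, and locking and the RecentXmin and subtransaction steps are left out. The point it illustrates is that a token is reported only when the backend that owns the xid is speculatively inserting exactly the given tuple, and that 0 means "nothing to wait for".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-backend fields the patch adds. */
typedef struct ToyProc
{
    uint32_t xid;     /* top-level transaction ID of the backend */
    uint32_t rel;     /* simplified relation identifier */
    uint32_t block;   /* block number of the speculative tuple's TID */
    uint16_t offset;  /* offset number of the speculative tuple's TID */
    uint32_t token;   /* 0 means no speculative insertion in progress */
} ToyProc;

/*
 * Return the speculative insertion token of the backend running xid if it
 * is inserting the tuple at (rel, block, offset), else 0.
 */
uint32_t
toy_speculative_token(const ToyProc *procs, size_t nprocs,
                      uint32_t xid, uint32_t rel,
                      uint32_t block, uint16_t offset)
{
    uint32_t result = 0;
    size_t   i;

    assert(procs != NULL || nprocs == 0);

    for (i = 0; i < nprocs; i++)
    {
        if (procs[i].xid != xid)
            continue;

        /* Found the backend; report its token only for this exact tuple. */
        if (procs[i].rel == rel &&
            procs[i].block == block &&
            procs[i].offset == offset)
            result = procs[i].token;
        break;
    }

    return result;
}
```

A waiter that gets a nonzero token back would then block on the matching speculative insertion lock; a zero result tells it to fall back to an ordinary wait on the transaction.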
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index d13a167..7a1df22 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -575,6 +575,69 @@ ConditionalXactLockTableWait(TransactionId xid)
return true;
}
+static uint32 speculativeInsertionToken = 0;
+
+/*
+ * SpeculativeInsertionLockAcquire
+ *
+ * Insert a lock showing that the given transaction ID is inserting a tuple,
+ * but hasn't yet decided whether it's going to keep it. The lock can then be
+ * used to wait for the decision to go ahead with the insertion, or to abort
+ * it.
+ *
+ * The token is used to distinguish multiple insertions by the same
+ * transaction. A counter will do, for example.
+ */
+void
+SpeculativeInsertionLockAcquire(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ speculativeInsertionToken++;
+ SetSpeculativeInsertionToken(speculativeInsertionToken);
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ (void) LockAcquire(&tag, ExclusiveLock, false, false);
+}
+
+/*
+ * SpeculativeInsertionLockRelease
+ *
+ * Delete the lock showing that the given transaction is speculatively
+ * inserting a tuple.
+ */
+void
+SpeculativeInsertionLockRelease(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ LockRelease(&tag, ExclusiveLock, false);
+}
+
+/*
+ * SpeculativeInsertionWait
+ *
+ * Wait for the specified transaction to finish or abort the insertion of a
+ * tuple.
+ */
+void
+SpeculativeInsertionWait(TransactionId xid, uint32 token)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, token);
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(token != 0);
+
+ (void) LockAcquire(&tag, ShareLock, false, false);
+ LockRelease(&tag, ShareLock, false);
+}
+
+
/*
* XactLockTableWaitErrorContextCb
* Error context callback for transaction lock waits.
@@ -873,6 +936,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
tag->locktag_field1,
tag->locktag_field2);
break;
+ case LOCKTAG_PROMISE_TUPLE_INSERTION:
+ appendStringInfo(buf,
+ _("tuple insertion by transaction %u"),
+ tag->locktag_field1);
+ break;
case LOCKTAG_OBJECT:
appendStringInfo(buf,
_("object %u of class %u of database %u"),
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 9c14e8a..41c4191 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -189,7 +189,8 @@ ProcessQuery(PlannedStmt *plan,
*/
if (completionTag)
{
- Oid lastOid;
+ Oid lastOid;
+ ModifyTableState *pstate;
switch (queryDesc->operation)
{
@@ -198,12 +199,16 @@ ProcessQuery(PlannedStmt *plan,
"SELECT %u", queryDesc->estate->es_processed);
break;
case CMD_INSERT:
+ pstate = (ModifyTableState *) queryDesc->planstate;
+ Assert(IsA(pstate, ModifyTableState));
+
if (queryDesc->estate->es_processed == 1)
lastOid = queryDesc->estate->es_lastoid;
else
lastOid = InvalidOid;
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
- "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
+ "%s %u %u", pstate->spec == SPEC_INSERT ? "UPSERT" : "INSERT",
+ lastOid, queryDesc->estate->es_processed);
break;
case CMD_UPDATE:
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
@@ -1356,7 +1361,10 @@ PortalRunMulti(Portal portal, bool isTopLevel,
* 0" here because technically there is no query of the matching tag type,
* and printing a non-zero count for a different query type seems wrong,
* e.g. an INSERT that does an UPDATE instead should not print "0 1" if
- * one row was updated. See QueryRewrite(), step 3, for details.
+ * one row was updated (unless the ON CONFLICT UPDATE, or "UPSERT" variant
+ * of INSERT was used to update the row, where it's logically a direct
+ * effect of the top level command). See QueryRewrite(), step 3, for
+ * details.
*/
if (completionTag && completionTag[0] == '\0')
{
@@ -1366,6 +1374,8 @@ PortalRunMulti(Portal portal, bool isTopLevel,
sprintf(completionTag, "SELECT 0 0");
else if (strcmp(completionTag, "INSERT") == 0)
strcpy(completionTag, "INSERT 0 0");
+ else if (strcmp(completionTag, "UPSERT") == 0)
+ strcpy(completionTag, "UPSERT 0 0");
else if (strcmp(completionTag, "UPDATE") == 0)
strcpy(completionTag, "UPDATE 0");
else if (strcmp(completionTag, "DELETE") == 0)
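
Reviewer note on the pquery.c changes above. The command tag rule can be sketched in isolation as follows. This is an illustration only: ToySpecCmd and toy_insert_tag are hypothetical names, and TOY_SPEC_IGNORE is assumed to correspond to the IGNORE variant, which this part of the patch does not show. The sketch mirrors what the hunk does: the tag word is "UPSERT" only for the ON CONFLICT UPDATE variant, and the OID field is nonzero only when exactly one row was processed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mirror of the patch's speculative insertion variants. */
typedef enum
{
    TOY_SPEC_NONE,    /* plain INSERT */
    TOY_SPEC_IGNORE,  /* assumed: ON CONFLICT IGNORE */
    TOY_SPEC_INSERT   /* ON CONFLICT UPDATE */
} ToySpecCmd;

/*
 * Format the completion tag for an INSERT-type command into buf.
 * Returns the snprintf() result so callers can detect truncation.
 */
int
toy_insert_tag(char *buf, size_t buflen, ToySpecCmd spec,
               unsigned int last_oid, unsigned int processed)
{
    /* As in the patch, report an OID only for single-row results. */
    unsigned int oid = (processed == 1) ? last_oid : 0;

    assert(buf != NULL && buflen > 0);

    return snprintf(buf, buflen, "%s %u %u",
                    spec == TOY_SPEC_INSERT ? "UPSERT" : "INSERT",
                    oid, processed);
}
```

Under this rule an ON CONFLICT UPDATE statement that affects one row reports "UPSERT 0 1" (for a table without OIDs), while the IGNORE variant keeps the ordinary "INSERT" tag word. Clients that match on the tag prefix, as psql does, need to learn the new word.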
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index a1967b69..95d62cb 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -28,6 +28,7 @@ static const char *const LockTagTypeNames[] = {
"tuple",
"transactionid",
"virtualxid",
+ "inserter transactionid",
"object",
"userlock",
"advisory"
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index c1d860c..04235e2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5645,6 +5645,24 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
return NULL;
}
+ else if (var->varno == INNER_VAR)
+ {
+ /* Assume an EXCLUDED variable */
+ rte = rt_fetch(PRS2_OLD_VARNO, dpns->rtable);
+
+ /*
+ * Sanity check: EXCLUDED.* Vars should only appear in auxiliary ON
+ * CONFLICT UPDATE queries. Assert that rte and planstate are
+ * consistent with that.
+ */
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(IsA(dpns->planstate, SeqScanState) ||
+ IsA(dpns->planstate, ResultState));
+
+ refname = "excluded";
+ colinfo = deparse_columns_fetch(PRS2_OLD_VARNO, dpns);
+ attnum = var->varattno;
+ }
else
{
elog(ERROR, "bogus varno: %d", var->varno);
@@ -6385,6 +6403,7 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags)
case T_CoerceToDomainValue:
case T_SetToDefault:
case T_CurrentOfExpr:
+ case T_ExcludedExpr:
/* single words: always simple */
return true;
@@ -7610,6 +7629,26 @@ get_rule_expr(Node *node, deparse_context *context,
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ Var *variable = (Var *) excludedexpr->arg;
+ bool save_varprefix;
+
+ /*
+ * Force parentheses because our caller probably assumed our
+ * Var is a simple expression.
+ */
+ appendStringInfoChar(buf, '(');
+ save_varprefix = context->varprefix;
+ /* Ensure EXCLUDED.* prefix is always visible */
+ context->varprefix = true;
+ get_rule_expr((Node *) variable, context, true);
+ context->varprefix = save_varprefix;
+ appendStringInfoChar(buf, ')');
+ }
+ break;
+
case T_List:
{
char *sep;
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 777f55c..99bb417 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -170,6 +170,13 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -726,6 +733,17 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Assert(htup->t_tableOid != InvalidOid);
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
+ snapshot->speculativeToken = 0;
+
+ /*
+ * Never return "super-deleted" tuples
+ *
+ * XXX: Comment this code out and you'll get conflicts within
+ * ExecLockUpdateTuple(), which result in an infinite loop.
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -807,6 +825,26 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
{
+ RelFileNode rnode;
+ ForkNumber forkno;
+ BlockNumber blockno;
+
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+
+ /* tuples can only be in the main fork */
+ Assert(forkno == MAIN_FORKNUM);
+ Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
+
+ /*
+ * Set speculative token. Caller can worry about xmax, since it
+ * requires a conclusively locked row version, and a concurrent
+ * update to this tuple counts as a conflict for its purposes.
+ */
+ snapshot->speculativeToken =
+ SpeculativeInsertionIsInProgress(HeapTupleHeaderGetRawXmin(tuple),
+ rnode,
+ &htup->t_self);
+
snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
/* XXX shouldn't we fall through to look at xmax? */
return true; /* in insertion by other */
@@ -922,6 +960,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1126,6 +1171,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
+ * Immediately VACUUM "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return HEAPTUPLE_DEAD;
+
+ /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
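
Reviewer note on the tqual.c changes above. Every visibility routine touched in this file gains the same guard: a tuple whose raw xmin is InvalidTransactionId is "super-deleted", so it is never visible to any snapshot and is immediately dead to VACUUM. The sketch below restates that rule outside the backend. ToyXid, TOY_INVALID_XID, ToyVacuumResult and both functions are hypothetical names; the only fact taken as given is that InvalidTransactionId is zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ToyXid;

/* Stand-in for InvalidTransactionId, which is zero. */
#define TOY_INVALID_XID ((ToyXid) 0)

typedef enum
{
    TOY_TUPLE_DEAD,             /* reclaimable right away */
    TOY_TUPLE_NEEDS_FULL_CHECK  /* fall through to the usual xmin/xmax tests */
} ToyVacuumResult;

/* A tuple is super-deleted when its raw xmin has been set to invalid. */
bool
toy_is_super_deleted(ToyXid raw_xmin)
{
    return raw_xmin == TOY_INVALID_XID;
}

/* VACUUM's view: super-deleted tuples are dead without further checks. */
ToyVacuumResult
toy_vacuum_status(ToyXid raw_xmin)
{
    if (toy_is_super_deleted(raw_xmin))
        return TOY_TUPLE_DEAD;

    return TOY_TUPLE_NEEDS_FULL_CHECK;
}
```

Because the test reads the raw xmin, it runs before any hint-bit or commit-status lookup, which is why the patch places it at the top of each routine.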
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 275bdcc..9302e41 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -894,9 +894,12 @@ PrintQueryResults(PGresult *results)
success = StoreQueryTuple(results);
else
success = PrintQueryTuples(results);
- /* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
+ /*
+ * if it's INSERT/UPSERT/UPDATE/DELETE RETURNING, also print status
+ */
cmdstatus = PQcmdStatus(results);
if (strncmp(cmdstatus, "INSERT", 6) == 0 ||
+ strncmp(cmdstatus, "UPSERT", 6) == 0 ||
strncmp(cmdstatus, "UPDATE", 6) == 0 ||
strncmp(cmdstatus, "DELETE", 6) == 0)
PrintQueryStatus(results);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 939d93d..62e760a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -28,6 +28,7 @@
#define HEAP_INSERT_SKIP_WAL 0x0001
#define HEAP_INSERT_SKIP_FSM 0x0002
#define HEAP_INSERT_FROZEN 0x0004
+#define HEAP_INSERT_SPECULATIVE 0x0008
typedef struct BulkInsertStateData *BulkInsertState;
@@ -141,7 +142,7 @@ extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
CommandId cid, int options, BulkInsertState bistate);
extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd);
+ HeapUpdateFailureData *hufd, bool killspeculative);
extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index a2ed2a0..870985d 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -73,6 +73,8 @@
#define XLOG_HEAP_SUFFIX_FROM_OLD (1<<6)
/* last xl_heap_multi_insert record for one heap_multi_insert() call */
#define XLOG_HEAP_LAST_MULTI_INSERT (1<<7)
+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */
+#define XLOG_HEAP_KILLED_SPECULATIVE_TUPLE XLOG_HEAP_LAST_MULTI_INSERT
/* convenience macro for checking whether any form of old tuple was logged */
#define XLOG_HEAP_CONTAINS_OLD \
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index e7cc7a0..42c10d4 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -80,6 +80,8 @@ extern void index_drop(Oid indexId, bool concurrent);
extern IndexInfo *BuildIndexInfo(Relation index);
+extern void AddUniqueSpeculative(Relation index, IndexInfo *ii);
+
extern void FormIndexDatum(IndexInfo *indexInfo,
TupleTableSlot *slot,
EState *estate,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..accdc83 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -352,16 +352,21 @@ extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
extern void ExecCloseScanRelation(Relation scanrel);
-extern void ExecOpenIndices(ResultRelInfo *resultRelInfo);
+extern void ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative);
extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
+extern List *ExecLockIndexValues(TupleTableSlot *slot, EState *estate,
+ SpecCmd specReason);
extern List *ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
- EState *estate);
-extern bool check_exclusion_constraint(Relation heap, Relation index,
- IndexInfo *indexInfo,
- ItemPointer tupleid,
- Datum *values, bool *isnull,
- EState *estate,
- bool newIndex, bool errorOK);
+ EState *estate, bool noDupErr, Oid arbiterIdx);
+extern bool ExecCheckIndexConstraints(TupleTableSlot *slot, EState *estate,
+ ItemPointer conflictTid, Oid arbiterIdx);
+extern bool check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo,
+ ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate,
+ bool newIndex, bool errorOK,
+ bool wait, ItemPointer conflictTid);
extern void RegisterExprContextCallback(ExprContext *econtext,
ExprContextCallbackFunction function,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41288ed..2e4e168 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -41,6 +41,9 @@
* ExclusionOps Per-column exclusion operators, or NULL if none
* ExclusionProcs Underlying function OIDs for ExclusionOps
* ExclusionStrats Opclass strategy numbers for ExclusionOps
+ * UniqueOps These are like Exclusion*, but for unique indexes
+ * UniqueProcs
+ * UniqueStrats
* Unique is it a unique index?
* ReadyForInserts is it valid for inserts?
* Concurrent are we doing a concurrent index build?
@@ -62,6 +65,9 @@ typedef struct IndexInfo
Oid *ii_ExclusionOps; /* array with one entry per column */
Oid *ii_ExclusionProcs; /* array with one entry per column */
uint16 *ii_ExclusionStrats; /* array with one entry per column */
+ Oid *ii_UniqueOps; /* array with one entry per column */
+ Oid *ii_UniqueProcs; /* array with one entry per column */
+ uint16 *ii_UniqueStrats; /* array with one entry per column */
bool ii_Unique;
bool ii_ReadyForInserts;
bool ii_Concurrent;
@@ -967,6 +973,16 @@ typedef struct DomainConstraintState
ExprState *check_expr; /* for CHECK, a boolean expression */
} DomainConstraintState;
+/* ----------------
+ * ExcludedExprState node
+ * ----------------
+ */
+typedef struct ExcludedExprState
+{
+ ExprState xprstate;
+ ExprState *arg; /* the argument */
+} ExcludedExprState;
+
/* ----------------------------------------------------------------
* Executor State Trees
@@ -1088,6 +1104,9 @@ typedef struct ModifyTableState
int mt_whichplan; /* which one is being executed (0..n-1) */
ResultRelInfo *resultRelInfo; /* per-subplan target relations */
List **mt_arowmarks; /* per-subplan ExecAuxRowMark lists */
+ SpecCmd spec; /* reason for speculative insertion */
+ Oid arbiterIndex; /* unique index to arbitrate taking alt path */
+ PlanState *onConflict; /* associated OnConflict state */
EPQState mt_epqstate; /* for evaluating EvalPlanQual rechecks */
bool fireBSTriggers; /* do we need to fire stmt triggers? */
} ModifyTableState;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 97ef0fc..8d6fba4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -168,6 +168,7 @@ typedef enum NodeTag
T_CoerceToDomainValue,
T_SetToDefault,
T_CurrentOfExpr,
+ T_ExcludedExpr,
T_TargetEntry,
T_RangeTblRef,
T_JoinExpr,
@@ -207,6 +208,7 @@ typedef enum NodeTag
T_NullTestState,
T_CoerceToDomainState,
T_DomainConstraintState,
+ T_ExcludedExprState,
/*
* TAGS FOR PLANNER NODES (relation.h)
@@ -412,6 +414,8 @@ typedef enum NodeTag
T_RowMarkClause,
T_XmlSerialize,
T_WithClause,
+ T_InferClause,
+ T_ConflictClause,
T_CommonTableExpr,
/*
@@ -624,4 +628,18 @@ typedef enum JoinType
(1 << JOIN_RIGHT) | \
(1 << JOIN_ANTI))) != 0)
+/*
+ * SpecCmd -
+ * "Speculative insertion" clause
+ *
+ * This is needed in both parsenodes.h and plannodes.h, so put it here...
+ */
+typedef enum
+{
+ SPEC_NONE, /* Not involved in speculative insertion */
+ SPEC_IGNORE, /* INSERT of "ON CONFLICT IGNORE" */
+ SPEC_INSERT, /* INSERT of "ON CONFLICT UPDATE" */
+ SPEC_UPDATE /* UPDATE of "ON CONFLICT UPDATE" */
+} SpecCmd;
+
#endif /* NODES_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 86d1c07..c03c9ca 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -132,6 +132,11 @@ typedef struct Query
List *withCheckOptions; /* a list of WithCheckOption's */
+ SpecCmd specClause; /* speculative insertion clause */
+ List *arbiterExpr; /* Unique index arbiter exprs */
+ Node *arbiterWhere; /* Unique index arbiter WHERE clause */
+ Node *onConflict; /* ON CONFLICT Query */
+
List *returningList; /* return-values list (of TargetEntry) */
List *groupClause; /* a list of SortGroupClause's */
@@ -564,7 +569,7 @@ typedef enum TableLikeOption
} TableLikeOption;
/*
- * IndexElem - index parameters (used in CREATE INDEX)
+ * IndexElem - index parameters (used in CREATE INDEX, and in ON CONFLICT)
*
* For a plain index attribute, 'name' is the name of the table column to
* index, and 'expr' is NULL. For an index expression, 'name' is NULL and
@@ -999,6 +1004,36 @@ typedef struct WithClause
} WithClause;
/*
+ * InferClause -
+ * ON CONFLICT unique index inference clause
+ *
+ * Note: InferClause does not propagate into the Query representation.
+ */
+typedef struct InferClause
+{
+ NodeTag type;
+ List *indexElems; /* IndexElems to infer unique index */
+ Node *whereClause; /* qualification (partial-index predicate) */
+ int location; /* token location, or -1 if unknown */
+} InferClause;
+
+/*
+ * ConflictClause -
+ * representation of ON CONFLICT clause
+ *
+ * Note: ConflictClause does not propagate into the Query representation.
+ * However, Query may contain onConflict child Query.
+ */
+typedef struct ConflictClause
+{
+ NodeTag type;
+ SpecCmd specclause; /* Variant specified */
+ InferClause *infer; /* Optional index inference clause */
+ Node *updatequery; /* Update parse stmt */
+ int location; /* token location, or -1 if unknown */
+} ConflictClause;
+
+/*
* CommonTableExpr -
* representation of WITH list element
*
@@ -1048,6 +1083,7 @@ typedef struct InsertStmt
RangeVar *relation; /* relation to insert into */
List *cols; /* optional: names of the target columns */
Node *selectStmt; /* the source SELECT/VALUES, or NULL */
+ ConflictClause *confClause; /* ON CONFLICT clause */
List *returningList; /* list of expressions to return */
WithClause *withClause; /* WITH clause */
} InsertStmt;
@@ -1073,7 +1109,7 @@ typedef struct DeleteStmt
typedef struct UpdateStmt
{
NodeTag type;
- RangeVar *relation; /* relation to update */
+ RangeVar *relation; /* relation to update (NULL for speculative) */
List *targetList; /* the target list (of ResTarget) */
Node *whereClause; /* qualifications */
List *fromClause; /* optional from clause for more tables */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 316c9ce..7e05cb7 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -177,6 +177,9 @@ typedef struct ModifyTable
List *resultRelations; /* integer list of RT indexes */
int resultRelIndex; /* index of first resultRel in plan's list */
List *plans; /* plan(s) producing source data */
+ SpecCmd spec; /* speculative insertion specification */
+ Oid arbiterIndex; /* Oid of ON CONFLICT arbiter index */
+ Plan *onConflictPlan; /* Plan for ON CONFLICT UPDATE auxiliary query */
List *withCheckOptionLists; /* per-target-table WCO lists */
List *returningLists; /* per-target-table RETURNING tlists */
List *fdwPrivLists; /* per-target-table FDW private data lists */
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1d06f42..21c39dc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1147,6 +1147,53 @@ typedef struct CurrentOfExpr
int cursor_param; /* refcursor parameter number, or 0 */
} CurrentOfExpr;
+/*
+ * ExcludedExpr - an EXCLUDED.* expression
+ *
+ * During parse analysis of ON CONFLICT UPDATE auxiliary queries, a dummy
+ * EXCLUDED range table entry is generated, which is actually just an alias for
+ * the target relation. This is useful during parse analysis, allowing the
+ * parser to produce simple error messages, for example. There is the
+ * appearance of a join within the auxiliary ON CONFLICT UPDATE, superficially
+ * similar to a join in an UPDATE ... FROM; this is a limited, ad-hoc join
+ * though, as the executor needs to tightly control the referenced tuple/slot
+ * through which update evaluation references excluded values originally
+ * proposed for insertion. Note that EXCLUDED.* values carry forward the
+ * effects of BEFORE ROW INSERT triggers.
+ *
+ * To implement a limited "join" for ON CONFLICT UPDATE auxiliary queries,
+ * during the rewrite stage, Vars referencing the alias EXCLUDED.* RTE are
+ * swapped with ExcludedExprs, which also contain Vars; their Vars are
+ * equivalent, but reference the target instead. The ExcludedExpr Var actually
+ * evaluates against varno INNER_VAR during expression evaluation (and not a
+ * varno INDEX_VAR associated with an entry in the flattened range table
+ * representing the target, which is necessarily being scanned whenever an
+ * ExcludedExpr is evaluated) while still being logically associated with the
+ * target. The Var is only rigged to reference the inner slot during
+ * ExcludedExpr initialization. The executor closely controls the evaluation
+ * expression, installing the EXCLUDED slot actually excluded from insertion
+ * into the inner slot of the child/auxiliary evaluation context in an ad-hoc
+ * fashion, which, after ExcludedExpr initialization, is expected (i.e. it is
+ * expected during ExcludedExpr evaluation that the parent insert will make
+ * each excluded tuple available in the inner slot in turn). ExcludedExprs are
+ * only ever evaluated during special speculative insertion related EPQ
+ * expression evaluation, purely for the benefit of auxiliary UPDATE
+ * expressions.
+ *
+ * Aside from representing a logical choke point for this special expression
+ * evaluation, having a dedicated primnode also prevents the optimizer from
+ * considering various optimizations that might otherwise be attempted.
+ * Obviously there is no useful join optimization possible within the auxiliary
+ * query, and an ExcludedExpr based post-rewrite query tree representation is a
+ * convenient way of preventing that, as well as related inapplicable
+ * optimizations concerning the equivalence of Vars.
+ */
+typedef struct ExcludedExpr
+{
+ Expr xpr;
+ Node *arg; /* argument (Var) */
+} ExcludedExpr;
+
/*--------------------
* TargetEntry -
* a target entry (used in query target lists)
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..801effe 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -64,6 +64,7 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
int indexcol,
List **indexcolnos,
bool *var_on_left_p);
+extern Oid plan_speculative_use_index(PlannerInfo *root, List *indexList);
/*
* tidpath.h
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index 8eb2e57..878adfe 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -28,6 +28,8 @@ extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
extern void get_relation_info(PlannerInfo *root, Oid relationObjectId,
bool inhparent, RelOptInfo *rel);
+extern Oid infer_unique_index(PlannerInfo *root);
+
extern void estimate_rel_size(Relation rel, int32 *attr_widths,
BlockNumber *pages, double *tuples, double *allvisfrac);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 082f7d7..d8cde27 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -84,7 +84,8 @@ extern ModifyTable *make_modifytable(PlannerInfo *root,
CmdType operation, bool canSetTag,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam);
+ List *rowMarks, Plan *onConflictPlan, SpecCmd spec,
+ int epqParam);
extern bool is_projection_capable_plan(Plan *plan);
/*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c243ec..cf501e6 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -87,6 +87,7 @@ PG_KEYWORD("commit", COMMIT, UNRESERVED_KEYWORD)
PG_KEYWORD("committed", COMMITTED, UNRESERVED_KEYWORD)
PG_KEYWORD("concurrently", CONCURRENTLY, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("configuration", CONFIGURATION, UNRESERVED_KEYWORD)
+PG_KEYWORD("conflict", CONFLICT, UNRESERVED_KEYWORD)
PG_KEYWORD("connection", CONNECTION, UNRESERVED_KEYWORD)
PG_KEYWORD("constraint", CONSTRAINT, RESERVED_KEYWORD)
PG_KEYWORD("constraints", CONSTRAINTS, UNRESERVED_KEYWORD)
@@ -180,6 +181,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 6a4438f..d1d0d12 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -41,6 +41,8 @@ extern List *transformDistinctClause(ParseState *pstate,
List **targetlist, List *sortClause, bool is_agg);
extern List *transformDistinctOnClause(ParseState *pstate, List *distinctlist,
List **targetlist, List *sortClause);
+extern void transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere);
extern List *addTargetToSortList(ParseState *pstate, TargetEntry *tle,
List *sortlist, List *targetlist, SortBy *sortby,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 3103b71..2b5804e 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -153,6 +153,7 @@ struct ParseState
bool p_hasModifyingCTE;
bool p_is_insert;
bool p_is_update;
+ bool p_is_speculative;
bool p_locked_from_parent;
Relation p_target_relation;
RangeTblEntry *p_target_rangetblentry;
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index f5d70e5..6bb95fc 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -76,6 +76,11 @@ extern bool ConditionalXactLockTableWait(TransactionId xid);
extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
+/* Lock an XID for tuple insertion (used to wait for an insertion to finish) */
+extern void SpeculativeInsertionLockAcquire(TransactionId xid);
+extern void SpeculativeInsertionLockRelease(TransactionId xid);
+extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
+
/* Lock a general object (other than a relation) of the current database */
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1100923..9c21810 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -176,6 +176,8 @@ typedef enum LockTagType
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
+ LOCKTAG_PROMISE_TUPLE_INSERTION, /* tuple insertion, keyed by Xid */
+ /* ID info for a transaction is its TransactionId */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
@@ -261,6 +263,14 @@ typedef struct LOCKTAG
(locktag).locktag_type = LOCKTAG_VIRTUALTRANSACTION, \
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
+ ((locktag).locktag_field1 = (xid), \
+ (locktag).locktag_field2 = (token), \
+ (locktag).locktag_field3 = 0, \
+ (locktag).locktag_field4 = 0, \
+ (locktag).locktag_type = LOCKTAG_PROMISE_TUPLE_INSERTION, \
+ (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
#define SET_LOCKTAG_OBJECT(locktag,dboid,classoid,objoid,objsubid) \
((locktag).locktag_field1 = (dboid), \
(locktag).locktag_field2 = (classoid), \
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index e807a2e..cd15570 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,9 +16,11 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "storage/itemptr.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
+#include "storage/relfilenode.h"
/*
* Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
@@ -132,6 +134,17 @@ struct PGPROC
*/
SHM_QUEUE myProcLocks[NUM_LOCK_PARTITIONS];
+ /*
+ * Info to allow us to perform speculative insertion without "unprincipled
+ * deadlocks". This state allows others to wait on the outcome of an
+ * optimistically inserted speculative tuple for only the duration of the
+ * insertion (not to the end of our xact) iff the insertion does not work
+ * out (due to our detecting a conflict).
+ */
+ uint32 specInsertToken;
+ RelFileNode specInsertRel;
+ ItemPointerData specInsertTid;
+
struct XidCache subxids; /* cache for subtransaction XIDs */
/* Per-backend LWLock. Protects fields below. */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 97c6e93..ea2bba9 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -55,6 +55,13 @@ extern TransactionId GetOldestXmin(Relation rel, bool ignoreVacuum);
extern TransactionId GetOldestActiveTransactionId(void);
extern TransactionId GetOldestSafeDecodingTransactionId(void);
+extern void SetSpeculativeInsertionToken(uint32 token);
+extern void SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid);
+extern void ClearSpeculativeInsertionState(void);
+extern uint32 SpeculativeInsertionIsInProgress(TransactionId xid,
+ RelFileNode rel,
+ ItemPointer tid);
+
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 26fb257..cd5ad76 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -87,6 +87,17 @@ typedef struct SnapshotData
bool copied; /* false if it's a static snapshot */
/*
+ * Snapshot's speculative token is a value set by HeapTupleSatisfiesDirty,
+ * indicating that the tuple is being inserted speculatively, and may yet
+ * be "super-deleted" before EOX. The caller may use the value with
+ * SpeculativeInsertionWait to wait for the inserter to decide. It is only
+ * set when a valid 'xmin' is set, too. By convention, when
+ * speculativeToken is zero, the caller must assume that it should wait on
+ * a non-speculative tuple (i.e. wait for xmin/xmax to commit).
+ */
+ uint32 speculativeToken;
+
+ /*
* note: all ids in subxip[] are >= xmin, but we don't bother filtering
* out any that are >= xmax
*/
--
1.9.1
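The speculativeToken convention described in the snapshot.h hunk above can be sketched as a small decision function. This is only an illustration under stated assumptions: ToyWaitKind and toy_choose_wait() are hypothetical names that do not appear in the patch, and the only facts relied on are that InvalidTransactionId is zero and that a zero token means "wait on the transaction itself".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome of inspecting a dirty snapshot; not a patch type. */
typedef enum
{
	TOY_WAIT_NONE,			/* no in-progress inserter/deleter reported */
	TOY_WAIT_XACT,			/* wait for xmin/xmax to commit or abort */
	TOY_WAIT_SPECULATIVE	/* wait only until the inserter decides */
} ToyWaitKind;

/*
 * Decide how a caller would wait, given the values a dirty snapshot
 * reported.  A transaction ID of 0 stands for InvalidTransactionId.
 * A nonzero token is only meaningful together with a valid xmin.
 */
ToyWaitKind
toy_choose_wait(uint32_t xmin, uint32_t xmax, uint32_t speculative_token)
{
	if (xmin == 0 && xmax == 0)
		return TOY_WAIT_NONE;

	if (xmin != 0 && speculative_token != 0)
		return TOY_WAIT_SPECULATIVE;

	/* Token is zero: treat the tuple as non-speculative. */
	return TOY_WAIT_XACT;
}
```

The point of the third outcome is that a waiter need not block until the inserter's transaction ends when the insertion may be "super-deleted" much sooner.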
Attachment: 0003-RLS-support-for-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From b5c7c6a77f5595b0eec29e1387e53606b35fb72c Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 6 Jan 2015 16:32:21 -0800
Subject: [PATCH 3/6] RLS support for ON CONFLICT UPDATE
Row-Level Security policies may be restricted to UPDATE commands only or
to INSERT commands only. UPDATE RLS policies can have both USING() security
barrier quals, and CHECK options (INSERT RLS policies may only have
CHECK options, though). It is necessary to carefully consider the
behavior of RLS policies in the context of INSERT with ON CONFLICT
UPDATE, since ON CONFLICT UPDATE is more or less a new top-level
command, conceptually quite different to two separate statements (an
INSERT and an UPDATE).
The approach taken is to "bunch together" both sets of policies, and to
enforce them in three different places, against three different slots (at
three different stages of query processing in the executor).
Note that UPDATE policy USING() barrier quals are always treated as
CHECK options. It is thought that silently failing when USING() barrier
quals are not satisfied is a more surprising outcome, even if it is
closer to the existing behavior of UPDATE statements. This is because
the user's intent to UPDATE one particular row based on simple criteria
is quite clear with ON CONFLICT UPDATE.
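
The difference in behavior described above can be sketched as follows. This is a toy model, not code from the patch: ToyRowOutcome and both functions are hypothetical names, and a single boolean stands in for the result of evaluating an UPDATE policy's USING() qual against the existing row.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-row outcomes; not PostgreSQL identifiers. */
typedef enum
{
	TOY_ROW_UPDATED,	/* qual passed, row is updated */
	TOY_ROW_SKIPPED,	/* row silently left alone */
	TOY_ROW_ERROR		/* statement fails with an error */
} ToyRowOutcome;

/* Plain UPDATE: a failing USING() qual silently filters the row out. */
ToyRowOutcome
toy_plain_update(bool using_qual_passes)
{
	return using_qual_passes ? TOY_ROW_UPDATED : TOY_ROW_SKIPPED;
}

/*
 * ON CONFLICT UPDATE as described in this commit message: the USING()
 * qual is treated as a CHECK option, so a failure raises an error.
 */
ToyRowOutcome
toy_on_conflict_update(bool using_qual_passes)
{
	return using_qual_passes ? TOY_ROW_UPDATED : TOY_ROW_ERROR;
}
```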
The 3 places that RLS policies are enforced are:
* Against row actually inserted, after insertion proceeds successfully
(INSERT-applicable policies only).
* Against row in target table that caused conflict. The implementation
is careful not to leak the contents of that row in diagnostic
messages (INSERT-applicable *and* UPDATE-applicable policies).
* Against the version of the row added to the relation after
ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
policies).
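
A minimal sketch of how the same list of check options can serve all three places, assuming a toy model: ToyCmd, ToyWCO, toy_policies and toy_check_options() are hypothetical stand-ins, rows are reduced to a single int, and only the "skip UPDATE-applicable options when only INSERT policies apply" filter mirrors the onlyInsert test added to ExecWithCheckOptions() in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the command type recorded on each check option. */
typedef enum { TOY_CMD_INSERT, TOY_CMD_UPDATE } ToyCmd;

/* Toy check option: which command it came from, plus its qual. */
typedef struct ToyWCO
{
	ToyCmd		commandType;
	bool		(*qual) (int row);
} ToyWCO;

bool toy_row_is_positive(int row) { return row > 0; }
bool toy_row_is_even(int row) { return row % 2 == 0; }

/* Example policy set: one INSERT-applicable and one UPDATE-applicable. */
const ToyWCO toy_policies[] = {
	{TOY_CMD_INSERT, toy_row_is_positive},
	{TOY_CMD_UPDATE, toy_row_is_even},
};

/*
 * Return true if every enforced option passes.  With only_insert set
 * (the successfully inserted row), UPDATE-applicable options are
 * skipped; otherwise (existing row, or row produced by the update),
 * both kinds are enforced.
 */
bool
toy_check_options(const ToyWCO *wcos, size_t n, int row, bool only_insert)
{
	assert(wcos != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
	{
		if (only_insert && wcos[i].commandType == TOY_CMD_UPDATE)
			continue;
		if (!wcos[i].qual(row))
			return false;
	}
	return true;
}
```

With toy_policies, a row value of 3 passes the check applied to an inserted row but fails the checks applied to the conflicting row and to the updated row, since the latter two also enforce the UPDATE-applicable option.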
Documentation and tests follow in later commits.
---
src/backend/executor/execMain.c | 25 ++++++---
src/backend/executor/nodeModifyTable.c | 53 ++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/nodes/outfuncs.c | 1 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/rewrite/rewriteHandler.c | 2 +
src/backend/rewrite/rowsecurity.c | 94 +++++++++++++++++++++++++++++-----
src/include/executor/executor.h | 3 +-
src/include/nodes/parsenodes.h | 1 +
10 files changed, 158 insertions(+), 24 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 3d7761d..56fa3bd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1697,7 +1697,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
*/
void
ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate)
+ TupleTableSlot *slot, bool detail,
+ bool onlyInsert, EState *estate)
{
Relation rel = resultRelInfo->ri_RelationDesc;
TupleDesc tupdesc = RelationGetDescr(rel);
@@ -1722,6 +1723,15 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
ExprState *wcoExpr = (ExprState *) lfirst(l2);
/*
+ * INSERT ... ON CONFLICT UPDATE callers may require that not all WITH
+ * CHECK OPTIONs associated with resultRelInfo are enforced at all
+ * stages of query processing. (UPDATE-related policies are not
+ * enforced in respect of a successfully inserted tuple).
+ */
+ if (onlyInsert && wco->commandType == CMD_UPDATE)
+ continue;
+
+ /*
* WITH CHECK OPTION checks are intended to ensure that the new tuple
* is visible (in the case of a view) or that it passes the
* 'with-check' policy (in the case of row security).
@@ -1732,16 +1742,17 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
*/
if (!ExecQual((List *) wcoExpr, econtext, false))
{
- char *val_desc;
+ char *val_desc = NULL;
Bitmapset *modifiedCols;
modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
- val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
- slot,
- tupdesc,
- modifiedCols,
- 64);
+ if (detail)
+ val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
+ slot,
+ tupdesc,
+ modifiedCols,
+ 64);
ereport(ERROR,
(errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 5411896..d02fdf9 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -453,7 +453,8 @@ vlock:
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, spec == SPEC_INSERT,
+ estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -947,7 +948,7 @@ lreplace:;
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, false, estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -1129,6 +1130,54 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+ /*
+ * For RLS with ON CONFLICT UPDATE, security quals are always
+ * treated as WITH CHECK options, even when there were separate
+ * security quals and explicit WITH CHECK options (ordinarily,
+ * security quals are only treated as WITH CHECK options when there
+ * are no explicit WITH CHECK options). Also, CHECK OPTIONs
+ * (originating either explicitly, or implicitly as security quals)
+ * for both UPDATE and INSERT policies (or ALL policies) are
+ * checked (as CHECK OPTIONs) at three different points for three
+ * distinct but related tuples/slots in the context of ON CONFLICT
+ * UPDATE. There are three relevant ExecWithCheckOptions() calls:
+ *
+ * * After successful insertion, within ExecInsert(), against the
+ * inserted tuple. This only includes INSERT-applicable policies.
+ *
+ * * Here, after row locking but before calling ExecUpdate(), on
+ * the existing tuple in the target relation (which we cannot leak
+ * details of). This is conceptually like a security barrier qual
+ * for the purposes of the auxiliary update, although unlike
+ * regular updates that require security barrier quals we prefer to
+ * raise an error (by treating the security barrier quals as CHECK
+ * OPTIONS) rather than silently not affect rows, because the
+ * intent to update seems clear and unambiguous for ON CONFLICT
+ * UPDATE. This includes both INSERT-applicable and
+ * UPDATE-applicable policies.
+ *
+ * * On the final tuple created by the update within ExecUpdate (if
+ * any). This is also subject to INSERT policy enforcement, unlike
+ * conventional ExecUpdate() calls for UPDATE statements -- it
+ * includes both INSERT-applicable and UPDATE-applicable policies.
+ */
+ if (resultRelInfo->ri_WithCheckOptions != NIL)
+ {
+ TupleTableSlot *opts;
+
+ /* Construct temp slot for locked tuple from target */
+ opts = MakeSingleTupleTableSlot(slot->tts_tupleDescriptor);
+ ExecStoreTuple(copyTuple, opts, InvalidBuffer, false);
+
+ /*
+ * Check, but without leaking contents of tuple; user only
+ * supplied one conflicting value or composition of values, and
+ * not the entire tuple.
+ */
+ ExecWithCheckOptions(resultRelInfo, opts, false, false,
+ estate);
+ }
+
if (!TupIsNull(slot))
*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
planSlot, &onConflict->mt_epqstate,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index df611d2..5c091e1 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2074,6 +2074,7 @@ _copyWithCheckOption(const WithCheckOption *from)
COPY_STRING_FIELD(viewname);
COPY_NODE_FIELD(qual);
+ COPY_SCALAR_FIELD(commandType);
COPY_SCALAR_FIELD(cascaded);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 24e58fa..4057c27 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2384,6 +2384,7 @@ _equalWithCheckOption(const WithCheckOption *a, const WithCheckOption *b)
{
COMPARE_STRING_FIELD(viewname);
COMPARE_NODE_FIELD(qual);
+ COMPARE_SCALAR_FIELD(commandType);
COMPARE_SCALAR_FIELD(cascaded);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 34e9163..d077882 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2336,6 +2336,7 @@ _outWithCheckOption(StringInfo str, const WithCheckOption *node)
WRITE_STRING_FIELD(viewname);
WRITE_NODE_FIELD(qual);
+ WRITE_ENUM_FIELD(commandType, CmdType);
WRITE_BOOL_FIELD(cascaded);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48a7206..9f3e0c8 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -272,6 +272,7 @@ _readWithCheckOption(void)
READ_STRING_FIELD(viewname);
READ_NODE_FIELD(qual);
+ READ_ENUM_FIELD(commandType, CmdType);
READ_BOOL_FIELD(cascaded);
READ_DONE();
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index f37760b..a2cc4f3 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1767,6 +1767,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs, bool forUpdatePushedDown)
List *quals = NIL;
wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->commandType = parsetree->commandType;
quals = lcons(wco->qual, quals);
activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs);
@@ -2935,6 +2936,7 @@ rewriteTargetView(Query *parsetree, Relation view)
wco->viewname = pstrdup(RelationGetRelationName(view));
wco->qual = NULL;
wco->cascaded = cascaded;
+ wco->commandType = viewquery->commandType;
parsetree->withCheckOptions = lcons(wco,
parsetree->withCheckOptions);
diff --git a/src/backend/rewrite/rowsecurity.c b/src/backend/rewrite/rowsecurity.c
index 7669130..09f1ac3 100644
--- a/src/backend/rewrite/rowsecurity.c
+++ b/src/backend/rewrite/rowsecurity.c
@@ -56,12 +56,14 @@
#include "utils/syscache.h"
#include "tcop/utility.h"
-static List *pull_row_security_policies(CmdType cmd, Relation relation,
- Oid user_id);
+static List *pull_row_security_policies(CmdType cmd, bool onConflict,
+ Relation relation, Oid user_id);
static void process_policies(List *policies, int rt_index,
Expr **final_qual,
Expr **final_with_check_qual,
- bool *hassublinks);
+ bool *hassublinks,
+ Expr **spec_with_check_eval,
+ bool onConflict);
static bool check_role_for_policy(ArrayType *policy_roles, Oid user_id);
/*
@@ -88,6 +90,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
Expr *rowsec_with_check_expr = NULL;
Expr *hook_expr = NULL;
Expr *hook_with_check_expr = NULL;
+ Expr *hook_spec_with_check_expr = NULL;
List *rowsec_policies;
List *hook_policies = NIL;
@@ -149,8 +152,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Grab the built-in policies which should be applied to this relation. */
rel = heap_open(rte->relid, NoLock);
- rowsec_policies = pull_row_security_policies(root->commandType, rel,
- user_id);
+ rowsec_policies = pull_row_security_policies(root->commandType,
+ root->specClause == SPEC_INSERT,
+ rel, user_id);
/*
* Check if this is only the default-deny policy.
@@ -168,7 +172,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Now that we have our policies, build the expressions from them. */
process_policies(rowsec_policies, rt_index, &rowsec_expr,
- &rowsec_with_check_expr, &hassublinks);
+ &rowsec_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
/*
* Also, allow extensions to add their own policies.
@@ -198,7 +204,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Build the expression from any policies returned. */
process_policies(hook_policies, rt_index, &hook_expr,
- &hook_with_check_expr, &hassublinks);
+ &hook_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
}
/*
@@ -230,6 +238,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) rowsec_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
@@ -244,6 +253,23 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) hook_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
+ root->withCheckOptions = lcons(wco, root->withCheckOptions);
+ }
+
+ /*
+ * Also add the expression, if any, built from the UPDATE policies that
+ * apply to the auxiliary UPDATE within ON CONFLICT UPDATE.
+ */
+ if (hook_spec_with_check_expr)
+ {
+ WithCheckOption *wco;
+
+ wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->viewname = RelationGetRelationName(rel);
+ wco->qual = (Node *) hook_spec_with_check_expr;
+ wco->cascaded = false;
+ wco->commandType = CMD_UPDATE;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
}
@@ -288,7 +314,8 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
*
*/
static List *
-pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
+pull_row_security_policies(CmdType cmd, bool onConflict, Relation relation,
+ Oid user_id)
{
List *policies = NIL;
ListCell *item;
@@ -322,7 +349,9 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
if (policy->polcmd == ACL_INSERT_CHR
&& check_role_for_policy(policy->roles, user_id))
policies = lcons(policy, policies);
- break;
+ if (!onConflict)
+ break;
+ /* FALL THRU */
case CMD_UPDATE:
if (policy->polcmd == ACL_UPDATE_CHR
&& check_role_for_policy(policy->roles, user_id))
@@ -384,26 +413,41 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
*/
static void
process_policies(List *policies, int rt_index, Expr **qual_eval,
- Expr **with_check_eval, bool *hassublinks)
+ Expr **with_check_eval, bool *hassublinks,
+ Expr **spec_with_check_eval, bool onConflict)
{
ListCell *item;
List *quals = NIL;
List *with_check_quals = NIL;
+ List *conflict_update_quals = NIL;
/*
* Extract the USING and WITH CHECK quals from each of the policies
- * and add them to our lists.
+ * and add them to our lists. CONFLICT UPDATE quals are always treated
+ * as CHECK OPTIONS.
*/
foreach(item, policies)
{
RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(item);
if (policy->qual != NULL)
- quals = lcons(copyObject(policy->qual), quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ quals = lcons(copyObject(policy->qual), quals);
+ else
+ conflict_update_quals = lcons(copyObject(policy->qual), conflict_update_quals);
+ }
if (policy->with_check_qual != NULL)
- with_check_quals = lcons(copyObject(policy->with_check_qual),
- with_check_quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ with_check_quals = lcons(copyObject(policy->with_check_qual),
+ with_check_quals);
+ else
+ conflict_update_quals =
+ lcons(copyObject(policy->with_check_qual),
+ conflict_update_quals);
+ }
if (policy->hassublinks)
*hassublinks = true;
@@ -420,6 +464,10 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
/*
* If we end up with only USING quals, then use those as
* WITH CHECK quals also.
+ *
+ * For the INSERT with ON CONFLICT UPDATE case, we always enforce that the
+ * UPDATE's USING quals are treated like WITH CHECK quals, enforced against
+ * the target relation's tuple in multiple places.
*/
if (with_check_quals == NIL)
with_check_quals = copyObject(quals);
@@ -453,6 +501,24 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
else
*with_check_eval = (Expr*) linitial(with_check_quals);
+ /*
+ * For INSERT with ON CONFLICT UPDATE, *both* sets of WITH CHECK options
+ * (from any INSERT policy and any UPDATE policy) are enforced.
+ *
+ * These are handled separately because each type of WITH CHECK option is
+ * enforced at a different point in the processing of INSERT ... ON
+ * CONFLICT UPDATE. The INSERT path does not enforce UPDATE-related CHECK
+ * OPTIONs.
+ */
+ if (conflict_update_quals != NIL)
+ {
+ if (list_length(conflict_update_quals) > 1)
+ *spec_with_check_eval = makeBoolExpr(AND_EXPR,
+ conflict_update_quals, -1);
+ else
+ *spec_with_check_eval = (Expr*) linitial(conflict_update_quals);
+ }
+
return;
}
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index accdc83..a59e857 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -195,7 +195,8 @@ extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
extern void ExecConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
extern void ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate);
+ TupleTableSlot *slot, bool detail, bool onlyInsert,
+ EState *estate);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti);
extern ExecAuxRowMark *ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist);
extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index c03c9ca..19d2484 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -868,6 +868,7 @@ typedef struct WithCheckOption
NodeTag type;
char *viewname; /* name of view that specified the WCO */
Node *qual; /* constraint qual to check */
+ CmdType commandType; /* select|insert|update|delete */
bool cascaded; /* true = WITH CASCADED CHECK OPTION */
} WithCheckOption;
--
1.9.1
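
For readers following the rowsecurity.c hunks above, the way process_policies() routes quals can be modelled outside the backend. What follows is only a rough sketch, not PostgreSQL code: Policy, QualList, add_qual() and route_policies() are hypothetical stand-ins for RowSecurityPolicy, List and the lcons() calls, and quals are plain strings rather than Expr nodes. It assumes the intent is that, under ON CONFLICT UPDATE, every qual of an UPDATE policy (USING and WITH CHECK alike) accumulates in one separate list, while everything else goes to the usual two lists. Unlike lcons(), this sketch appends rather than prepends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command tags, standing in for ACL_INSERT_CHR / ACL_UPDATE_CHR. */
enum { POL_INSERT = 'a', POL_UPDATE = 'w' };

/* Hypothetical, simplified stand-in for RowSecurityPolicy. */
typedef struct Policy
{
    char        polcmd;           /* which command the policy applies to */
    const char *qual;             /* USING qual, or NULL */
    const char *with_check_qual;  /* WITH CHECK qual, or NULL */
} Policy;

#define MAX_QUALS 8

/* Fixed-size list of quals; a stand-in for the backend's List. */
typedef struct QualList
{
    const char *items[MAX_QUALS];
    int         n;
} QualList;

static void
add_qual(QualList *list, const char *qual)
{
    assert(list->n < MAX_QUALS);
    list->items[list->n++] = qual;
}

/*
 * Route each policy's quals.  When on_conflict is true, both kinds of qual
 * from an UPDATE policy go to conflict_update; otherwise USING quals go to
 * quals and WITH CHECK quals go to with_check.
 */
static void
route_policies(const Policy *policies, int npolicies, bool on_conflict,
               QualList *quals, QualList *with_check,
               QualList *conflict_update)
{
    for (int i = 0; i < npolicies; i++)
    {
        const Policy *p = &policies[i];
        bool to_conflict = on_conflict && p->polcmd == POL_UPDATE;

        if (p->qual != NULL)
            add_qual(to_conflict ? conflict_update : quals, p->qual);
        if (p->with_check_qual != NULL)
            add_qual(to_conflict ? conflict_update : with_check,
                     p->with_check_qual);
    }
}
```

Calling route_policies() with one INSERT policy that has only a WITH CHECK qual and one UPDATE policy that has both kinds should, with on_conflict set, leave one entry in with_check and two in conflict_update; with on_conflict unset, the UPDATE policy's quals land in quals and with_check as usual.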
Attachment: 0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 30e580973f28babf9bb07d2de9b8b28a1d9cffee Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:11:15 -0700
Subject: [PATCH 4/6] Tests for INSERT ... ON CONFLICT {UPDATE | IGNORE}
Add dedicated isolation tests for both UPDATE and IGNORE variants,
illustrating the "MVCC violation" that allows a READ COMMITTED
transaction's UPDATE to succeed in updating a tuple with no version
visible to its command's MVCC snapshot. Add regression tests, which for
the most part are intended to exercise interactions with other features
(e.g. updatable views, inheritance, triggers, RLS).
Add a few general purpose smoke tests too, testing everything from
EXPLAIN output to unique index inference (expression indexes, partial
indexes, etc).
---
contrib/postgres_fdw/expected/postgres_fdw.out | 7 +
contrib/postgres_fdw/sql/postgres_fdw.sql | 3 +
.../isolation/expected/insert-conflict-ignore.out | 23 ++
.../expected/insert-conflict-update-2.out | 23 ++
.../expected/insert-conflict-update-3.out | 26 +++
.../isolation/expected/insert-conflict-update.out | 23 ++
src/test/isolation/isolation_schedule | 4 +
.../isolation/specs/insert-conflict-ignore.spec | 41 ++++
.../isolation/specs/insert-conflict-update-2.spec | 41 ++++
.../isolation/specs/insert-conflict-update-3.spec | 69 ++++++
.../isolation/specs/insert-conflict-update.spec | 40 ++++
src/test/regress/expected/insert_conflict.out | 241 +++++++++++++++++++++
src/test/regress/expected/privileges.out | 7 +-
src/test/regress/expected/rowsecurity.out | 90 ++++++++
src/test/regress/expected/rules.out | 21 ++
src/test/regress/expected/subselect.out | 22 ++
src/test/regress/expected/triggers.out | 102 ++++++++-
src/test/regress/expected/updatable_views.out | 4 +
src/test/regress/expected/update.out | 27 +++
src/test/regress/expected/with.out | 74 +++++++
src/test/regress/input/constraints.source | 5 +
src/test/regress/output/constraints.source | 15 +-
src/test/regress/parallel_schedule | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/insert_conflict.sql | 192 ++++++++++++++++
src/test/regress/sql/privileges.sql | 5 +-
src/test/regress/sql/rowsecurity.sql | 73 +++++++
src/test/regress/sql/rules.sql | 14 ++
src/test/regress/sql/subselect.sql | 14 ++
src/test/regress/sql/triggers.sql | 69 +++++-
src/test/regress/sql/updatable_views.sql | 2 +
src/test/regress/sql/update.sql | 14 ++
src/test/regress/sql/with.sql | 37 ++++
33 files changed, 1322 insertions(+), 8 deletions(-)
create mode 100644 src/test/isolation/expected/insert-conflict-ignore.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-2.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-3.out
create mode 100644 src/test/isolation/expected/insert-conflict-update.out
create mode 100644 src/test/isolation/specs/insert-conflict-ignore.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-2.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-3.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update.spec
create mode 100644 src/test/regress/expected/insert_conflict.out
create mode 100644 src/test/regress/sql/insert_conflict.sql
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 583cce7..5133386 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2327,6 +2327,13 @@ INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
ERROR: duplicate key value violates unique constraint "t1_pkey"
DETAIL: Key ("C 1")=(11) already exists.
CONTEXT: Remote SQL command: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
ERROR: new row for relation "T 1" violates check constraint "c2positive"
DETAIL: Failing row contains (1111, -2, null, null, null, null, ft1 , null).
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83e8fa7..e01d34e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -372,6 +372,9 @@ UPDATE ft2 SET c2 = c2 + 600 WHERE c1 % 10 = 8 AND c1 < 1200 RETURNING *;
ALTER TABLE "S 1"."T 1" ADD CONSTRAINT c2positive CHECK (c2 >= 0);
INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
UPDATE ft1 SET c2 = -c2 WHERE c1 = 1; -- c2positive
diff --git a/src/test/isolation/expected/insert-conflict-ignore.out b/src/test/isolation/expected/insert-conflict-ignore.out
new file mode 100644
index 0000000..e6cc2a1
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-ignore.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: ignore1 ignore2 c1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step c1: COMMIT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore1
+step c2: COMMIT;
+
+starting permutation: ignore1 ignore2 a1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step a1: ABORT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-2.out b/src/test/isolation/expected/insert-conflict-update-2.out
new file mode 100644
index 0000000..6a5ddfe
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-2.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-3.out b/src/test/isolation/expected/insert-conflict-update-3.out
new file mode 100644
index 0000000..29dd8b0
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-3.out
@@ -0,0 +1,26 @@
+Parsed test spec with 2 sessions
+
+starting permutation: update2 insert1 c2 select1surprise c1
+step update2: UPDATE colors SET is_active = true WHERE key = 1;
+step insert1:
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key; <waiting ...>
+step c2: COMMIT;
+step insert1: <... completed>
+key color is_active
+
+1 Red f
+2 Green f
+3 Blue f
+step select1surprise: SELECT * FROM colors ORDER BY key;
+key color is_active
+
+1 Brown t
+2 Green f
+3 Blue f
+step c1: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update.out b/src/test/isolation/expected/insert-conflict-update.out
new file mode 100644
index 0000000..6976124
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index c055a53..50948a2 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -16,6 +16,10 @@ test: fk-deadlock2
test: eval-plan-qual
test: lock-update-delete
test: lock-update-traversal
+test: insert-conflict-ignore
+test: insert-conflict-update
+test: insert-conflict-update-2
+test: insert-conflict-update-3
test: delete-abort-savept
test: delete-abort-savept-2
test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/insert-conflict-ignore.spec b/src/test/isolation/specs/insert-conflict-ignore.spec
new file mode 100644
index 0000000..fde43b3
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-ignore.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT IGNORE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions during INSERT...ON CONFLICT IGNORE.
+#
+# The convention here is that session 1 inserts first; session 2 ends up
+# ignoring if session 1 commits, and inserting if session 1 aborts.
+
+setup
+{
+ CREATE TABLE ints (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE ints;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore1" { INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore2" { INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; }
+step "select2" { SELECT * FROM ints; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# Regular case where one session block-waits on another to determine if it
+# should proceed with an insert or ignore.
+permutation "ignore1" "ignore2" "c1" "select2" "c2"
+permutation "ignore1" "ignore2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-2.spec b/src/test/isolation/specs/insert-conflict-update-2.spec
new file mode 100644
index 0000000..3e6e944
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-2.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test shows a plausible scenario in which the user might wish to UPDATE a
+# value that is also constrained by the unique index that is the arbiter of
+# whether the alternative path should be taken.
+
+setup
+{
+ CREATE TABLE upsert (key text not null, payload text);
+ CREATE UNIQUE INDEX ON upsert(lower(key));
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. The user can still usefully UPDATE
+# a column constrained by a unique index, as the example illustrates.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-3.spec b/src/test/isolation/specs/insert-conflict-update-3.spec
new file mode 100644
index 0000000..94ae3df
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-3.spec
@@ -0,0 +1,69 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# Other INSERT...ON CONFLICT UPDATE isolation tests illustrate the "MVCC
+# violation" added to facilitate the feature, whereby a
+# not-visible-to-our-snapshot tuple can be updated by our command all the same.
+# This is generally needed to provide a guarantee of a successful INSERT or
+# UPDATE in READ COMMITTED mode. This MVCC violation is quite distinct from
+# the putative "MVCC violation" that has existed in PostgreSQL for many years,
+# the EvalPlanQual() mechanism, because that mechanism always starts from a
+# tuple that is visible to the command's MVCC snapshot. This test illustrates
+# a slightly distinct user-visible consequence of the same MVCC violation
+# generally associated with INSERT...ON CONFLICT UPDATE. The impact of the
+# MVCC violation goes a little beyond updating MVCC-invisible tuples.
+#
+# With INSERT...ON CONFLICT UPDATE, the UPDATE predicate is only evaluated
+# once, on this conclusively-locked tuple, and not any other version of the
+# same tuple. It is therefore possible (in READ COMMITTED mode) that the
+# predicate "fails to be satisfied" according to the command's MVCC snapshot.
+# It might simply be that there is no row version visible, but it's also
+# possible that there is some row version visible, but only as a version that
+# doesn't satisfy the predicate. If, however, the conclusively-locked version
+# satisfies the predicate, that's good enough, and the tuple is updated. The
+# MVCC-snapshot-visible row version is denied the opportunity to prevent the
+# UPDATE from taking place, because we don't walk the UPDATE chain in the usual
+# way.
+
+setup
+{
+ CREATE TABLE colors (key int4 PRIMARY KEY, color text, is_active boolean);
+ INSERT INTO colors (key, color, is_active) VALUES(1, 'Red', false);
+ INSERT INTO colors (key, color, is_active) VALUES(2, 'Green', false);
+ INSERT INTO colors (key, color, is_active) VALUES(3, 'Blue', false);
+}
+
+teardown
+{
+ DROP TABLE colors;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" {
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key;}
+step "select1surprise" { SELECT * FROM colors ORDER BY key; }
+step "c1" { COMMIT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "update2" { UPDATE colors SET is_active = true WHERE key = 1; }
+step "c2" { COMMIT; }
+
+# Perhaps surprisingly, the session 1 MVCC-snapshot-visible tuple (the tuple
+# with the pre-populated color 'Red') is denied the opportunity to prevent the
+# UPDATE from taking place -- only the conclusively-locked tuple version
+# matters, and so the tuple with key value 1 was updated to 'Brown' (but not
+# the tuple with key value 2, since nothing changed there):
+permutation "update2" "insert1" "c2" "select1surprise" "c1"
diff --git a/src/test/isolation/specs/insert-conflict-update.spec b/src/test/isolation/specs/insert-conflict-update.spec
new file mode 100644
index 0000000..6529a0c
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update.spec
@@ -0,0 +1,40 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions.
+
+setup
+{
+ CREATE TABLE upsert (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. Notably, this entails updating a
+# tuple while there is no version of that tuple visible to the updating
+# session's snapshot. This is permitted only in READ COMMITTED mode.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
new file mode 100644
index 0000000..c192bd3
--- /dev/null
+++ b/src/test/regress/expected/insert_conflict.out
@@ -0,0 +1,241 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+ QUERY PLAN
+----------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+(3 rows)
+
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+ QUERY PLAN
+----------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+ Filter: (fruit <> 'Cawesh'::text)
+(4 rows)
+
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+ QUERY PLAN
+----------------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+ Filter: ((excluded.fruit) <> 'Elderberry'::text)
+(4 rows)
+
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+ QUERY PLAN
+--------------------------------------------------
+ [ +
+ { +
+ "Plan": { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Insert", +
+ "Relation Name": "insertconflicttest", +
+ "Alias": "target", +
+ "Arbiter Index": "key_index", +
+ "Plans": [ +
+ { +
+ "Node Type": "Result", +
+ "Parent Relationship": "Member" +
+ }, +
+ { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Conflict Update", +
+ "Parent Relationship": "Member", +
+ "Relation Name": "insertconflicttest",+
+ "Alias": "target", +
+ "Filter": "(fruit <> 'Lime'::text)" +
+ } +
+ ] +
+ } +
+ } +
+ ]
+(1 row)
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+ERROR: ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from
+LINE 1: ...nsert into insertconflicttest values (1, 'Apple') on conflic...
+ ^
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+ERROR: invalid reference to FROM-clause entry for table "insertconflicttest"
+LINE 1: ...(1, 'Apple') on conflict (key) update set fruit = insertconf...
+ ^
+HINT: Perhaps you meant to reference the table alias "excluded".
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index key_index;
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index comp_key_index;
+--
+-- Partial index tests, no inference predicate specificied
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_key_index;
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_comp_key_index;
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('Peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+ERROR: duplicate key value violates unique constraint "fruit_index"
+DETAIL: Key (fruit)=(Peach) already exists.
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+drop index key_index;
+drop index fruit_index;
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+ERROR: partial arbiter unique index has predicate that does not cover tuple proposed for insertion
+DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion must be covered by predicate for partial index "partial_key_index".
+drop index partial_key_index;
+-- Cleanup
+drop table insertconflicttest;
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+create table capitals (
+ state char(2)
+) inherits (cities);
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+-- Tests proper for inheritance:
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+-- Succeeds:
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+-- clean up
+drop table capitals;
+drop table cities;
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 74b0450..bc44c45 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -269,7 +269,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
ERROR: permission denied for relation atest2
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -367,6 +367,11 @@ UPDATE atest5 SET one = 8; -- fail
ERROR: permission denied for relation atest5
UPDATE atest5 SET three = 5, one = 2; -- fail
ERROR: permission denied for relation atest5
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+ERROR: permission denied for relation atest5
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
+ERROR: permission denied for relation atest5
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
GRANT SELECT (one,two,blue) ON atest6 TO regressuser4;
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 21817d8..07cb54f 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -1179,6 +1179,96 @@ NOTICE: f_leak => yyyyyy
(3 rows)
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------
+ 33 | 22 | 1 | rls_regress_user1 | okay science fiction
+(1 row)
+
+-- UPDATE path will now be taken for same query, so UPDATE targetlist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+SET SESSION AUTHORIZATION rls_regress_user1;
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------
+ 2 | 11 | 2 | rls_regress_user1 | my first novel
+(1 row)
+
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+------------------
+ 78 | 11 | 1 | rls_regress_user1 | some other novel
+(1 row)
+
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------------------
+ 88 | 33 | 1 | rls_regress_user1 | technology book, can only insert
+(1 row)
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d50b103..c634579 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1123,6 +1123,10 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
SELECT * FROM shoelace_obsolete WHERE sl_avail = 0;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
+ERROR: INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
------------+----------+------------+--------+----------+-----------
@@ -2351,6 +2355,23 @@ DETAIL: Key (id3a, id3c)=(1, 13) is not present in table "rule_and_refint_t2".
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
+ERROR: relation "shoelace" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
where (((rule_and_refint_t3.id3a = new.id3a)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index b14410f..9ba3a44 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -639,6 +639,28 @@ from
(0 rows)
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 1: ...conflict (key) update set val = 'unsupported ' || (select f1...
+ ^
+select * from upsert;
+ key | val
+-----+-----
+ 1 | val
+(1 row)
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: on conflict (key) update set val = (select u from aa)
+ ^
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
create temp table outer_7597 (f1 int4, f2 int4);
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index f1a5fde..77dfa06 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -274,7 +274,7 @@ drop sequence ttdummy_seq;
-- tests for per-statement triggers
--
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
BEGIN
@@ -291,6 +291,14 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
--
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
+NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+NOTICE: trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
INSERT INTO main_table DEFAULT VALUES;
@@ -305,6 +313,8 @@ NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, lev
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
@@ -1731,3 +1741,93 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', old.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (1,black)
+WARNING: after insert (new): (1,black)
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (2,red)
+WARNING: before insert (new, modified): (3,"red trig modified")
+WARNING: after insert (new): (3,"red trig modified")
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (3,orange)
+WARNING: before update (old): (3,"red trig modified")
+WARNING: before update (new): (3,"updated red trig modified")
+WARNING: after update (old): (3,"red trig modified")
+WARNING: after update (new): (3,"updated red trig modified")
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (4,green)
+WARNING: before insert (new, modified): (5,"green trig modified")
+WARNING: after insert (new): (5,"green trig modified")
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (5,purple)
+WARNING: before update (old): (5,"green trig modified")
+WARNING: before update (new): (5,"updated green trig modified")
+WARNING: after update (old): (5,"green trig modified")
+WARNING: after update (new): (5,"updated green trig modified")
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (6,white)
+WARNING: before insert (new, modified): (7,"white trig modified")
+WARNING: after insert (new): (7,"white trig modified")
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (7,pink)
+WARNING: before update (old): (7,"white trig modified")
+WARNING: before update (new): (7,"updated white trig modified")
+WARNING: after update (old): (7,"white trig modified")
+WARNING: after update (new): (7,"updated white trig modified")
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (8,yellow)
+WARNING: before insert (new, modified): (9,"yellow trig modified")
+WARNING: after insert (new): (9,"yellow trig modified")
+select * from upsert;
+ key | color
+-----+-----------------------------
+ 1 | black
+ 3 | updated red trig modified
+ 5 | updated green trig modified
+ 7 | updated white trig modified
+ 9 | yellow trig modified
+(5 rows)
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 80c5706..22b5bc1 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -215,6 +215,10 @@ INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
DETAIL: View columns that are not columns of their base relation are not updatable.
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
+ERROR: relation "rw_view15" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index 1de2a86..58714ac 100644
--- a/src/test/regress/expected/update.out
+++ b/src/test/regress/expected/update.out
@@ -147,4 +147,31 @@ SELECT a, b, char_length(c) FROM update_test;
42 | 12 | 10000
(4 rows)
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+-- fail, subqueries within update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a NOT IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE EXISTS(SELECT b FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ALL(SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ANY(SELECT a FROM update_test);
+ ^
DROP TABLE update_test;
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index 06b372b..81d664e 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -1806,6 +1806,80 @@ SELECT * FROM y;
-400
(22 rows)
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+ k | v | a
+---+--------+---
+ 0 | insert | 0
+ 0 | insert | 0
+(2 rows)
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+ k | v
+----+------------------
+ 0 | insert
+ 1 | 1 v, now update
+ 2 | insert
+ 3 | insert
+ 4 | 4 v, now update
+ 5 | insert
+ 6 | insert
+ 7 | 7 v, now update
+ 8 | insert
+ 9 | insert
+ 10 | 10 v, now update
+ 11 | insert
+ 12 | insert
+ 13 | 13 v, now update
+ 14 | insert
+ 15 | insert
+ 16 | 16 v, now update
+(17 rows)
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ...ICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a ...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+DROP TABLE z;
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
INSERT INTO y SELECT generate_series(1, 3);
diff --git a/src/test/regress/input/constraints.source b/src/test/regress/input/constraints.source
index 8ec0054..46bce36 100644
--- a/src/test/regress/input/constraints.source
+++ b/src/test/regress/input/constraints.source
@@ -292,6 +292,11 @@ INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+
SELECT '' AS five, * FROM UNIQUE_TBL;
DROP TABLE UNIQUE_TBL;
diff --git a/src/test/regress/output/constraints.source b/src/test/regress/output/constraints.source
index 0d32a9eab..add3f0c 100644
--- a/src/test/regress/output/constraints.source
+++ b/src/test/regress/output/constraints.source
@@ -421,16 +421,23 @@ INSERT INTO UNIQUE_TBL VALUES (4, 'four');
INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+ERROR: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
+HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
SELECT '' AS five, * FROM UNIQUE_TBL;
- five | i | t
-------+---+-------
+ five | i | t
+------+---+--------------------
| 1 | one
| 2 | two
| 4 | four
- | 5 | one
| | six
| | seven
-(6 rows)
+ | 5 | five-upsert-update
+ | 6 | six-upsert-insert
+(7 rows)
DROP TABLE UNIQUE_TBL;
CREATE TABLE UNIQUE_TBL (i int, t text,
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e0ae2f2..528d3b7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -36,6 +36,7 @@ test: geometry horology regex oidjoins type_sanity opr_sanity
# These four each depend on the previous one
# ----------
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7f762bd..b7c8f53 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -50,6 +50,7 @@ test: oidjoins
test: type_sanity
test: opr_sanity
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql
new file mode 100644
index 0000000..472d4ab
--- /dev/null
+++ b/src/test/regress/sql/insert_conflict.sql
@@ -0,0 +1,192 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+
+drop index key_index;
+
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index comp_key_index;
+
+--
+-- Partial index tests, no inference predicate specified
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index expr_key_index;
+
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+
+drop index expr_comp_key_index;
+
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('Peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index key_index;
+drop index fruit_index;
+
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+
+drop index partial_key_index;
+
+-- Cleanup
+drop table insertconflicttest;
+
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+
+create table capitals (
+ state char(2)
+) inherits (cities);
+
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+
+-- Tests proper for inheritance:
+
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+
+-- Succeeds:
+
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+
+-- clean up
+drop table capitals;
+drop table cities;
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index f97a75a..861eac6 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -194,7 +194,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -245,6 +245,9 @@ INSERT INTO atest5 VALUES (5,5,5); -- fail
UPDATE atest5 SET three = 10; -- ok
UPDATE atest5 SET one = 8; -- fail
UPDATE atest5 SET three = 5, one = 2; -- fail
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index ed7adbf..5c660d5 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -436,6 +436,79 @@ DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+-- UPDATE path will now be taken for same query, so UPDATE targetlist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+
+SET SESSION AUTHORIZATION rls_regress_user1;
+
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index 1e15f84..7cb5f39 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -680,6 +680,9 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
SELECT * FROM shoelace_candelete;
@@ -844,6 +847,17 @@ insert into rule_and_refint_t3 values (1, 12, 11, 'row3');
insert into rule_and_refint_t3 values (1, 12, 12, 'row4');
insert into rule_and_refint_t3 values (1, 11, 13, 'row5');
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 4be2e40..2be9cb7 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -374,6 +374,20 @@ from
int4_tbl i4 on dummy = i4.f1;
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+
+select * from upsert;
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index 0ea2c31..323ca1a 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -208,7 +208,7 @@ drop sequence ttdummy_seq;
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
5 10
@@ -237,6 +237,12 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
@@ -246,6 +252,9 @@ UPDATE main_table SET a = a + 1 WHERE b < 30;
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
+
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
30 40
@@ -1173,3 +1182,61 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', old.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+
+select * from upsert;
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 60c7e29..48dd9a9 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -69,6 +69,8 @@ DELETE FROM rw_view14 WHERE a=3; -- should be OK
-- Partially updatable view
INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
UPDATE rw_view15 SET upper='ROW 3' WHERE a=3; -- should fail
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
index e71128c..903f3fb 100644
--- a/src/test/regress/sql/update.sql
+++ b/src/test/regress/sql/update.sql
@@ -74,4 +74,18 @@ UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+
+-- fail, subqueries in update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+
DROP TABLE update_test;
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index c716369..8d49384 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -795,6 +795,43 @@ SELECT * FROM t LIMIT 10;
SELECT * FROM y;
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+
+DROP TABLE z;
+
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
--
1.9.1
Attachment: 0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch (text/x-patch; charset=US-ASCII)
From aa5f130d6ae1e3fb47a74fc900599c63ec75174e Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:16:11 -0700
Subject: [PATCH 5/6] Internal documentation for INSERT ... ON CONFLICT {UPDATE
| IGNORE}
Includes documentation for executor README. A high-level handling of
approach #2 to value locking also appears there, since in contrast with
design #1, that is something that lives in the head of the executor.
---
src/backend/executor/README | 128 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 128 insertions(+)
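
The retry loop that the README text in this patch describes (pre-check, then
either lock-and-update or speculative insertion, with a "super delete" and
restart on a late conflict) can be sketched roughly as follows. This is only a
single-threaded toy under stated assumptions: an in-memory array stands in for
the heap, a linear scan stands in for the arbiter index, and every name here
(upsert, find_live, raw_insert, concurrent_hook, concurrent_insert_7) is
hypothetical and does not appear in the patch. Promise token locks and real
row locking are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_TUPLES 16

/* Toy heap; a linear scan plays the role of the arbiter index. */
typedef struct { int key; char val[32]; bool live; } Tuple;

Tuple heap[MAX_TUPLES];
int ntuples = 0;

/* Hypothetical hook: runs once, right after the pre-check, to mimic another
 * session acting in the window before our physical insertion. */
void (*concurrent_hook)(void) = NULL;

typedef enum { UPSERT_INSERTED, UPSERT_UPDATED } UpsertResult;

/* Return the slot of the live tuple with this key, or -1 if there is none. */
int find_live(int key)
{
    for (int i = 0; i < ntuples; i++)
        if (heap[i].live && heap[i].key == key)
            return i;
    return -1;
}

/* Physically append a tuple without any uniqueness check. */
int raw_insert(int key, const char *val)
{
    assert(ntuples < MAX_TUPLES);
    heap[ntuples].key = key;
    snprintf(heap[ntuples].val, sizeof heap[ntuples].val, "%s", val);
    heap[ntuples].live = true;
    return ntuples++;
}

/* Example "other session": inserts key 7 behind our back. */
void concurrent_insert_7(void) { raw_insert(7, "other session"); }

UpsertResult upsert(int key, const char *ins_val, const char *upd_val,
                    int *retries)
{
    *retries = 0;
    for (;;)
    {
        /* Step 1: pre-check for an existing conflicting tuple. */
        int tid = find_live(key);

        if (concurrent_hook != NULL)
        {
            void (*hook)(void) = concurrent_hook;
            concurrent_hook = NULL;     /* fire at most once */
            hook();
        }

        if (tid >= 0)
        {
            /* "Lock" the row; if it vanished meanwhile, restart from scratch. */
            if (!heap[tid].live) { (*retries)++; continue; }
            snprintf(heap[tid].val, sizeof heap[tid].val, "%s", upd_val);
            return UPSERT_UPDATED;
        }

        /* Step 2: speculative insertion, then look for a late conflict. */
        int mine = raw_insert(key, ins_val);
        bool conflict = false;
        for (int i = 0; i < ntuples; i++)
            if (i != mine && heap[i].live && heap[i].key == key)
                conflict = true;
        if (!conflict)
            return UPSERT_INSERTED;

        heap[mine].live = false;        /* "super delete" our own tuple */
        (*retries)++;
    }
}
```

For instance, setting concurrent_hook = concurrent_insert_7 before calling
upsert(7, "mine", "mine-upd", &retries) makes the first iteration super-delete
its own tuple and the second iteration take the update path, leaving retries
at 1.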
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 8afa1e3..b5a5c33 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -200,3 +200,131 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
that only the first result row of an SRF counts, because all subsequent
rows will result in attempts to re-update an already updated target row.
This is historical behavior and seems not worth changing.)
+
+Speculative insertion
+---------------------
+
+Speculative insertion is a process that the executor manages for the benefit of
+INSERT...ON CONFLICT UPDATE/IGNORE. Supported indexes include nbtree unique
+indexes (nbtree is currently the only amcanunique index access method), or
+exclusion constraint indexes (exclusion constraints are considered a
+generalization of unique constraints). Only ON CONFLICT IGNORE is supported
+with exclusion constraints.
+
+The primary user-visible goal for INSERT...ON CONFLICT UPDATE is to guarantee
+either an insert or update under normal operating conditions in READ COMMITTED
+mode (where serialization failures are just as unacceptable as they are with
+regular UPDATEs). A would-be conflict (and the associated index) are the
+arbiters of whether or not the alternative (UPDATE/IGNORE) path is taken. The
+implementation more or less tries to update or insert until one or the other of
+those two outcomes occurs successfully. There are some non-obvious hazards
+involved that are carefully avoided. These hazards relate to concurrent
+activity causing conflicts for the implementation, which must be handled.
+
+The index is the authoritative source of truth for whether there is or is not a
+conflict, for unique index enforcement in general, and for speculative
+insertion in particular. The heap must still be considered, though, not least
+since it alone has authoritative visibility information. Through looping, we
+hope to overcome the disconnect between the heap and the arbiter index. We
+must lock the row, and then verify that there is no conflict. Only then do we
+UPDATE. Theoretically, some individual session could loop forever, although
+under high concurrency one session always proceeds.
+
+There are 2 sources of conflicts for ON CONFLICT UPDATE:
+
+1. Conflicts from going to update (having found a conflict during the
+pre-check), and finding the tuple changed (which may or may not involve new,
+distinct constrained values in later tuple versions -- for simplicity, we don't
+bother with considering that). This is not a conflict that the IGNORE variant
+considers.
+
+2. Conflicts from inserting a tuple (having not found a conflict during the
+pre-check), and only then finding a conflict at insertion time (when inserting
+index tuples, and finding a conflicting one when a buffer lock is held on an
+index page in the ordinary course of insertion). This can happen if a
+concurrent insertion occurs after the pre-check, but before physical index
+tuple insertion.
+
+The first step in the loop is to perform a pre-check. The indexes are scanned
+for existing conflicting values. At this point, we may have to wait until the
+end of another xact (or xact's promise token -- more on that later), iff it
+isn't immediately conclusive that there is or is not a conflict (when we finish
+the pre-check, there is a preliminary conclusion about there either being or
+not being a conflict -- but the conclusion only holds if there are no
+subsequent concurrent conflicts). If a conclusively committed conflict tuple
+is detected during the first step, the executor goes to lock and update the row
+(for ON CONFLICT UPDATE -- otherwise, for ON CONFLICT IGNORE, we're done). The
+TID to lock (and potentially UPDATE) can only be determined during the first
+step. If locking the row finds a concurrent conflict (which may be from a
+concurrent UPDATE that hasn't even physically inspected the arbiter index yet)
+then we restart the loop from the very beginning. We restart from scratch
+because all bets are off; it's possible that the process will find no conflict
+the second time around, and will successfully insert, or will UPDATE another
+tuple that is not even part of the same UPDATE chain as the first time around.
+
+The second step (skipped when a conflict is found) is to insert a heap tuple
+and related index tuples opportunistically. This uses the same mechanism as
+deferred unique indexes, and so we never wait for a possibly conflicting xact
+to commit or abort (unlike with conventional unique index insertion) -- we
+simply detect a possible conflict.
+
+When opportunistically inserting during the second step, we are not logically
+inserting a tuple as such. Rather, the process is somewhat similar to the
+conventional unique index insertion steps taken within the nbtree AM, where we
+must briefly lock the *value* being inserted: in that codepath, the value
+proposed for insertion is for an instant locked *in the abstract*, by way of a
+buffer lock on "the first leaf page the value could be on". Then, having
+established the right to physically insert, do so (or throw an error). For
+speculative insertion, if no conflict occurs during the insertion (which is
+usually the case, since it was just determined in the first step that there was
+no conflict), then we're done. Otherwise, we must restart (and likely find the
+same conflict tuple during the first step of the new iteration). But a
+counter-intuitive step must be taken first (which is what makes this whole
+dance similar to conventional nbtree "value locking").
+
+We must "super delete" the tuple when the opportunistic insertion finds a
+conflict. This means that it immediately becomes invisible to all snapshot
+types, and immediately becomes reclaimable by VACUUM. Other backends
+(speculative inserters or ordinary inserters) know to not wait on our
+transaction end when they encounter an optimistically inserted "promise tuple".
+Rather, they wait on a corresponding promise token lock, which we hold only for
+as long as we are opportunistically inserting. We release the lock when done
+opportunistically inserting (and after "super deleting", if that proved
+necessary), releasing our waiters (who will ordinarily re-find our promise
+tuple as a bona fide tuple, or occasionally will find that they can insert
+after all). It's important that other xacts not wait on the end of our xact
+until we've established that we've successfully and conclusively inserted
+logically (or established that there was an insertion conflict, and cleaned up
+after it by "super deleting"). Otherwise, concurrent speculative inserters
+could be involved in "unprincipled deadlocks": deadlocks where there is no
+user-visible mutual dependency, and yet an implementation-related mutual
+dependency is unexpectedly introduced. The user might be left with no
+reasonable way of avoiding these deadlocks, which would not be okay.
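The speculative insertion steps described above (pre-check, opportunistic insertion of a promise tuple, "super delete" on conflict, restart) can be sketched as a toy, single-threaded model. This is only an illustration under stated assumptions: `Table`, `speculative_insert` and the `race` hook are hypothetical names, not anything in the patch, and waiting on the promise token lock is not modelled at all.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Toy heap with one unique index (hypothetical model, not patch code)."""
    rows: dict = field(default_factory=dict)      # key -> value, bona fide tuples
    promises: dict = field(default_factory=dict)  # key -> token, promise tuples


def speculative_insert(table, key, value, token, race=None, max_tries=10):
    """Return ("inserted", value) or ("conflict", existing_value)."""
    for attempt in range(max_tries):
        # First step: pre-check for a conflicting tuple, without waiting.
        if key in table.rows:
            return ("conflict", table.rows[key])

        # Hypothetical hook: lets a caller simulate a concurrent inserter
        # slipping in between the first and second steps.
        if race is not None:
            race(table, attempt)

        # Second step: opportunistically insert a promise tuple. Waiters
        # would block on `token`, not on the end of our transaction.
        table.promises[key] = token

        if key in table.rows:
            # Conflict during insertion: "super delete" the promise tuple
            # so it is immediately invisible, then restart from step one.
            del table.promises[key]
            continue

        # No conflict: the promise tuple becomes a bona fide tuple.
        del table.promises[key]
        table.rows[key] = value
        return ("inserted", value)

    raise RuntimeError("gave up after too many restarts")
```

In this model a restart normally finds the same conflicting tuple during the pre-check of the next iteration, which matches the expectation stated above.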
+
+Speculative insertion and EvalPlanQual()
+----------------------------------------
+
+Updating the tuple involves locking it first (to establish a definitive tuple
+to consider evaluating the additional UPDATE qual against). The EvalPlanQual()
+mechanism (or, rather, some associated infrastructure) is reused for the
+benefit of auxiliary UPDATE expression evaluation.
+
+Locking first deviates from how conventional UPDATEs work, but allows the
+implementation to consider the possibility of conflicts first, and then, having
+reached a definitive conclusion, separately evaluate.
+
+ExecLockUpdateTuple() is somewhat similar to EvalPlanQual(), except it locks
+the TID reported as conflicting, and upon successfully locking, installs that
+into the UPDATE's EPQ slot. There is no UPDATE chain to walk -- rather, new
+tuples to check the qual against come from continuous attempts at locking a
+tuple conclusively (avoiding conflicts). The qual (if any) is then evaluated.
+Note that at READ COMMITTED, it's possible that *no* version of the tuple is
+visible, and yet it may still be updated. Similarly, since we do not walk the
+UPDATE chain, concurrent READ COMMITTED INSERT ... ON CONFLICT UPDATE sessions
+always attempt to lock the conclusively visible tuple, without regard to any
+other tuple version (repeatable read isolation level and up must consider MVCC
+visibility, though). A further implication of this is that the
+MVCC-snapshot-visible row version is denied the opportunity to prevent the
+UPDATE from taking place, should it not pass our qual (while a later version
+does pass it). This is fundamentally similar to updating a tuple when no
+version is visible, though.
--
1.9.1
Attachment: 0006-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch (text/x-patch; charset=US-ASCII)
From 252c2ef5824e252093fd1f86a52f5d66d0efbec0 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 26 Sep 2014 20:59:04 -0700
Subject: [PATCH 6/6] User-visible documentation for INSERT ... ON CONFLICT
{UPDATE | IGNORE}
INSERT ... ON CONFLICT {UPDATE | IGNORE} is documented as a new clause
of the INSERT command. Some potentially surprising interactions with
triggers are noted -- BEFORE INSERT per-row triggers must fire without
the INSERT path necessarily being taken, for example.
All the existing features that INSERT ... ON CONFLICT {UPDATE | IGNORE}
interacts with have these interactions noted. This includes
postgres_fdw, updatable views, table inheritance, RLS and partial unique
indexes.
Finally, a user-level description of the new "MVCC violation" that the
ON CONFLICT UPDATE variant sometimes requires has been added to "Chapter
13 - Concurrency Control", beside existing commentary on READ COMMITTED
mode's special handling of concurrent updates. The new "MVCC violation"
introduced seems somewhat distinct from the existing one (i.e. READ
COMMITTED's handling of when an UPDATE affects a concurrently
updated/deleted tuple, which internally uses a mechanism called
EvalPlanQual()), because in READ COMMITTED mode it is no longer
necessary for any row version to be conventionally visible to the
command's MVCC snapshot for an UPDATE of the row to occur (or for the
row to be locked, should the UPDATE's WHERE clause not be satisfied).
---
doc/src/sgml/ddl.sgml | 23 +++
doc/src/sgml/fdwhandler.sgml | 8 +
doc/src/sgml/keywords.sgml | 7 +
doc/src/sgml/mvcc.sgml | 24 +++
doc/src/sgml/plpgsql.sgml | 14 +-
doc/src/sgml/postgres-fdw.sgml | 8 +
doc/src/sgml/protocol.sgml | 13 +-
doc/src/sgml/ref/alter_policy.sgml | 7 +-
doc/src/sgml/ref/create_policy.sgml | 37 +++-
doc/src/sgml/ref/create_rule.sgml | 7 +-
doc/src/sgml/ref/create_table.sgml | 5 +-
doc/src/sgml/ref/create_trigger.sgml | 5 +-
doc/src/sgml/ref/create_view.sgml | 33 ++-
doc/src/sgml/ref/insert.sgml | 375 ++++++++++++++++++++++++++++++++--
doc/src/sgml/ref/set_constraints.sgml | 6 +-
doc/src/sgml/trigger.sgml | 49 ++++-
16 files changed, 570 insertions(+), 51 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 570a003..7b43a10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2428,9 +2428,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
+ There is limited inheritance support for <command>INSERT</command>
+ commands with <literal>ON CONFLICT</> clauses. Tables with
+ children are not generally accepted as targets. One notable
+ exception is that such tables are accepted as targets for
+ <command>INSERT</command> commands with <literal>ON CONFLICT
+ IGNORE</> clauses, provided a unique index inference clause was
+ omitted (which implies that there is no concern about
+ <emphasis>which</> unique index any would-be conflict might arise
+ from). However, tables that happen to be inheritance children are
+ accepted as targets for all variants of <command>INSERT</command>
+ with <literal>ON CONFLICT</>.
+ </para>
+
+ <para>
All check constraints and not-null constraints on a parent table are
automatically inherited by its children. Other types of constraints
(unique, primary key, and foreign key constraints) are not inherited.
+ Therefore, <command>INSERT</command> with <literal>ON CONFLICT</>
+ unique index inference considers only unique constraints/indexes
+ directly associated with the child
+ table.
</para>
<para>
@@ -2515,6 +2533,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
+ <literal>INSERT</literal> with an <literal>ON CONFLICT
+ UPDATE</literal> clause does not support the
+ <literal>ONLY</literal> notation, and so in effect tables with
+ inheritance children are not supported for the <literal>ON
+ CONFLICT UPDATE</literal> variant.
Commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
typically only work on individual, physical tables and do not
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..0c3dcb5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,14 @@ GetForeignServerByName(const char *name, bool missing_ok);
source provides.
</para>
+ <para>
+ <command>INSERT</> with an <literal>ON CONFLICT</> clause is not supported
+ with a unique index inference specification (this implies that <literal>ON
+ CONFLICT UPDATE</> is never supported, since the specification is
+ mandatory there). When planning an <command>INSERT</>,
+ <function>PlanForeignModify</> should reject these cases.
+ </para>
+
</sect1>
</chapter>
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index b0dfd5f..ea58211 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -854,6 +854,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>CONFLICT</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>CONNECT</token></entry>
<entry></entry>
<entry>reserved</entry>
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0d6867..5e310d7 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -326,6 +326,30 @@
</para>
<para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</> clause is
+ another special case. In Read Committed mode, the implementation will
+ either insert or update each row proposed for insertion, with either one of
+ those two outcomes guaranteed. This is a useful guarantee for many
+ use-cases, but it implies that further liberties must be taken with
+ snapshot isolation. Should a conflict originate in another transaction
+ whose effects are not visible to the <command>INSERT</command>, the
+ <command>UPDATE</command> may affect that row, even though it may be the
+ case that <emphasis>no</> version of that row is conventionally visible to
+ the command. In the same vein, if the secondary search condition of the
+ command (an explicit <literal>WHERE</> clause) is supplied, it is only
+ evaluated on the most recent row version, which is not necessarily the
+ version conventionally visible to the command (if indeed there is a row
+ version conventionally visible to the command at all).
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT IGNORE</> clause may
+ skip inserting a row because of the outcome of another transaction whose
+ effects are not visible to the <command>INSERT</command> snapshot. Again,
+ this is only the case in Read Committed mode.
+ </para>
+
+ <para>
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands on the same rows it is trying to update, but it
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 69a0885..59a5945 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2607,7 +2607,11 @@ END;
<para>
This example uses exception handling to perform either
- <command>UPDATE</> or <command>INSERT</>, as appropriate:
+ <command>UPDATE</> or <command>INSERT</>, as appropriate. It is
+ recommended that applications use <command>INSERT</> with
+ <literal>ON CONFLICT UPDATE</> rather than actually emulating this
+ pattern. This example serves only to illustrate use of
+ <application>PL/pgSQL</application> control flow structures:
<programlisting>
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
@@ -3771,9 +3775,11 @@ RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
<command>INSERT</> and <command>UPDATE</> operations, the return value
should be <varname>NEW</>, which the trigger function may modify to
support <command>INSERT RETURNING</> and <command>UPDATE RETURNING</>
- (this will also affect the row value passed to any subsequent triggers).
- For <command>DELETE</> operations, the return value should be
- <varname>OLD</>.
+ (this will also affect the row value passed to any subsequent triggers,
+ or passed to a special <varname>EXCLUDED</> alias reference within
+ an <command>INSERT</> statement with an <literal>ON CONFLICT UPDATE</>
+ clause). For <command>DELETE</> operations, the return
+ value should be <varname>OLD</>.
</para>
<para>
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fa39661 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -69,6 +69,14 @@
</para>
<para>
+ Note that <filename>postgres_fdw</> currently lacks support for
+ <command>INSERT</command> statements with an <literal>ON CONFLICT
+ UPDATE</> clause. However, the <literal>ON CONFLICT IGNORE</>
+ clause is supported, provided a unique index inference specification
+ is omitted.
+ </para>
+
+ <para>
It is generally recommended that the columns of a foreign table be declared
with exactly the same data types, and collations if applicable, as the
referenced columns of the remote table. Although <filename>postgres_fdw</>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3a753a0..ac13d32 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2998,9 +2998,16 @@ CommandComplete (B)
<literal>INSERT <replaceable>oid</replaceable>
<replaceable>rows</replaceable></literal>, where
<replaceable>rows</replaceable> is the number of rows
- inserted. <replaceable>oid</replaceable> is the object ID
- of the inserted row if <replaceable>rows</replaceable> is 1
- and the target table has OIDs;
+ inserted. However, if and only if <literal>ON CONFLICT
+ UPDATE</> is specified, then the tag is <literal>UPSERT
+ <replaceable>oid</replaceable>
+ <replaceable>rows</replaceable></literal>, where
+ <replaceable>rows</replaceable> is the number of rows inserted
+ <emphasis>or updated</emphasis>.
+ <replaceable>oid</replaceable> is the object ID of the
+ inserted row if <replaceable>rows</replaceable> is 1 and the
+ target table has OIDs, and (for the <literal>UPSERT</literal>
+ tag), the row was actually inserted rather than updated;
otherwise <replaceable>oid</replaceable> is 0.
</para>
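The command tag rule in the hunk above can be restated as a small function. This is a sketch of the documented rule only, not server code; `command_tag` and its parameters are hypothetical names, and `row_oid` is assumed to be 0 whenever the target table has no OIDs.

```python
def command_tag(on_conflict_update: bool, inserted: int,
                updated: int = 0, row_oid: int = 0) -> str:
    """Build the CommandComplete tag as described in the documentation text.

    row_oid is the OID of the single inserted row, or 0 when the target
    table has no OIDs (an assumption of this sketch).
    """
    if on_conflict_update:
        # UPSERT counts rows inserted *or updated*.
        name, rows = "UPSERT", inserted + updated
    else:
        # Plain INSERT (including ON CONFLICT IGNORE) counts inserted rows.
        name, rows = "INSERT", inserted

    # The OID is reported only for exactly one row that was actually inserted.
    oid = row_oid if (rows == 1 and inserted == 1) else 0
    return f"{name} {oid} {rows}"
```

For example, a single-row statement that took the update path yields `UPSERT 0 1` under this rule, even on a table with OIDs.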
diff --git a/doc/src/sgml/ref/alter_policy.sgml b/doc/src/sgml/ref/alter_policy.sgml
index 6d03db5..65cd85c 100644
--- a/doc/src/sgml/ref/alter_policy.sgml
+++ b/doc/src/sgml/ref/alter_policy.sgml
@@ -93,8 +93,11 @@ ALTER POLICY <replaceable class="parameter">name</replaceable> ON <replaceable c
The USING expression for the policy. This expression will be added as a
security-barrier qualification to queries which use the table
automatically. If multiple policies are being applied for a given
- table then they are all combined and added using OR. The USING
- expression applies to records which are being retrieved from the table.
+ table then they are all combined and added using OR (except as noted in
+ the <xref linkend="sql-createpolicy"> documentation for
+ <command>INSERT</command> with <literal> ON CONFLICT UPDATE</literal>).
+ The USING expression applies to records which are being retrieved from the
+ table.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 868a6c1..f17192e 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -70,11 +70,12 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
Policies can be applied for specific commands or for specific roles. The
default for newly created policies is that they apply for all commands and
roles, unless otherwise specified. If multiple policies apply to a given
- query, they will be combined using OR. Further, for commands which can have
- both USING and WITH CHECK policies (ALL and UPDATE), if no WITH CHECK policy
- is defined then the USING policy will be used for both what rows are visible
- (normal USING case) and which rows will be allowed to be added (WITH CHECK
- case).
+ query, they will be combined using OR (except as noted for
+ <command>INSERT</command> with <literal> ON CONFLICT UPDATE</literal>).
+ Further, for commands which can have both USING and WITH CHECK policies (ALL
+ and UPDATE), if no WITH CHECK policy is defined then the USING policy will
+ be used for both what rows are visible (normal USING case) and which rows
+ will be allowed to be added (WITH CHECK case).
</para>
<para>
@@ -255,6 +256,19 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
as it only ever applies in cases where records are being added to the
relation.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>INSERT</literal> policy WITH
+ CHECK expression also passes for both any existing tuple in the target
+ table that necessitates that the <literal>UPDATE</literal> path be
+ taken, and the final tuple added back into the relation.
+ <literal>INSERT</literal> policies are separately combined using
+ <literal>OR</literal>, and this distinct set of policy expressions must
+ always pass, regardless of whether any or all <literal>UPDATE</literal>
+ policies also pass (in the same tuple check). However, successfully
+ inserted tuples are not subject to <literal>UPDATE</literal> policy
+ enforcement.
+ </para>
</listitem>
</varlistentry>
@@ -263,7 +277,9 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
<listitem>
<para>
Using <literal>UPDATE</literal> for a policy means that it will apply
- to <literal>UPDATE</literal> commands. As <literal>UPDATE</literal>
+ to <literal>UPDATE</literal> commands (or auxiliary <literal>ON
+ CONFLICT UPDATE</literal> clauses of <literal>INSERT</literal>
+ commands). As <literal>UPDATE</literal>
involves pulling an existing record and then making changes to some
portion (but possibly not all) of the record, the
<literal>UPDATE</literal> policy accepts both a USING expression and
@@ -279,6 +295,15 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
used for both <literal>USING</literal> and
<literal>WITH CHECK</literal> cases.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>UPDATE</literal> policy
+ USING expression always be treated as a WITH CHECK
+ expression. This <literal>UPDATE</literal> policy must
+ always pass, regardless of whether any
+ <literal>INSERT</literal> policy also passes in the same
+ tuple check.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml
index 677766a..34a4ae1 100644
--- a/doc/src/sgml/ref/create_rule.sgml
+++ b/doc/src/sgml/ref/create_rule.sgml
@@ -136,7 +136,12 @@ CREATE [ OR REPLACE ] RULE <replaceable class="parameter">name</replaceable> AS
<para>
The event is one of <literal>SELECT</literal>,
<literal>INSERT</literal>, <literal>UPDATE</literal>, or
- <literal>DELETE</literal>.
+ <literal>DELETE</literal>. Note that an
+ <command>INSERT</command> containing an <literal>ON
+ CONFLICT</literal> clause cannot be used on tables that have
+ either <literal>INSERT</literal> or <literal>UPDATE</literal>
+ rules. Consider using an updatable view instead, which has
+ limited support for <literal>ON CONFLICT IGNORE</literal> only.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 299cce8..a9c1124 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -708,7 +708,10 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<literal>EXCLUDE</>, and
<literal>REFERENCES</> (foreign key) constraints accept this
clause. <literal>NOT NULL</> and <literal>CHECK</> constraints are not
- deferrable.
+ deferrable. Note that constraints that were created with this
+ clause cannot be used as arbiters of whether or not to take the
+ alternative path with an <command>INSERT</command> statement
+ that includes an <literal>ON CONFLICT UPDATE</> clause.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index aae0b41..1b75b1a 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -76,7 +76,10 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
executes once for any given operation, regardless of how many rows
it modifies (in particular, an operation that modifies zero rows
will still result in the execution of any applicable <literal>FOR
- EACH STATEMENT</literal> triggers).
+ EACH STATEMENT</literal> triggers). Note that since
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is considered an <command>INSERT</command> statement, no
+ <command>UPDATE</command> statement level trigger will be fired.
</para>
<para>
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 5dadab1..599c1cb 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -286,8 +286,9 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
<para>
Simple views are automatically updatable: the system will allow
<command>INSERT</>, <command>UPDATE</> and <command>DELETE</> statements
- to be used on the view in the same way as on a regular table. A view is
- automatically updatable if it satisfies all of the following conditions:
+ to be used on the view in the same way as on a regular table (aside from
+ the limitations on ON CONFLICT noted below). A view is automatically
+ updatable if it satisfies all of the following conditions:
<itemizedlist>
<listitem>
@@ -383,6 +384,34 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
not need any permissions on the underlying base relations (see
<xref linkend="rules-privileges">).
</para>
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT</> clause
+ is only supported on updatable views under specific circumstances.
+ If a set of columns/expressions has been provided with which to
+ infer a unique index to consider as the arbiter of whether the
+ statement ultimately takes an alternative path - if a would-be
+ duplicate violation in some particular unique index is tacitly
+ taken as provoking an alternative <command>UPDATE</command> or
+ <literal>IGNORE</> path - then updatable views are not supported.
+ Since this specification is already mandatory for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</>,
+ this implies that only the <literal>ON CONFLICT IGNORE</> variant
+ is supported, and only when there is no such specification. For
+ example:
+ </para>
+ <para>
+<programlisting>
+-- Unsupported:
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'foo') ON CONFLICT (key)
+ UPDATE SET val = EXCLUDED.val;
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'bar') ON CONFLICT (key)
+ IGNORE;
+
+-- Supported (note the omission of "key" column):
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'baz') ON CONFLICT
+ IGNORE;
+</programlisting>
+ </para>
</refsect2>
</refsect1>
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a3cccb9..a53b0bf 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -24,6 +24,14 @@ PostgreSQL documentation
[ WITH [ RECURSIVE ] <replaceable class="parameter">with_query</replaceable> [, ...] ]
INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) [, ...] | <replaceable class="PARAMETER">query</replaceable> }
+ [ ON CONFLICT [ ( { <replaceable class="parameter">column_name_index</replaceable> | ( <replaceable class="parameter">expression_index</replaceable> ) } [, ...] [ WHERE <replaceable class="PARAMETER">index_condition</replaceable> ] ) ]
+ { IGNORE | UPDATE
+ SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } |
+ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] )
+ } [, ...]
+ [ WHERE <replaceable class="PARAMETER">condition</replaceable> ]
+ }
+ ]
[ RETURNING * | <replaceable class="parameter">output_expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
</synopsis>
</refsynopsisdiv>
@@ -32,9 +40,15 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<title>Description</title>
<para>
- <command>INSERT</command> inserts new rows into a table.
- One can insert one or more rows specified by value expressions,
- or zero or more rows resulting from a query.
+ <command>INSERT</command> inserts new rows into a table. One can
+ insert one or more rows specified by value expressions, or zero or
+ more rows resulting from a query. An alternative path
+ (<literal>IGNORE</literal> or <literal>UPDATE</literal>) can
+ optionally be specified, to be taken in the event of detecting that
+ proceeding with insertion would result in a conflict (i.e. a
+ conflicting tuple already exists). The alternative path is
+ considered individually for each row proposed for insertion, and is
+ taken (or not taken) once per row.
</para>
<para>
@@ -59,25 +73,216 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</para>
<para>
+ The optional <literal>ON CONFLICT</> clause specifies a path to
+ take as an alternative to raising a conflict related error.
+ <literal>ON CONFLICT IGNORE</> simply avoids inserting any
+ individual row when it is determined that a conflict related error
+ would otherwise need to be raised. <literal>ON CONFLICT UPDATE</>
+ has the system take an <command>UPDATE</command> path in respect of
+ such rows instead. <literal>ON CONFLICT UPDATE</> guarantees an
+ atomic <command>INSERT</command> or <command>UPDATE</command>
+ outcome - provided there is no incidental error, one of those two
+ outcomes is guaranteed, even under high concurrency.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> optionally accepts a
+ <literal>WHERE</> clause <replaceable>condition</>. When provided,
+ the statement only proceeds with updating if the
+ <replaceable>condition</> is satisfied. Otherwise, unlike a
+ conventional <command>UPDATE</command>, the row is still locked for
+ update. Note that the <replaceable>condition</> is evaluated last,
+ after a conflict has been identified as a candidate to update.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> is effectively an auxiliary query of
+ its parent <command>INSERT</command>. Two special aliases are
+ visible when <literal>ON CONFLICT UPDATE</> is specified -
+ <varname>TARGET</> and <varname>EXCLUDED</>. The first alias is a
+ standard, generic alias for the target relation, while the second
+ alias refers to rows originally proposed for insertion. Both
+ aliases can be used in the auxiliary query targetlist and
+ <literal>WHERE</> clause, while the <varname>TARGET</> alias can be
+ used anywhere within the entire statement (e.g., within the
+ <literal>RETURNING</> clause). This allows expressions (in
+ particular, assignments) to reference rows originally proposed for
+ insertion. Note that the effects of all per-row <literal>BEFORE
+ INSERT</> triggers are carried forward. This is particularly
+ useful for multi-insert <literal>ON CONFLICT UPDATE</> statements;
+ when inserting or updating multiple rows, constants or parameter
+ values need only appear once.
+ </para>
+
+ <para>
+ There are several restrictions on the <literal>ON CONFLICT
+ UPDATE</> clause that do not apply to <command>UPDATE</command>
+ statements. Subqueries may not appear in either the
+ <command>UPDATE</command> targetlist or its <literal>WHERE</>
+ clause (although simple multi-assignment expressions are
+ supported). <literal>WHERE CURRENT OF</> cannot be used. In
+ general, only columns in the target table, and excluded values
+ originally proposed for insertion may be referenced. Operators and
+ functions may be used freely, though.
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is a <quote>deterministic</quote> statement. This means
+ that the command will not be allowed to affect any single existing
+ row more than once; a cardinality violation error will be raised
+ when this situation arises. Rows proposed for insertion should not
+ duplicate each other in terms of attributes constrained by the
+ conflict-arbitrating unique index. Note that the ordinary rules
+ for unique indexes with regard to null apply analogously to whether
+ or not an arbitrating unique index indicates if the alternative
+ path should be taken. This means that when a null value appears in
+ any uniquely constrained tuple's attribute in an
+ <command>INSERT</command> statement with <literal>ON CONFLICT
+ UPDATE</literal>, rows proposed for insertion will never take the
+ alternative path (provided that a <literal>BEFORE ROW
+ INSERT</literal> trigger does not make null values non-null before
+ insertion); the statement will always insert, assuming there is no
+ unrelated error. Note that merely locking a row (by having it not
+ satisfy the <literal>WHERE</> clause <replaceable>condition</>)
+ does not count towards whether or not the row has been affected
+ multiple times (and whether or not a cardinality violation error is
+ raised). However, the implementation checks for cardinality
+ violations after locking the row, and before updating (or
+ considering updating). A cardinality violation may therefore be
+ raised for a row that would not otherwise have gone on to be
+ updated, but only if the existing row was already updated at
+ least once by the same <literal>ON CONFLICT UPDATE</literal>
+ command.
+ </para>
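The cardinality rule in the paragraph above can be modelled on a plain dictionary. This is a sketch under stated assumptions, not the patch's logic: `upsert_all`, `assign`, `where` and `CardinalityViolation` are hypothetical names, null keys are not modelled, and treating a row inserted earlier by the same command as already affected is an assumption of the sketch.

```python
class CardinalityViolation(Exception):
    """Raised when one command would affect the same row a second time."""


def upsert_all(table, proposed, assign, where=None):
    """Apply (key, excluded_value) pairs to `table` (a dict) in order.

    assign(target, excluded) computes the updated value; where(target,
    excluded), if given, plays the role of the ON CONFLICT UPDATE condition.
    """
    affected = set()  # keys inserted or updated by this command
    for key, excluded in proposed:
        if key not in table:
            table[key] = excluded          # ordinary insert path
            affected.add(key)              # assumption: counts as affected
            continue

        # Conflict: the row is locked first (not modelled). The cardinality
        # check runs after locking and before the condition is evaluated.
        if key in affected:
            raise CardinalityViolation(key)

        target = table[key]
        if where is None or where(target, excluded):
            table[key] = assign(target, excluded)
            affected.add(key)
        # Otherwise the row was merely locked, which does not count.
    return table
```

Note how a row that fails the condition twice raises nothing, while a row updated once raises on its second appearance even if the condition would then fail.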
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> requires a <emphasis>unique index
+ inference</emphasis> specification, which consists of one or more
+ <replaceable class="PARAMETER">column_name_index</replaceable>
+ columns and/or <replaceable
+ class="PARAMETER">expression_index</replaceable> expressions on
+ columns, appearing between parentheses. These are used to infer a
+ single unique index to limit pre-checking for conflicts to (if no
+ appropriate index is available, an error is raised). A subset of
+ the table to limit the check for conflicts to can optionally also
+ be specified using <replaceable
+ class="PARAMETER">index_condition</replaceable>. Note that any
+ available unique index need only cover at least that subset in
+ order to arbitrate taking the alternative path; it need not
+ match exactly, and so a non-partial unique index that otherwise
+ matches is applicable. <literal>ON CONFLICT IGNORE</> makes an
+ inference specification optional; omitting the specification
+ indicates a total indifference to where any conflict could occur,
+ which isn't always appropriate. At times, it may be desirable for
+ <literal>ON CONFLICT IGNORE</> to <emphasis>not</emphasis> suppress
+ a conflict related error associated with an index where that isn't
+ explicitly anticipated. Note that <literal>ON CONFLICT UPDATE</>
+ assignment may result in a uniqueness violation, just as with a
+ conventional <command>UPDATE</command>.
+ </para>
+
+ <para>
+ Columns and/or expressions appearing in a unique index inference
+ specification must match all the columns/expressions of some
+ existing unique index on <replaceable
+ class="PARAMETER">table_name</replaceable> - there can be no
+ columns/expressions from the unique index that do not appear in the
+ inference specification, nor can there be any columns/expressions
+ appearing in the inference specification that do not appear in the
+ unique index definition. However, the order of the
+ columns/expressions in the index definition, or whether or not the
+ index definition specified <literal>NULLS FIRST</> or
+ <literal>NULLS LAST</>, or the internal sort order of each column
+ (whether <literal>DESC</> or <literal>ASC</> were specified) are
+ all irrelevant. Deferred unique constraints are not supported as
+ arbiters of whether an alternative <literal>ON CONFLICT</> path
+ should be taken.
+ </para>
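The matching rule in the paragraph above amounts to set equality between the inference specification and an index's columns/expressions, ignoring order. A minimal sketch follows, assuming a hypothetical catalog representation (a list of dicts with `name`, `columns`, `unique` and `deferrable` keys); partial index predicates and the cost-based choice among several candidates are not modelled.

```python
def infer_arbiters(indexes, spec):
    """Return names of indexes that could arbitrate for the given spec.

    indexes: iterable of dicts with keys "name", "columns", "unique",
    "deferrable" (a made-up representation for illustration only).
    spec: the columns/expressions written in the inference specification.
    """
    wanted = frozenset(spec)
    matches = [
        ix["name"]
        for ix in indexes
        if ix["unique"]
        and not ix["deferrable"]                  # deferred constraints cannot arbitrate
        and frozenset(ix["columns"]) == wanted    # exact set match, order irrelevant
    ]
    if not matches:
        raise LookupError("no unique index matches the inference specification")
    return matches
```

Sort direction and NULLS FIRST/LAST do not appear in this representation at all, which reflects that they are irrelevant to the match.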
+
+ <para>
+ The definition of a conflict for the purposes of <literal>ON
+ CONFLICT</> is somewhat subtle, although the exact definition is
+ seldom of great interest. A conflict is either a unique violation
+ from a unique constraint (or unique index), or an exclusion
+ violation from an exclusion constraint. Only unique indexes can be
+ inferred with a unique index inference specification, which is
+ required for the <command>UPDATE</command> variant, so in effect
+ only unique constraints (and unique indexes) are supported by the
+ <command>UPDATE</command> variant. In contrast to the rules around
+ certain other SQL clauses, like the <literal>DISTINCT</literal>
+ clause, the definition of a duplicate (a conflict) is based on
+ whatever unique indexes happen to be defined on columns on the
+ table. This means that if a user-defined type has multiple sort
+ orders, and the "equals" operator of any of those available sort
+ orders happens to be inconsistent (which goes against an unenforced
+ convention of <productname>PostgreSQL</productname>), the exact
+ behavior depends on the choice of operator class when the unique
+ index was created initially, and not any other consideration such
+ as the default operator class for the type of each indexed column.
+ If there are multiple unique indexes available that seem like
+ equally suitable candidates, but with inconsistent definitions of
+ "equals", then the system chooses whatever it estimates to be the
+ cheapest one to use as an arbiter of taking the alternative
+ <command>UPDATE</command>/<literal>IGNORE</literal> path.
+ </para>
+
+ <para>
+ The optional <replaceable
+ class="PARAMETER">index_condition</replaceable> can be used to
+ allow the inference specification to infer that a partial unique
+ index can be used. Any unique index that otherwise satisfies the
+ inference specification, while also covering at least all the rows
+ in the table covered by <replaceable
+ class="PARAMETER">index_condition</replaceable> may be used. It is
+ recommended that the partial index predicate of the unique index
+ intended to be used as the arbiter of taking the alternative path
+ be matched exactly, but this is not required. Note that an error
+ will be raised if an arbiter unique index is chosen that does not
+ cover the tuple or tuples ultimately proposed for insertion.
+ However, an overly specific <replaceable
+ class="PARAMETER">index_condition</replaceable> does not imply that
+ arbitrating conflicts will be limited to the subset of rows covered
+ by the inferred unique index corresponding to <replaceable
+ class="PARAMETER">index_condition</replaceable>.
+ </para>
+
+ <para>
The optional <literal>RETURNING</> clause causes <command>INSERT</>
- to compute and return value(s) based on each row actually inserted.
+ to compute and return value(s) based on each row actually inserted
+ (or updated, if an <literal>ON CONFLICT UPDATE</> clause was used).
This is primarily useful for obtaining values that were supplied by
defaults, such as a serial sequence number. However, any expression
using the table's columns is allowed. The syntax of the
<literal>RETURNING</> list is identical to that of the output list
- of <command>SELECT</>.
+ of <command>SELECT</>. Only rows that were successfully inserted
+ or updated will be returned. If a row was locked but not updated
+ because an <literal>ON CONFLICT UPDATE</> <literal>WHERE</> clause
+ did not pass, the row will not be returned. Since
+ <literal>RETURNING</> is not part of the <command>UPDATE</>
+ auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
+ (<varname>TARGET</> and <varname>EXCLUDED</>) may not be
+ referenced; only the row as it exists after updating (or
+ inserting) is returned.
</para>
<para>
You must have <literal>INSERT</literal> privilege on a table in
- order to insert into it. If a column list is specified, you only
- need <literal>INSERT</literal> privilege on the listed columns.
- Use of the <literal>RETURNING</> clause requires <literal>SELECT</>
- privilege on all columns mentioned in <literal>RETURNING</>.
- If you use the <replaceable
- class="PARAMETER">query</replaceable> clause to insert rows from a
- query, you of course need to have <literal>SELECT</literal> privilege on
- any table or column used in the query.
+ order to insert into it, as well as <literal>UPDATE</literal>
+ privilege if and only if <literal>ON CONFLICT UPDATE</>
+ is specified. If a column list is specified, you only need
+ <literal>INSERT</literal> privilege on the listed columns.
+ Similarly, when <literal>ON CONFLICT UPDATE</> is specified, you
+ only need <literal>UPDATE</> privilege on the column(s) that are
+ listed to be updated, as well as <literal>SELECT</> privilege on any column
+ whose values are read in the <literal>ON CONFLICT UPDATE</>
+ expressions or <replaceable>condition</>. Use of the
+ <literal>RETURNING</> clause requires <literal>SELECT</> privilege
+ on all columns mentioned in <literal>RETURNING</>. If you use the
+ <replaceable class="PARAMETER">query</replaceable> clause to insert
+ rows from a query, you of course need to have
+ <literal>SELECT</literal> privilege on any table or column used in
+ the query.
</para>
</refsect1>
@@ -121,7 +326,54 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
The name of a column in the table named by <replaceable class="PARAMETER">table_name</replaceable>.
The column name can be qualified with a subfield name or array
subscript, if needed. (Inserting into only some fields of a
- composite column leaves the other fields null.)
+ composite column leaves the other fields null.) When
+ referencing a column with <literal>ON CONFLICT UPDATE</>, do not
+ include the table's name in the specification of a target
+ column. For example, <literal>INSERT ... ON CONFLICT UPDATE tab
+ SET TARGET.col = 1</> is invalid.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name_index</replaceable></term>
+ <listitem>
+ <para>
+ The name of a <replaceable
+ class="PARAMETER">table_name</replaceable> column (with several
+ columns potentially named). These are used to infer a
+ particular unique index defined on <replaceable
+ class="PARAMETER">table_name</replaceable>. This requires
+ <literal>ON CONFLICT UPDATE</> and <literal>ON CONFLICT
+ IGNORE</> to assume that all expected sources of uniqueness
+ violations originate within the columns/rows constrained by the
+ unique index. When this is omitted (which is forbidden with
+ the <literal>ON CONFLICT UPDATE</> variant), the system checks
+ for sources of uniqueness violations ahead of time in all unique
+ indexes. Otherwise, only a single specified unique index is
+ checked ahead of time, and uniqueness violation errors can
+ appear for conflicts originating in any other unique index. If
+ a unique index cannot be inferred, an error is raised.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">expression_index</replaceable></term>
+ <listitem>
+ <para>
+ Equivalent to <replaceable
+ class="PARAMETER">column_name_index</replaceable>, but used to
+ infer a particular expressional index instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">index_condition</replaceable></term>
+ <listitem>
+ <para>
+ Used to allow inference of partial unique indexes.
</para>
</listitem>
</varlistentry>
@@ -167,12 +419,25 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</varlistentry>
<varlistentry>
+ <term><replaceable class="PARAMETER">condition</replaceable></term>
+ <listitem>
+ <para>
+ An expression that returns a value of type <type>boolean</type>.
+ Only rows for which this expression returns <literal>true</>
+ will be updated, although all rows will be locked when the
+ <literal>ON CONFLICT UPDATE</> path is taken.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+
<term><replaceable class="PARAMETER">output_expression</replaceable></term>
<listitem>
<para>
An expression to be computed and returned by the <command>INSERT</>
- command after each row is inserted. The expression can use any
- column names of the table named by <replaceable class="PARAMETER">table_name</replaceable>.
+ command after each row is inserted (or updated). The
+ expression can use any column names of the table named by
+ <replaceable class="PARAMETER">table_name</replaceable>.
Write <literal>*</> to return all columns of the inserted row(s).
</para>
</listitem>
@@ -198,20 +463,29 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<screen>
INSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
</screen>
+ However, in the event of an <literal>ON CONFLICT UPDATE</> clause
+ (but <emphasis>not</emphasis> in the event of an <literal>ON
+ CONFLICT IGNORE</> clause), the command tag reports the number of
+ rows inserted or updated together, of the form
+<screen>
+UPSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
+</screen>
The <replaceable class="parameter">count</replaceable> is the number
of rows inserted. If <replaceable class="parameter">count</replaceable>
is exactly one, and the target table has OIDs, then
<replaceable class="parameter">oid</replaceable> is the
- <acronym>OID</acronym> assigned to the inserted row. Otherwise
- <replaceable class="parameter">oid</replaceable> is zero.
+ <acronym>OID</acronym>
+ assigned to the inserted row (but not if there is only a single
+ updated row). Otherwise <replaceable
+ class="parameter">oid</replaceable> is zero.
</para>
<para>
If the <command>INSERT</> command contains a <literal>RETURNING</>
clause, the result will be similar to that of a <command>SELECT</>
statement containing the columns and values defined in the
- <literal>RETURNING</> list, computed over the row(s) inserted by the
- command.
+ <literal>RETURNING</> list, computed over the row(s) inserted or
+ updated by the command.
</para>
</refsect1>
@@ -311,7 +585,63 @@ WITH upd AS (
RETURNING *
)
INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
-</programlisting></para>
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Assumes a unique
+ index has been defined that constrains values appearing in the
+ <literal>did</literal> column. Note that an <varname>EXCLUDED</>
+ expression is used to reference values originally proposed for
+ insertion:
+<programlisting>
+ INSERT INTO distributors (did, dname)
+ VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc')
+ ON CONFLICT (did) UPDATE SET dname = EXCLUDED.dname
+</programlisting>
+ </para>
+ <para>
+ Insert a distributor, or do nothing for rows proposed for insertion
+ when an existing, excluded row (a row with a matching constrained
+ column or columns after before-row insert triggers fire) exists.
+ Example assumes a unique index has been defined that constrains
+ values appearing in the <literal>did</literal> column (although
+ since the <literal>IGNORE</> variant was used, the specification of
+ columns to infer a unique index from is not mandatory):
+<programlisting>
+ INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH')
+ ON CONFLICT (did) IGNORE
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Example assumes
+ a unique index has been defined that constrains values appearing in
+ the <literal>did</literal> column. <literal>WHERE</> clause is
+ used to limit the rows actually updated (any existing row not
+ updated will still be locked, though):
+<programlisting>
+ -- Don't update existing distributors based in a certain ZIP code
+ INSERT INTO distributors (did, dname) VALUES (8, 'Anvil Distribution')
+ ON CONFLICT (did) UPDATE
+ SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'
+ WHERE TARGET.zipcode != '21201'
+</programlisting>
+ </para>
+ <para>
+ Insert new distributor if possible; otherwise
+ <literal>IGNORE</literal>. Example assumes a unique index has been
+ defined that constrains values appearing in the
+ <literal>did</literal> column on a subset of rows where the
+ <literal>is_active</literal> boolean column evaluates to
+ <literal>true</literal>:
+<programlisting>
+ -- This statement could infer a partial unique index on did
+ -- with a predicate of WHERE is_active, but it could also
+ -- just use a regular unique constraint on did if that was
+ -- all that was available.
+ INSERT INTO distributors (did, dname) VALUES (9, 'Antwerp Design')
+ ON CONFLICT (did WHERE is_active) IGNORE
+</programlisting>
+ </para>
</refsect1>
<refsect1>
@@ -321,7 +651,8 @@ INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
<command>INSERT</command> conforms to the SQL standard, except that
the <literal>RETURNING</> clause is a
<productname>PostgreSQL</productname> extension, as is the ability
- to use <literal>WITH</> with <command>INSERT</>.
+ to use <literal>WITH</> with <command>INSERT</>, and the ability to
+ specify an alternative path with <literal>ON CONFLICT</>.
Also, the case in
which a column name list is omitted, but not all the columns are
filled from the <literal>VALUES</> clause or <replaceable>query</>,
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
index 7c31871..1e0a2f8 100644
--- a/doc/src/sgml/ref/set_constraints.sgml
+++ b/doc/src/sgml/ref/set_constraints.sgml
@@ -69,7 +69,11 @@ SET CONSTRAINTS { ALL | <replaceable class="parameter">name</replaceable> [, ...
<para>
Currently, only <literal>UNIQUE</>, <literal>PRIMARY KEY</>,
<literal>REFERENCES</> (foreign key), and <literal>EXCLUDE</>
- constraints are affected by this setting.
+ constraints are affected by this setting. Note that constraints
+ that were created with this clause cannot be used as arbiters of
+ whether or not to take the alternative path with an
+ <command>INSERT</command> statement that includes an <literal>ON
+ CONFLICT UPDATE</> clause.
<literal>NOT NULL</> and <literal>CHECK</> constraints are
always checked immediately when a row is inserted or modified
(<emphasis>not</> at the end of the statement).
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f94aea1..5141690 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -40,14 +40,17 @@
On tables and foreign tables, triggers can be defined to execute either
before or after any <command>INSERT</command>, <command>UPDATE</command>,
or <command>DELETE</command> operation, either once per modified row,
- or once per <acronym>SQL</acronym> statement.
- <command>UPDATE</command> triggers can moreover be set to fire only if
- certain columns are mentioned in the <literal>SET</literal> clause of the
- <command>UPDATE</command> statement.
- Triggers can also fire for <command>TRUNCATE</command> statements.
- If a trigger event occurs, the trigger's function is called at the
- appropriate time to handle the event. Foreign tables do not support the
- TRUNCATE statement at all.
+ or once per <acronym>SQL</acronym> statement. If an
+ <command>INSERT</command> contains an <literal>ON CONFLICT UPDATE</>
+ clause, it is possible that the effects of a BEFORE insert trigger and
+ a BEFORE update trigger can both be applied twice, if a reference to
+ an <varname>EXCLUDED</> column appears. <command>UPDATE</command>
+ triggers can moreover be set to fire only if certain columns are
+ mentioned in the <literal>SET</literal> clause of the
+ <command>UPDATE</command> statement. Triggers can also fire for
+ <command>TRUNCATE</command> statements. If a trigger event occurs,
+ the trigger's function is called at the appropriate time to handle the
+ event. Foreign tables do not support the TRUNCATE statement at all.
</para>
<para>
@@ -119,6 +122,36 @@
</para>
<para>
+ If an <command>INSERT</command> contains an <literal>ON CONFLICT
+ UPDATE</> clause, it is possible that the effects of all row-level
+ <literal>BEFORE</> <command>INSERT</command> triggers and all
+ row-level BEFORE <command>UPDATE</command> triggers can both be
+ applied in a way that is apparent from the final state of the updated
+ row, if an <varname>EXCLUDED</> column is referenced. There need not
+ be an <varname>EXCLUDED</> column reference for both sets of BEFORE
+ row-level triggers to execute, though. The possibility of surprising
+ outcomes should be considered when there are both <literal>BEFORE</>
+ <command>INSERT</command> and <literal>BEFORE</>
+ <command>UPDATE</command> row-level triggers that both affect a row
+ being inserted/updated (this can still be problematic even when the
+ modifications are more or less equivalent, if they are not also
+ idempotent). Note that statement-level <command>UPDATE</command>
+ triggers are executed when <literal>ON CONFLICT UPDATE</> is
+ specified, regardless of whether or not any rows were affected by
+ the <command>UPDATE</command>. An <command>INSERT</command> with
+ an <literal>ON CONFLICT UPDATE</> clause will execute
+ statement-level <literal>BEFORE</> <command>INSERT</command>
+ triggers first, then statement-level <literal>BEFORE</>
+ <command>UPDATE</command> triggers, followed by statement-level
+ <literal>AFTER</> <command>UPDATE</command> triggers and finally
+ statement-level <literal>AFTER</> <command>INSERT</command>
+ triggers. <literal>ON CONFLICT UPDATE</> is not supported on
+ views (only <literal>ON CONFLICT IGNORE</> is supported on
+ updatable views); therefore, unpredictable interactions with
+ <literal>INSTEAD OF</> triggers are not possible.
+ </para>
+
+ <para>
Trigger functions invoked by per-statement triggers should always
return <symbol>NULL</symbol>. Trigger functions invoked by per-row
triggers can return a table row (a value of
--
1.9.1
On 2015-02-04 16:49:46 -0800, Peter Geoghegan wrote:
On Tue, Feb 2, 2015 at 01:06 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Generally the split into the individual commits doesn't seem to make
much sense to me.
I think trying to make that possible is a good idea in patches of this
size. It e.g. seems entirely possible to structure the patchset so that
the speculative lock infrastructure is added first and the rest
later. I've not thought more about how to split it up further, but I'm
pretty sure it's possible.
The commits individually (except the first) aren't
individually committable and aren't even meaningful. Splitting off the
internal docs, tests and such actually just seems to make reviewing
harder because you miss context. Splitting it so that individual pieces
are committable and reviewable makes sense, but... I have no problem
doing the user docs later. If you split off RLS support, you need to
throw an error before it's implemented.
I mostly agree. Basically, I did not intend for all of the patches to
be individually committed. The mechanism by which EXCLUDED.*
expressions are added is somewhat novel, and deserves to be
independently *considered*. I'm trying to show how the parts fit
together more so than breaking things down into smaller commits (as
you picked up on, 0001 is the exception - that is genuinely intended
to be committed early). Also, those commit messages give me the
opportunity to put those parts in their appropriate context vis-a-vis
our discussions. They refer to the Wiki, for example, or reasons why
pg_stat_statements shouldn't care about ExcludedExpr. Obviously the
final commit messages won't look that way.
FWIW, I don't think anything here really should refer to the wiki...
0002:
* Tentatively I'd say that killspeculative should be done via a separate
function instead of heap_delete()
Really? I guess if that were to happen, it would entail refactoring
heap_delete() to call a static function, which was also called by a
new kill_speculative() function that does this. Otherwise, you'd have
far too much duplication.
I don't really think there actually is that much in common between
those. It looks to me that most of the code in heap_delete isn't
actually relevant for this case and could be cut short. My guess is that
only the WAL logging would be separated out.
* I doubt logical decoding works with the patch as it stands.
I thought so. Perhaps you could suggest a better use of the available
XLOG_HEAP_* bits. I knew I needed to consider that more carefully
(hence the XXX comment), but didn't get around to it.
I think you probably need to add test macros that make sure only the
individual bits are set, and not the combination, and then only use those.
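One possible reading of that suggestion, as a minimal sketch. The flag names and bit values here are hypothetical stand-ins chosen for illustration; they are not the patch's actual XLOG_HEAP_* definitions. The idea is that callers never test a raw bit directly, only a macro that also rejects the combination:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only (not real XLOG_HEAP_* values). */
#define DEMO_FLAG_MULTI_INSERT_LAST   (1 << 0)
#define DEMO_FLAG_SPECULATIVE_KILLED  (1 << 1)

/* The two bits above are assumed to be mutually exclusive. */
#define DEMO_FLAG_EXCLUSIVE_MASK \
    (DEMO_FLAG_MULTI_INSERT_LAST | DEMO_FLAG_SPECULATIVE_KILLED)

/* True only if exactly `bit` is set within the mutually exclusive group. */
static inline bool
demo_flag_is_only(uint8_t flags, uint8_t bit)
{
    return (flags & DEMO_FLAG_EXCLUSIVE_MASK) == bit;
}

/* Test macros: the only sanctioned way to inspect these bits. */
#define DemoIsSpeculativeKill(flags) \
    demo_flag_is_only((flags), DEMO_FLAG_SPECULATIVE_KILLED)
#define DemoIsLastMultiInsert(flags) \
    demo_flag_is_only((flags), DEMO_FLAG_MULTI_INSERT_LAST)
```

Bits outside the exclusive group are ignored by the test, so unrelated flags can still be set freely alongside either one.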
* If an arbiter index is passed to ExecCheckIndexConstraints(), can't we
abort the loop after checking it? Also, do we really have to iterate
over indexes for that case? How about moving the loop contents to a
separate function and using that separately for the arbiter cases?
Well, the failure to do that implies very few extra cycles, but sure.
It's not that much about the CPU cycles, but also about the mental
ones. If you have to think what happens if there's more than one
match...
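To make the control flow being asked for concrete, here is a rough standalone sketch. Everything in it (the Demo* names, the callback type, the nchecked counter) is hypothetical and only mimics the shape of the loop; it is not the patch's ExecCheckIndexConstraints():

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int DemoOid;
#define DEMO_INVALID_OID 0u

/* Hypothetical per-index check: returns true when no conflict is found. */
typedef bool (*DemoIndexCheck)(DemoOid index_oid);

/* Example checks used for demonstration. */
static bool demo_never_conflicts(DemoOid oid) { (void) oid; return true; }
static bool demo_conflicts_on_20(DemoOid oid) { return oid != 20; }

/*
 * Check unique indexes for conflicts. If arbiter is valid, only that
 * index is consulted and the loop stops right after it; otherwise every
 * index is checked. *nchecked reports how many indexes were consulted.
 */
static bool
demo_check_index_constraints(const DemoOid *indexes, size_t nindexes,
                             DemoOid arbiter, DemoIndexCheck check,
                             size_t *nchecked)
{
    *nchecked = 0;
    for (size_t i = 0; i < nindexes; i++)
    {
        if (arbiter != DEMO_INVALID_OID && indexes[i] != arbiter)
            continue;           /* not the arbiter: skip it */
        (*nchecked)++;
        if (!check(indexes[i]))
            return false;       /* conflict found */
        if (arbiter != DEMO_INVALID_OID)
            break;              /* arbiter handled: nothing left to do */
    }
    return true;
}
```

With the early break there is never a "more than one match" case to reason about when an arbiter is given.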
* ItemPointerIsValid
What about it?
Uh. Oh. No idea. I wrote this pretty late at night ;)
* /*
* This may occur when an instantaneously invisible tuple is blamed
* as a conflict because multiple rows are inserted with the same
* constrained values.
How can this happen? We don't insert multiple rows with the same
command id?
This is a cardinality violation [1]. It can definitely happen - just
try the examples you see on the Wiki.
I don't care about the wiki from the point of code comments. This needs
to be understandable in five years time.
* Perhaps it has previously been discussed but I'm not convinced by the
reasoning for not looking at opclasses in infer_unique_index(). This
seems like it'd prohibit ever having e.g. case insensitive opclasses -
something surely worthwhile.
I don't think anyone gave that idea the thumbs-up. However, I really
don't see the problem. Sure, we could have case insensitive opclasses
in the future, and you may want to make a unique index using one.
Then the problem suddenly becomes that previous choices of
indexes/statements aren't possible anymore. It seems much better to
introduce the syntax now and not have too much of a usecase for
it.
* The whole speculative insert logic isn't really well documented. Why,
for example, do we actually need the token? And why are there no
issues with overflow? And where is it documented that a 0 means
there's no token? ...
Fair enough. Presumably it's okay that overflow theoretically could
occur, because a race is all but impossible. The token represents a
particular attempt by some backend at inserting a tuple, that needs to
be waited on specifically only if it is their active attempt (and the
xact is still running). Otherwise, you get unprincipled deadlocks.
Even if by some incredible set of circumstances it wraps around, worst
case scenario you get an unprincipled deadlock, which is hardly the end
of the world given the immense number of insertions required, and the
immense unlikelihood that things would work out that way - it'd be
basically impossible.
I'll document the "0" thing.
It's really not about me understanding it right now, but about longer term.
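For the longer term, the token scheme described above could be written down roughly like this. It is only a sketch under stated assumptions: all names are hypothetical, 0 is assumed to be reserved for "no token", and the counter is assumed to skip 0 when it wraps around; the patch's actual code may differ on each point:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: token value 0 is reserved to mean "no speculative insertion". */
#define DEMO_NO_TOKEN 0u

typedef struct DemoBackend
{
    uint32_t token_counter;   /* last token handed out by this backend */
    uint32_t active_token;    /* token of the in-progress attempt, or 0 */
} DemoBackend;

/* Start a new insertion attempt and return its token (never 0). */
static uint32_t
demo_begin_speculative_insert(DemoBackend *b)
{
    b->token_counter++;                 /* unsigned: wraps to 0 after max */
    if (b->token_counter == DEMO_NO_TOKEN)
        b->token_counter = 1;           /* skip the reserved value */
    b->active_token = b->token_counter;
    return b->active_token;
}

/* Finish (confirm or kill) the current attempt. */
static void
demo_end_speculative_insert(DemoBackend *b)
{
    b->active_token = DEMO_NO_TOKEN;
}

/*
 * A waiter that saw `seen_token` on a tuple only needs to wait on that
 * specific attempt, and only while it is still the inserter's active one.
 */
static bool
demo_must_wait_on(const DemoBackend *inserter, uint32_t seen_token)
{
    return seen_token != DEMO_NO_TOKEN &&
           inserter->active_token == seen_token;
}
```

Waiting on the specific attempt rather than on the whole transaction is what keeps a later, unrelated attempt by the same backend from being mistaken for the one that caused the conflict.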
* /* XXX: Make sure that re-use of bits is safe here */ - no, not
unless you change existing checks.
I think that this is a restatement of your remarks on logical
decoding. No?
Yea. By here it was even later :P.
Can you add a UPSERT test for logical decoding? I doubt it'll work right
now, even in the repost.
* /*
* Immediately VACUUM "super-deleted" tuples
*/
if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))
return HEAPTUPLE_DEAD;
Is that branch really needed? Shouldn't it just be happening as a
consequence of the already existing code? Same in SatisfiesMVCC. If
you actually needed that block, it'd need to be done in SatisfiesSelf
as well, no? You have a comment about a possible loop - but that seems
wrong to me, implying that HEAP_XMIN_COMMITTED was set invalidly.
Indeed, this code is kind of odd. While I think the omission within
SatisfiesSelf() may be problematic too, if you really want to know why
this code is needed, uncomment it and run Jeff's stress-test. It will
reliably break.
Again. I don't care about running some strange tool when reading code
comments.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Feb 10, 2015 at 12:04 AM, Andres Freund <andres@2ndquadrant.com> wrote:
FWIW, I don't think anything here really should refer to the wiki...
The Wiki pages have done a good job of summarizing things...it
certainly didn't seem that hard for you to get up to speed here. Also,
as you'll know from working on logical decoding, it's hard to know
what is complex and esoteric and what is relatively accessible when
you're this close to a big project. I recall that you said as much
before. I'm focused on signposting so that reviewers can follow what
each patch does with the minimum of effort (with reference to the Wiki
or whatever). I see no reason to not do things that way until
commit...it feels like there is less chance of the concepts going over
people's head this way.
I don't really think there actually is that much in common between
those. It looks to me that most of the code in heap_delete isn't
actually relevant for this case and could be cut short. My guess is that
only the WAL logging would be separated out.
I'll think about that some more.
* I doubt logical decoding works with the patch as it stands.
I thought so. Perhaps you could suggest a better use of the available
XLOG_HEAP_* bits. I knew I needed to consider that more carefully
(hence the XXX comment), but didn't get around to it.
I think you probably need to add test macros that make sure only the
individual bits are set, and not the combination, and then only use those.
This too.
* /*
* This may occur when an instantaneously invisible tuple is blamed
* as a conflict because multiple rows are inserted with the same
* constrained values.
How can this happen? We don't insert multiple rows with the same
command id?
This is a cardinality violation [1]. It can definitely happen - just
try the examples you see on the Wiki.
I don't care about the wiki from the point of code comments. This needs
to be understandable in five years time.
That wasn't clear before - you seemed to me to be questioning if this
was even possible. There is a section in the INSERT SQL reference page
about cardinality violations, so we're certainly talking about
something that a code reader likely heard of. Also, the nitty gritty
showing various scenarios on the Wiki is the quickest way to know what
is possible (but is much too long winded for user visible
documentation or code comments).
* Perhaps it has previously been discussed but I'm not convinced by the
reasoning for not looking at opclasses in infer_unique_index(). This
seems like it'd prohibit ever having e.g. case insensitive opclasses -
something surely worthwhile.
I don't think anyone gave that idea the thumbs-up. However, I really
don't see the problem. Sure, we could have case insensitive opclasses
in the future, and you may want to make a unique index using one.
Then the problem suddenly becomes that previous choices of
indexes/statements aren't possible anymore. It seems much better to
introduce the syntax now and not have too much of a usecase for
it.
The only way the lack of a way of specifying which opclass to use
could bite us is if there were two *actually* defined unique indexes
on the same column, each with different "equality" operators. That
seems like kind of a funny scenario, even if that were quite possible
(even if non-default opclasses existed that had non-identical
"equality" operators, which is basically not the case today).
I grant that it is a bit odd that we're talking about unique index
definitions affecting semantics, but that is to a certain extent the
nature of the beast. As a compromise, I suggest having the inference
specification optionally accept a named opclass per attribute, in the
style of CREATE INDEX (I'm already reusing a bit of the raw parser
support for CREATE INDEX, you see) - that'll make inference insist on
that opclass, rather than make it a strict matter of costing available
alternatives (not that any alternative is expected with idiomatic
usage). That implies no additional parser overhead, really. If that's
considered ugly, then at least it's an ugly thing that literally no
one will ever use in the foreseeable future...and an ugly thing that
is no more necessary in CREATE INDEX than here (and yet CREATE INDEX
lives with the ugliness).
It's really not about me understanding it right now, but about longer term.
Sure.
Can you add a UPSERT test for logical decoding? I doubt it'll work right
now, even in the repost.
Okay.
* /*
* Immediately VACUUM "super-deleted" tuples
*/
if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))
return HEAPTUPLE_DEAD;
Is that branch really needed? Shouldn't it just be happening as a
consequence of the already existing code? Same in SatisfiesMVCC. If
you actually needed that block, it'd need to be done in SatisfiesSelf
as well, no? You have a comment about a possible loop - but that seems
wrong to me, implying that HEAP_XMIN_COMMITTED was set invalidly.
Indeed, this code is kind of odd. While I think the omission within
SatisfiesSelf() may be problematic too, if you really want to know why
this code is needed, uncomment it and run Jeff's stress-test. It will
reliably break.
Again. I don't care about running some strange tool when reading code
comments.
Again, I thought you were skeptical about the very need for this code
(and not how it was presented). If that was the case, that tool would
provide you with a pretty quick and easy way of satisfying yourself
that it is needed. The actual reason that it is needed is that if it
isn't, then the system can see a "broken promise" tuple as spuriously
visible. This will cause Jeff's tool to spit out a bunch of errors due
to finding all-NULL values in these tuples. VACUUM could not reclaim
the tuples unless and until they stopped appearing visible for
VACUUM's purposes (which this particular code snippet relates to).
Maybe the comment could be improved, and maybe the code could be
improved, but the code is necessary as things stand.
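To spell out what that branch does, here is a stripped-down model. The types and names are hypothetical simplifications; the real HeapTupleSatisfiesVacuum() has many more cases, which are elided here. It shows only the short-circuit: a tuple whose raw xmin has been set to invalid (a "super-deleted" broken promise) is reported dead right away, before any other visibility logic gets a chance to treat it as live:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t DemoTransactionId;
#define DEMO_INVALID_XID ((DemoTransactionId) 0)

typedef enum
{
    DEMO_TUPLE_DEAD,    /* reclaimable by VACUUM immediately */
    DEMO_TUPLE_LIVE     /* stand-in for every other outcome */
} DemoVacuumVerdict;

typedef struct
{
    DemoTransactionId raw_xmin;   /* inserting xid as stored in the header */
} DemoTupleHeader;

static DemoVacuumVerdict
demo_satisfies_vacuum(const DemoTupleHeader *tup)
{
    /* Immediately treat "super-deleted" tuples as dead. */
    if (tup->raw_xmin == DEMO_INVALID_XID)
        return DEMO_TUPLE_DEAD;

    /* The remaining checks of the real function are elided in this sketch. */
    return DEMO_TUPLE_LIVE;
}
```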
--
Peter Geoghegan
On Tue, Feb 10, 2015 at 9:21 AM, Peter Geoghegan <pg@heroku.com> wrote:
* There was some squashing of commits, since Andres felt that they
weren't all useful as separate commits. I've still split out the RTE
permissions commit, as well as the RLS commit (plus the documentation
and test commits, FWIW). I hope that this will make it easier to
review parts of the patch, without being generally excessive.
When documentation and tests are left out, the entire patch series is left
at:
Patch moved to next CF.
--
Michael
On 2/9/15 6:21 PM, Peter Geoghegan wrote:
Thanks for taking a look at it. That's somewhat cleaned up in the
attached patchseries - V2.2.
In patch 1, "sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken."
Doesn't that warrant bumping catversion?
Patch 2
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
and
+if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
When doing this, should we also set the HEAP_XMIN_INVALID hint bit?
<reads more of patch>
Ok, I see we're not doing this because the check for a super deleted
tuple is already cheap. Probably worth mentioning that in the comment...
ExecInsert():
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.
I'm a bit confused as to why we're calling BEFORE triggers out here...
hasn't this always been true for both BEFORE *and* AFTER triggers? The
comment makes me wonder if there's some magic that happens with AFTER...
+spec != SPEC_NONE? HEAP_INSERT_SPECULATIVE:0
Non-standard formatting. Given the size of the patch and work already
put into it, I'd just leave it for the next formatting run; I only mention it
in case someone has some compelling reason not to.
ExecLockUpdateTuple():
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
I find this confusing... which row is being locked? The previously
inserted one? Perhaps a better wording would be "satisfied, update. Lock
the original row even if it doesn't satisfy estate's snapshot."
infer_unique_index():
+/*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+relation = heap_open(relationObjectId, NoLock);
Seems like there should be an Assert here. Also, the comment should
probably go before the heap_open call.
heapam_xlog.h:
+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */
I wish this would mention why it's safe to do this. Also, the comment
mentions xl_heap_delete when the #define is for
XLOG_HEAP_KILLED_SPECULATIVE_TUPLE; presumably that's wrong. Perhaps:
/*
* reuse XLOG_HEAP_LAST_MULTI_INSERT bit for
* XLOG_HEAP_KILLED_SPECULATIVE_TUPLE. This is safe because we never do
* multi-inserts for INSERT ON CONFLICT.
*/
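As a rough illustration of why sharing one bit between two flag names can be safe, here is a minimal sketch. Every DEMO_* name and bit position below is a hypothetical stand-in, not the definitions in heapam_xlog.h; the only assumption is that each meaning of the bit is ever interpreted for one record type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real values live in heapam_xlog.h. */
#define DEMO_ALL_VISIBLE_CLEARED      (1 << 0)
#define DEMO_LAST_MULTI_INSERT        (1 << 7)
/* Same bit, different record type: only meaningful on delete records. */
#define DEMO_KILLED_SPECULATIVE_TUPLE DEMO_LAST_MULTI_INSERT

typedef enum
{
    DEMO_REC_MULTI_INSERT,
    DEMO_REC_DELETE
} DemoRecordType;

typedef struct
{
    DemoRecordType type;
    uint8_t     flags;
} DemoRecord;

/* The shared bit means "last multi-insert" only on multi-insert records. */
static bool
demo_is_last_multi_insert(const DemoRecord *rec)
{
    return rec->type == DEMO_REC_MULTI_INSERT &&
        (rec->flags & DEMO_LAST_MULTI_INSERT) != 0;
}

/* The shared bit means "super-delete" only on delete records. */
static bool
demo_is_super_delete(const DemoRecord *rec)
{
    return rec->type == DEMO_REC_DELETE &&
        (rec->flags & DEMO_KILLED_SPECULATIVE_TUPLE) != 0;
}
```

Because the record type is checked before the bit is read, a delete record carrying the bit is never mistaken for the end of a multi-insert, which is the property the suggested comment would need to state.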
I'll review the remaining patches later.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
On 02/10/2015 02:21 AM, Peter Geoghegan wrote:
On Fri, Feb 6, 2015 at 1:51 PM, Bruce Momjian <bruce@momjian.us> wrote:
Other than the locking part, the biggest part of this patch is adjusting
things so that an INSERT can change into an UPDATE.
Thanks for taking a look at it. That's somewhat cleaned up in the
attached patchseries - V2.2. This has been rebased to repair the minor
bit-rot pointed out by Thom.
I don't really have the energy to review this patch in one batch, so I'd
really like to split this into two:
1. Solve the existing "problem" with exclusion constraints that if two
insertions happen concurrently, one of them might be aborted with
"deadlock detected" error, rather than "conflicting key violation"
error. That really wouldn't be worth fixing otherwise, but it happens to
be a useful stepping stone for this upsert patch.
2. All the rest.
I took a stab at extracting the parts needed to do 1. See attached. I
didn't modify ExecUpdate to use speculative insertions, so the issue
remains for UPDATEs, but that's easy to add.
I did not solve the potential for livelocks (discussed at
/messages/by-id/CAM3SWZTfTt_fehet3tU3YKCpCYPYnNaUqUZ3Q+NAASnH-60teA@mail.gmail.com).
The patch always super-deletes the already-inserted tuple on conflict,
and then waits for the other inserter. That would be nice to fix, using
one of the ideas from that thread, or something else.
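The "super-delete on conflict, then wait, then retry" behaviour can be sketched as a single-threaded toy. Everything here (DemoInsertState, demo_wait_for_other, demo_speculative_insert) is hypothetical and only mimics the control flow; it is not the executor code in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy state for one backend's insertion attempt. */
typedef struct
{
    int         conflicting_inserters;  /* others still holding a conflict */
    int         super_deletes;          /* how often we killed our own tuple */
    bool        tuple_present;          /* is our tuple in the "heap"? */
} DemoInsertState;

/* Pretend to wait for one conflicting inserter to finish. */
static void
demo_wait_for_other(DemoInsertState *st)
{
    st->conflicting_inserters--;
}

/*
 * Insert speculatively; on conflict, super-delete our own tuple first and
 * only then wait, so nobody ends up waiting on a tuple we may never keep.
 * Returns the number of insertion attempts.
 */
static int
demo_speculative_insert(DemoInsertState *st)
{
    int         attempts = 0;

    for (;;)
    {
        attempts++;
        st->tuple_present = true;       /* stand-in for heap_insert() */

        if (st->conflicting_inserters == 0)
            return attempts;            /* no conflict: keep the tuple */

        st->tuple_present = false;      /* stand-in for super-deletion */
        st->super_deletes++;
        demo_wait_for_other(st);        /* wait, then retry from the top */
    }
}
```

The toy always terminates because the other side never backs off. The livelock concern is the case where both inserters super-delete and retry at the same time, which this sketch deliberately does not model.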
We never really debated the options for how to do the speculative
insertion and super-deletion. This patch is still based on the quick &
dirty demo patches I posted about a year ago, in response to issues you
found with earlier versions. That might be OK - maybe I really hit the
nail on designing those things and got them right on first try - but
more likely there are better alternatives.
Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock
for every insertion? That seems bad for performance reasons. Also, are
we happy with adding the new fields to the proc array? Another
alternative that was discussed was storing the speculative insertion
token on the heap tuple itself. (See
/messages/by-id/52D00D2D.6030307@vmware.com)
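For reference, the token scheme being questioned can be sketched as follows. This is a simplified model with hypothetical names (DemoSpecLockTag, demo_spec_insertion_begin); it leaves out the lock manager and the proc array entirely, and the wraparound guard is an addition of this sketch rather than something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock tag: (xid, token) names one speculative insertion. */
typedef struct
{
    uint32_t    xid;
    uint32_t    token;
} DemoSpecLockTag;

/* Backend-local counter; zero is reserved for "no speculative insertion". */
static uint32_t demo_token_counter = 0;

/* Start a new speculative insertion and return the tag waiters would use. */
static DemoSpecLockTag
demo_spec_insertion_begin(uint32_t xid)
{
    DemoSpecLockTag tag;

    /* Skip zero on wraparound (an addition in this sketch). */
    if (++demo_token_counter == 0)
        demo_token_counter = 1;

    tag.xid = xid;
    tag.token = demo_token_counter;
    return tag;
}

/* Two tags refer to the same insertion only if xid and token both match. */
static bool
demo_same_insertion(DemoSpecLockTag a, DemoSpecLockTag b)
{
    return a.xid == b.xid && a.token == b.token;
}
```

The point of the token is that a waiter holding the tag for an earlier insertion does not block on a later insertion by the same transaction. Storing the token on the heap tuple instead would change where a waiter reads it from, not this matching rule.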
Are we happy with the way super-deletion works? Currently, the xmin
field is set to invalid to mark the tuple as super-deleted. That
required checks in HeapTupleSatisfies* functions. One alternative would
be to set xmax equal to xmin, and teach the code that currently calls
XactLockTableWait() to check if xmax=xmin, and not consider the tuple as
conflicting.
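The two ways of marking a super-deleted tuple can be compared in a small model. The struct and function names are hypothetical, and DEMO_INVALID_XID merely stands in for InvalidTransactionId; this is a sketch of the idea, not the HeapTupleSatisfies* logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INVALID_XID 0u     /* stand-in for InvalidTransactionId */

typedef struct
{
    uint32_t    xmin;           /* inserting transaction */
    uint32_t    xmax;           /* deleting transaction, or invalid */
} DemoTupleHeader;

/* Current approach: a super-deleted tuple has its xmin set to invalid. */
static bool
demo_super_deleted_by_xmin(const DemoTupleHeader *tup)
{
    return tup->xmin == DEMO_INVALID_XID;
}

/* Alternative: xmax equal to xmin marks the tuple as super-deleted. */
static bool
demo_super_deleted_by_xmax(const DemoTupleHeader *tup)
{
    return tup->xmin != DEMO_INVALID_XID && tup->xmax == tup->xmin;
}

/* Should a would-be inserter wait on this tuple's inserting transaction? */
static bool
demo_must_wait(const DemoTupleHeader *tup, bool inserter_in_progress)
{
    if (demo_super_deleted_by_xmin(tup) || demo_super_deleted_by_xmax(tup))
        return false;           /* dead to everyone, nothing to wait for */
    return inserter_in_progress;
}
```

Under the first marking every visibility routine has to test xmin, whereas under the second only the code deciding whether to wait needs the extra comparison, which is the trade-off being asked about.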
- Heikki
Attachments:
fix-exclusion-constraint-deadlocks-1.patch (application/x-patch)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 46060bc1..0aa3e575 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
*
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
* Note that these options will be applied when inserting into the heap's
* TOAST table, too, if the tuple requires any out-of-line data.
*
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
END_CRIT_SECTION();
+ /*
+ * Let others know that we speculatively inserted this tuple, before
+ * releasing the buffer lock.
+ */
+ if (options & HEAP_INSERT_SPECULATIVE)
+ SetSpeculativeInsertionTid(relation->rd_node, &heaptup->t_self);
+
UnlockReleaseBuffer(buffer);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we "super-delete" a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id. Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
*/
HTSU_Result
heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd)
+ HeapUpdateFailureData *hufd, bool killspeculative)
{
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ if (!killspeculative)
+ {
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ }
+ else
+ {
+ if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+ tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+ elog(ERROR, "attempted to super-delete a tuple from other CID");
+ result = HeapTupleMayBeUpdated;
+ }
+
if (result == HeapTupleInvisible)
{
@@ -2823,12 +2850,15 @@ l1:
* using our own TransactionId below, since some other backend could
* incorporate our XID into a MultiXact immediately afterwards.)
*/
- MultiXactIdSetOldestMember();
+ if (!killspeculative)
+ {
+ MultiXactIdSetOldestMember();
- compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
- tp.t_data->t_infomask, tp.t_data->t_infomask2,
- xid, LockTupleExclusive, true,
- &new_xmax, &new_infomask, &new_infomask2);
+ compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
+ tp.t_data->t_infomask, tp.t_data->t_infomask2,
+ xid, LockTupleExclusive, true,
+ &new_xmax, &new_infomask, &new_infomask2);
+ }
START_CRIT_SECTION();
@@ -2855,8 +2885,23 @@ l1:
tp.t_data->t_infomask |= new_infomask;
tp.t_data->t_infomask2 |= new_infomask2;
HeapTupleHeaderClearHotUpdated(tp.t_data);
- HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
- HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ /*
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
+ * instead of setting xmax, to make the tuple clearly invisible to
+ * everyone. In particular, we want HeapTupleSatisfiesDirty() to regard
+ * the tuple as dead, so that another backend inserting a duplicate key
+ * value won't unnecessarily wait for our transaction to finish.
+ */
+ if (!killspeculative)
+ {
+ HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
+ HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ }
+ else
+ {
+ HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ }
+
/* Make sure there is no forward chain link in t_ctid */
tp.t_data->t_ctid = tp.t_self;
@@ -2872,7 +2917,11 @@ l1:
if (RelationIsAccessibleInLogicalDecoding(relation))
log_heap_new_cid(relation, &tp);
- xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
+ xlrec.flags = 0;
+ if (all_visible_cleared)
+ xlrec.flags |= XLOG_HEAP_ALL_VISIBLE_CLEARED;
+ if (killspeculative)
+ xlrec.flags |= XLOG_HEAP_KILLED_SPECULATIVE_TUPLE;
xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask,
tp.t_data->t_infomask2);
xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
@@ -2977,7 +3026,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
result = heap_delete(relation, tid,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd, false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -4070,14 +4119,16 @@ get_mxact_status_for_lock(LockTupleMode mode, bool is_update)
*
* Function result may be:
* HeapTupleMayBeUpdated: lock was successfully acquired
+ * HeapTupleInvisible: lock failed because tuple instantaneously invisible
* HeapTupleSelfUpdated: lock failed because tuple updated by self
* HeapTupleUpdated: lock failed because tuple updated by other xact
* HeapTupleWouldBlock: lock couldn't be acquired and wait_policy is skip
*
- * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
- * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
- * (the last only for HeapTupleSelfUpdated, since we
- * cannot obtain cmax from a combocid generated by another transaction).
+ * In the failure cases other than HeapTupleInvisible, the routine fills
+ * *hufd with the tuple's t_ctid, t_xmax (resolving a possible MultiXact,
+ * if necessary), and t_cmax (the last only for HeapTupleSelfUpdated,
+ * since we cannot obtain cmax from a combocid generated by another
+ * transaction).
* See comments for struct HeapUpdateFailureData for additional info.
*
* See README.tuplock for a thorough explanation of this mechanism.
@@ -4115,8 +4166,15 @@ l3:
if (result == HeapTupleInvisible)
{
- UnlockReleaseBuffer(*buffer);
- elog(ERROR, "attempted to lock invisible tuple");
+ LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+
+ /*
+ * This is possible, but only when locking a tuple for speculative
+ * insertion. We return this value here rather than throwing an error
+ * in order to give that case the opportunity to throw a more specific
+ * error.
+ */
+ return HeapTupleInvisible;
}
else if (result == HeapTupleBeingUpdated)
{
@@ -7326,7 +7384,10 @@ heap_xlog_delete(XLogReaderState *record)
HeapTupleHeaderClearHotUpdated(htup);
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ else
+ HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 932c6f78..1a4e18d 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -51,7 +51,8 @@ static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
Relation heapRel, Buffer buf, OffsetNumber offset,
ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique);
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken);
static void _bt_findinsertloc(Relation rel,
Buffer *bufptr,
OffsetNumber *offsetptr,
@@ -159,17 +160,27 @@ top:
*/
if (checkUnique != UNIQUE_CHECK_NO)
{
- TransactionId xwait;
+ TransactionId xwait;
+ uint32 speculativeToken;
offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey,
- checkUnique, &is_unique);
+ checkUnique, &is_unique, &speculativeToken);
if (TransactionIdIsValid(xwait))
{
/* Have to wait for the other guy ... */
_bt_relbuf(rel, buf);
- XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+ /*
+ * If it's a speculative insertion, wait for it to finish (ie.
+ * to go ahead with the insertion, or kill the tuple). Otherwise
+ * wait for the transaction to finish as usual.
+ */
+ if (speculativeToken)
+ SpeculativeInsertionWait(xwait, speculativeToken);
+ else
+ XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+
/* start over... */
_bt_freestack(stack);
goto top;
@@ -211,9 +222,12 @@ top:
* also point to end-of-page, which means that the first tuple to check
* is the first tuple on the next page.
*
- * Returns InvalidTransactionId if there is no conflict, else an xact ID
- * we must wait for to see if it commits a conflicting tuple. If an actual
- * conflict is detected, no return --- just ereport().
+ * Returns InvalidTransactionId if there is no conflict, else an xact ID we
+ * must wait for to see if it commits a conflicting tuple. If an actual
+ * conflict is detected, no return --- just ereport(). If an xact ID is
+ * returned, and the conflicting tuple still has a speculative insertion in
+ * progress, *speculativeToken is set to non-zero, and the caller can wait for
+ * the verdict on the insertion using SpeculativeInsertionWait().
*
* However, if checkUnique == UNIQUE_CHECK_PARTIAL, we always return
* InvalidTransactionId because we don't want to wait. In this case we
@@ -223,7 +237,8 @@ top:
static TransactionId
_bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
Buffer buf, OffsetNumber offset, ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique)
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int natts = rel->rd_rel->relnatts;
@@ -340,6 +355,7 @@ _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
if (nbuf != InvalidBuffer)
_bt_relbuf(rel, nbuf);
/* Tell _bt_doinsert to wait... */
+ *speculativeToken = SnapshotDirty.speculativeToken;
return xwait;
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 33b172b..8d278b8 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1162,6 +1162,7 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
resultRelInfo->ri_NumIndices = 0;
resultRelInfo->ri_IndexRelationDescs = NULL;
resultRelInfo->ri_IndexRelationInfo = NULL;
+ resultRelInfo->ri_HasExclusionConstraints = false; /* set by ExecOpenIndices */
/* make a copy so as not to depend on relcache info not changing... */
resultRelInfo->ri_TrigDesc = CopyTriggerDesc(resultRelationDesc->trigdesc);
if (resultRelInfo->ri_TrigDesc)
@@ -2094,7 +2095,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* the latest version of the row was deleted, so we need do
* nothing. (Should be safe to examine xmin without getting
* buffer's content lock, since xmin never changes in an existing
- * tuple.)
+ * non-promise tuple, and there is no reason to lock a promise
+ * tuple until it is clear that it has been fulfilled.)
*/
if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple.t_data),
priorXmax))
@@ -2175,11 +2177,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* case, so as to avoid the "Halloween problem" of
* repeated update attempts. In the latter case it might
* be sensible to fetch the updated tuple instead, but
- * doing so would require changing heap_lock_tuple as well
- * as heap_update and heap_delete to not complain about
- * updating "invisible" tuples, which seems pretty scary.
- * So for now, treat the tuple as deleted and do not
- * process.
+ * doing so would require changing heap_update and
+ * heap_delete to not complain about updating "invisible"
+ * tuples, which seems pretty scary (heap_lock_tuple will
+ * not complain, but few callers expect HeapTupleInvisible,
+ * and we're not one of them). So for now, treat the tuple
+ * as deleted and do not process.
*/
ReleaseBuffer(buffer);
return NULL;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 022041b..838d2c6 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -44,11 +44,14 @@
#include "access/relscan.h"
#include "access/transam.h"
+#include "access/xact.h"
#include "catalog/index.h"
#include "executor/execdebug.h"
#include "nodes/nodeFuncs.h"
#include "parser/parsetree.h"
#include "storage/lmgr.h"
+#include "storage/procarray.h"
+#include "storage/proc.h"
#include "utils/memutils.h"
#include "utils/tqual.h"
@@ -938,6 +941,9 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo)
/* extract index key information from the index's pg_index info */
ii = BuildIndexInfo(indexDesc);
+ if (ii->ii_ExclusionOps != NULL)
+ resultRelInfo->ri_HasExclusionConstraints = true;
+
relationDescs[i] = indexDesc;
indexInfoArray[i] = ii;
i++;
@@ -990,7 +996,8 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
*
* This returns a list of index OIDs for any unique or exclusion
* constraints that are deferred and that had
- * potential (unconfirmed) conflicts.
+ * potential (unconfirmed) conflicts. (if noDupErr == true, the
+ * same is done for non-deferred constraints)
*
* CAUTION: this must not be called for a HOT update.
* We can't defend against that here for lack of info.
@@ -1158,6 +1165,8 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* newIndex: if true, we are trying to build a new index (this affects
* only the wording of error messages)
* errorOK: if true, don't throw error for violation
+ * wait: if true, wait for conflicting transaction to finish, even if !errorOK
+ * conflictTid: if not-NULL, the TID of conflicting tuple is returned here.
*
* Returns true if OK, false if actual or potential violation
*
@@ -1169,11 +1178,16 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*
* When errorOK is false, we'll throw error on violation, so a false result
* is impossible.
+ *
+ * If this is a speculative insertion (MyProc->specInsertTid is valid),
+ * before waiting on anyone else, kill our already-inserted tuple.
*/
bool
-check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
- ItemPointer tupleid, Datum *values, bool *isnull,
- EState *estate, bool newIndex, bool errorOK)
+check_exclusion_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo, ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate, bool newIndex,
+ bool errorOK)
{
Oid *constr_procs = indexInfo->ii_ExclusionProcs;
uint16 *constr_strats = indexInfo->ii_ExclusionStrats;
@@ -1307,10 +1321,28 @@ retry:
if (TransactionIdIsValid(xwait))
{
+ bool speculative = ItemPointerIsValid(&MyProc->specInsertTid);
ctid_wait = tup->t_data->t_ctid;
index_endscan(index_scan);
- XactLockTableWait(xwait, heap, &ctid_wait,
- XLTW_RecheckExclusionConstr);
+
+ if (speculative)
+ {
+ HeapUpdateFailureData hufd;
+
+ Assert(ItemPointerEquals(&MyProc->specInsertTid, tupleid));
+ heap_delete(heap, tupleid,
+ estate->es_output_cid, InvalidSnapshot, false,
+ &hufd, true);
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+ }
+
+ if (DirtySnapshot.speculativeToken)
+ SpeculativeInsertionWait(DirtySnapshot.xmin,
+ DirtySnapshot.speculativeToken);
+ else
+ XactLockTableWait(xwait, heap, &ctid_wait,
+ XLTW_RecheckExclusionConstr);
goto retry;
}
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 48107d9..4699060 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -151,10 +151,11 @@ lnext:
* case, so as to avoid the "Halloween problem" of repeated
* update attempts. In the latter case it might be sensible
* to fetch the updated tuple instead, but doing so would
- * require changing heap_lock_tuple as well as heap_update and
- * heap_delete to not complain about updating "invisible"
- * tuples, which seems pretty scary. So for now, treat the
- * tuple as deleted and do not process.
+ * require changing heap_update and heap_delete to not complain
+ * about updating "invisible" tuples, which seems pretty scary
+ * (heap_lock_tuple will not complain, but few callers expect
+ * HeapTupleInvisible, and we're not one of them). So for now,
+ * treat the tuple as deleted and do not process.
*/
goto lnext;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index f96fb24..c477d1d 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -46,6 +46,9 @@
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/proc.h"
+#include "storage/procarray.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
@@ -246,6 +249,8 @@ ExecInsert(TupleTableSlot *slot,
}
else
{
+ bool speculative = false;
+
/*
* Constraints might reference the tableoid column, so initialize
* t_tableOid before evaluating them.
@@ -258,6 +263,19 @@ ExecInsert(TupleTableSlot *slot,
if (resultRelationDesc->rd_att->constr)
ExecConstraints(resultRelInfo, slot, estate);
+ vlock:
+ if (resultRelInfo->ri_HasExclusionConstraints)
+ {
+ /*
+ * Before we start insertion proper, acquire our "promise tuple
+ * insertion lock". Others can use that (rather than an XID lock,
+ * which is appropriate only for non-promise tuples) to wait for
+ * us to decide if we're going to go ahead with the insertion.
+ */
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+ speculative = true;
+ }
+
/*
* insert the tuple
*
@@ -265,14 +283,52 @@ ExecInsert(TupleTableSlot *slot,
* the t_self field.
*/
newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, 0, NULL);
+ estate->es_output_cid,
+ speculative ? HEAP_INSERT_SPECULATIVE : 0,
+ NULL);
/*
* insert index entries for tuple
*/
if (resultRelInfo->ri_NumIndices > 0)
+ {
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
estate);
+
+ if (speculative && !ItemPointerIsValid(&MyProc->specInsertTid))
+ {
+ /*
+ * Looks like check_exclusion_constraint decided to
+ * abort the insertion. It already waited for the conflicting
+ * insertion to finish.
+ */
+ /*
+ * Consider possible race: concurrent insertion conflicts with
+ * our speculative heap tuple. Must then "super-delete" the
+ * heap tuple and retry from the start.
+ *
+ * This is occasionally necessary so that "unprincipled
+ * deadlocks" are avoided; now that a conflict was found,
+ * other sessions should not wait on our speculative token, and
+ * they certainly shouldn't treat our speculatively-inserted
+ * heap tuple as an ordinary tuple that it must wait on the
+ * outcome of our xact to UPDATE/DELETE. This makes heap
+ * tuples behave as conceptual "value locks" of short duration,
+ * distinct from ordinary tuples that other xacts must wait on
+ * xmin-xact-end of in the event of a possible unique/exclusion
+ * violation (the violation that arbitrates taking the
+ * alternative UPDATE/IGNORE path).
+ */
+ list_free(recheckIndexes);
+ goto vlock;
+ }
+ }
+
+ if (speculative)
+ {
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+ }
}
if (canSetTag)
@@ -399,7 +455,8 @@ ldelete:;
estate->es_output_cid,
estate->es_crosscheck_snapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd,
+ false);
switch (result)
{
case HeapTupleSelfUpdated:
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index a1ebc72..a1c5bcb 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -421,6 +421,13 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
latestXid))
ShmemVariableCache->latestCompletedXid = latestXid;
+ /* Also clear any speculative insertion information */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+
LWLockRelease(ProcArrayLock);
}
else
@@ -438,6 +445,11 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
pgxact->delayChkpt = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
Assert(pgxact->nxids == 0);
Assert(pgxact->overflowed == false);
@@ -476,6 +488,13 @@ ProcArrayClearTransaction(PGPROC *proc)
/* Clear the subtransaction-XID cache too */
pgxact->nxids = 0;
pgxact->overflowed = false;
+
+ /* these should be clear, but just in case.. */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
}
/*
@@ -1110,6 +1129,96 @@ TransactionIdIsActive(TransactionId xid)
/*
+ * SetSpeculativeInsertionToken -- Set speculative token
+ *
+ * The backend local counter value is set, to allow waiters to differentiate
+ * individual speculative insertions.
+ */
+void
+SetSpeculativeInsertionToken(uint32 token)
+{
+ MyProc->specInsertToken = token;
+}
+
+/*
+ * SetSpeculativeInsertionTid -- Set TID for speculative relfilenode
+ */
+void
+SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel = relnode;
+ ItemPointerCopy(tid, &MyProc->specInsertTid);
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * ClearSpeculativeInsertionState -- Clear token and TID for ourselves
+ */
+void
+ClearSpeculativeInsertionState(void)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * Returns a speculative insertion token for waiting for the insertion to
+ * finish
+ */
+uint32
+SpeculativeInsertionIsInProgress(TransactionId xid, RelFileNode rel,
+ ItemPointer tid)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ uint32 result = 0;
+
+ if (TransactionIdPrecedes(xid, RecentXmin))
+ return result;
+
+ /*
+ * Get the top transaction id.
+ *
+ * XXX We could search the proc array first, like
+ * TransactionIdIsInProgress() does, but this isn't performance-critical.
+ */
+ xid = SubTransGetTopmostTransaction(xid);
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+
+ if (pgxact->xid == xid)
+ {
+ /*
+ * Found the backend. Is it doing a speculative insertion of the
+ * given tuple?
+ */
+ if (RelFileNodeEquals(proc->specInsertRel, rel) &&
+ ItemPointerEquals(tid, &proc->specInsertTid))
+ result = proc->specInsertToken;
+
+ break;
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+
+ return result;
+}
+
+
+/*
* GetOldestXmin -- returns oldest transaction that was running
* when any current transaction was started.
*
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index d13a167..7a1df22 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -575,6 +575,69 @@ ConditionalXactLockTableWait(TransactionId xid)
return true;
}
+static uint32 speculativeInsertionToken = 0;
+
+/*
+ * SpeculativeInsertionLockAcquire
+ *
+ * Insert a lock showing that the given transaction ID is inserting a tuple,
+ * but hasn't yet decided whether it's going to keep it. The lock can then be
+ * used to wait for the decision to go ahead with the insertion, or aborting
+ * it.
+ *
+ * The token is used to distinguish multiple insertions by the same
+ * transaction. A counter will do, for example.
+ */
+void
+SpeculativeInsertionLockAcquire(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ speculativeInsertionToken++;
+ SetSpeculativeInsertionToken(speculativeInsertionToken);
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ (void) LockAcquire(&tag, ExclusiveLock, false, false);
+}
+
+/*
+ * SpeculativeInsertionLockRelease
+ *
+ * Delete the lock showing that the given transaction is speculatively
+ * inserting a tuple.
+ */
+void
+SpeculativeInsertionLockRelease(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ LockRelease(&tag, ExclusiveLock, false);
+}
+
+/*
+ * SpeculativeInsertionWait
+ *
+ * Wait for the specified transaction to finish or abort the insertion of a
+ * tuple.
+ */
+void
+SpeculativeInsertionWait(TransactionId xid, uint32 token)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, token);
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(token != 0);
+
+ (void) LockAcquire(&tag, ShareLock, false, false);
+ LockRelease(&tag, ShareLock, false);
+}
+
+
/*
* XactLockTableWaitErrorContextCb
* Error context callback for transaction lock waits.
@@ -873,6 +936,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
tag->locktag_field1,
tag->locktag_field2);
break;
+ case LOCKTAG_PROMISE_TUPLE_INSERTION:
+ appendStringInfo(buf,
+ _("tuple insertion by transaction %u"),
+ tag->locktag_field1);
+ break;
case LOCKTAG_OBJECT:
appendStringInfo(buf,
_("object %u of class %u of database %u"),
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index a1967b69..95d62cb 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -28,6 +28,7 @@ static const char *const LockTagTypeNames[] = {
"tuple",
"transactionid",
"virtualxid",
+ "inserter transactionid",
"object",
"userlock",
"advisory"
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 777f55c..99bb417 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -170,6 +170,13 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -726,6 +733,17 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Assert(htup->t_tableOid != InvalidOid);
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
+ snapshot->speculativeToken = 0;
+
+ /*
+ * Never return "super-deleted" tuples
+ *
+ * XXX: Comment this code out and you'll get conflicts within
+ * ExecLockUpdateTuple(), which result in an infinite loop.
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -807,6 +825,26 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
{
+ RelFileNode rnode;
+ ForkNumber forkno;
+ BlockNumber blockno;
+
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+
+ /* tuples can only be in the main fork */
+ Assert(forkno == MAIN_FORKNUM);
+ Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
+
+ /*
+ * Set speculative token. Caller can worry about xmax, since it
+ * requires a conclusively locked row version, and a concurrent
+ * update to this tuple is a conflict of its purposes.
+ */
+ snapshot->speculativeToken =
+ SpeculativeInsertionIsInProgress(HeapTupleHeaderGetRawXmin(tuple),
+ rnode,
+ &htup->t_self);
+
snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
/* XXX shouldn't we fall through to look at xmax? */
return true; /* in insertion by other */
@@ -922,6 +960,13 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ /*
+ * Never return "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return false;
+
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1126,6 +1171,13 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
+ * Immediately VACUUM "super-deleted" tuples
+ */
+ if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
+ InvalidTransactionId))
+ return HEAPTUPLE_DEAD;
+
+ /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 939d93d..62e760a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -28,6 +28,7 @@
#define HEAP_INSERT_SKIP_WAL 0x0001
#define HEAP_INSERT_SKIP_FSM 0x0002
#define HEAP_INSERT_FROZEN 0x0004
+#define HEAP_INSERT_SPECULATIVE 0x0008
typedef struct BulkInsertStateData *BulkInsertState;
@@ -141,7 +142,7 @@ extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
CommandId cid, int options, BulkInsertState bistate);
extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd);
+ HeapUpdateFailureData *hufd, bool killspeculative);
extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index a2ed2a0..870985d 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -73,6 +73,8 @@
#define XLOG_HEAP_SUFFIX_FROM_OLD (1<<6)
/* last xl_heap_multi_insert record for one heap_multi_insert() call */
#define XLOG_HEAP_LAST_MULTI_INSERT (1<<7)
+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */
+#define XLOG_HEAP_KILLED_SPECULATIVE_TUPLE XLOG_HEAP_LAST_MULTI_INSERT
/* convenience macro for checking whether any form of old tuple was logged */
#define XLOG_HEAP_CONTAINS_OLD \
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41288ed..123bbae 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -318,6 +318,7 @@ typedef struct ResultRelInfo
int ri_NumIndices;
RelationPtr ri_IndexRelationDescs;
IndexInfo **ri_IndexRelationInfo;
+ bool ri_HasExclusionConstraints;
TriggerDesc *ri_TrigDesc;
FmgrInfo *ri_TrigFunctions;
List **ri_TrigWhenExprs;
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index f5d70e5..6bb95fc 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -76,6 +76,11 @@ extern bool ConditionalXactLockTableWait(TransactionId xid);
extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
+/* Lock an XID for tuple insertion (used to wait for an insertion to finish) */
+extern void SpeculativeInsertionLockAcquire(TransactionId xid);
+extern void SpeculativeInsertionLockRelease(TransactionId xid);
+extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
+
/* Lock a general object (other than a relation) of the current database */
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1100923..9c21810 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -176,6 +176,8 @@ typedef enum LockTagType
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
+ LOCKTAG_PROMISE_TUPLE_INSERTION, /* tuple insertion, keyed by Xid */
+ /* ID info for a transaction is its TransactionId */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
@@ -261,6 +263,14 @@ typedef struct LOCKTAG
(locktag).locktag_type = LOCKTAG_VIRTUALTRANSACTION, \
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
+ ((locktag).locktag_field1 = (xid), \
+ (locktag).locktag_field2 = (token), \
+ (locktag).locktag_field3 = 0, \
+ (locktag).locktag_field4 = 0, \
+ (locktag).locktag_type = LOCKTAG_PROMISE_TUPLE_INSERTION, \
+ (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
#define SET_LOCKTAG_OBJECT(locktag,dboid,classoid,objoid,objsubid) \
((locktag).locktag_field1 = (dboid), \
(locktag).locktag_field2 = (classoid), \
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index e807a2e..cd15570 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,9 +16,11 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "storage/itemptr.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
+#include "storage/relfilenode.h"
/*
* Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
@@ -132,6 +134,17 @@ struct PGPROC
*/
SHM_QUEUE myProcLocks[NUM_LOCK_PARTITIONS];
+ /*
+ * Info to allow us to perform speculative insertion without "unprincipled
+ * deadlocks". This state allows others to wait on the outcome of an
+ * optimistically inserted speculative tuple for only the duration of the
+ * insertion (not to the end of our xact) iff the insertion does not work
+ * out (due to our detecting a conflict).
+ */
+ uint32 specInsertToken;
+ RelFileNode specInsertRel;
+ ItemPointerData specInsertTid;
+
struct XidCache subxids; /* cache for subtransaction XIDs */
/* Per-backend LWLock. Protects fields below. */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 97c6e93..ea2bba9 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -55,6 +55,13 @@ extern TransactionId GetOldestXmin(Relation rel, bool ignoreVacuum);
extern TransactionId GetOldestActiveTransactionId(void);
extern TransactionId GetOldestSafeDecodingTransactionId(void);
+extern void SetSpeculativeInsertionToken(uint32 token);
+extern void SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid);
+extern void ClearSpeculativeInsertionState(void);
+extern uint32 SpeculativeInsertionIsInProgress(TransactionId xid,
+ RelFileNode rel,
+ ItemPointer tid);
+
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 26fb257..cd5ad76 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -87,6 +87,17 @@ typedef struct SnapshotData
bool copied; /* false if it's a static snapshot */
/*
+ * Snapshot's speculative token is a value set by HeapTupleSatisfiesDirty,
+ * indicating that the tuple is being inserted speculatively, and may yet
+ * be "super-deleted" before EOX. The caller may use the value with
+ * SpeculativeInsertionWait to wait for the inserter to decide. It is only
+ * set when a valid 'xmin' is set, too. By convention, when
+ * speculativeToken is zero, the caller must assume that it should wait on
+ * a non-speculative tuple (i.e. wait for xmin/xmax to commit).
+ */
+ uint32 speculativeToken;
+
+ /*
* note: all ids in subxip[] are >= xmin, but we don't bother filtering
* out any that are >= xmax
*/
On Sat, Feb 14, 2015 at 2:06 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Thanks for taking a look at it. That's somewhat cleaned up in the
attached patchseries - V2.2. This has been rebased to repair the minor
bit-rot pointed out by Thom.
I don't really have the energy to review this patch in one batch, so I'd
really like to split this into two:
I think we're all feeling worn out at this point, by this patch and by
others. I do appreciate your making the effort.
1. Solve the existing "problem" with exclusion constraints that if two
insertions happen concurrently, one of them might be aborted with "deadlock
detected" error, rather than "conflicting key violation" error. That really
wouldn't be worth fixing otherwise, but it happens to be a useful stepping
stone for this upsert patch.
2. All the rest.
I think we need a more pragmatic approach to dealing with the
exclusion constraint problems.
I took a stab at extracting the parts needed to do 1. See attached. I didn't
modify ExecUpdate to use speculative insertions, so the issue remains for
UPDATEs, but that's easy to add.
Cool.
I did not solve the potential for livelocks (discussed at
/messages/by-id/CAM3SWZTfTt_fehet3tU3YKCpCYPYnNaUqUZ3Q+NAASnH-60teA@mail.gmail.com).
The patch always super-deletes the already-inserted tuple on conflict, and
then waits for the other inserter. That would be nice to fix, using one of
the ideas from that thread, or something else.
How about we don't super-delete at all with exclusion constraints? I'd
be willing to accept unprincipled deadlocks for exclusion constraints,
because they already exist today for UPSERT/NOSERT type use cases, and
with idiomatic usage seem far less likely for the IGNORE variant
(especially with exclusion constraints). I can see people using ON
CONFLICT UPDATE where they'd almost or actually be better off using a
plain UPDATE - that's quite a different situation. I find livelocks to
be a *very* scary prospect, and I don't think the remediations that
were mentioned are going to fly. It's just too messy, and too much of
a niche use case. TBH I am skeptical of the idea that we can fix
exclusion constraints properly in this way at all, at least not
without the exponential backoff thing, which just seems horrible.
We never really debated the options for how to do the speculative insertion
and super-deletion. This patch is still based on the quick & dirty demo
patches I posted about a year ago, in response to issues you found with
earlier versions. That might be OK - maybe I really hit the nail on
designing those things and got them right on first try - but more likely
there are better alternatives.
Intuitively, it seems likely that you're right here. However, it was
necessary to work through the approach to see what the problems are.
For example, the need for modifications to tqual.c became apparent
only through putting a full implementation of ON CONFLICT UPDATE
through various tests. In general, I've emphasized that the problems
with any given value locking implementation are non-obvious. Anyone
who thinks that he can see all the problems with his approach to value
locking without having a real implementation that is tested for
problems like unprincipled deadlocks is probably wrong.
That's sort of where I'm coming from with suggesting we allow
unprincipled deadlocks with exclusion constraints. I'm not confident
that we can find a perfect solution, and know that it's a perfect
solution. It's too odd, and too niche a requirement. Although you
understood what I was on about when I first talked about unprincipled
deadlocks, I think that acceptance of that idea came much later from
other people, because it's damn complicated. It's not worth it to add
some weird "Dining philosophers" exponential backoff thing to make
sure that the IGNORE variant when used with exclusion constraints can
never deadlock in an unprincipled way, but if it is worth it then it
seems unreasonable to suppose that this patch needs to solve that
pre-existing problem. No?
If we do something like an exponential backoff, which I think might
work, I fear that that kind of yak-shaving will leave us with
something impossible to justify committing; a total mess. Better the
devil you know (possible unprincipled deadlocks with the IGNORE
variant + exclusion constraints).
Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock for
every insertion? That seems bad for performance reasons. Also, are we happy
with adding the new fields to the proc array? Another alternative that was
discussed was storing the speculative insertion token on the heap tuple
itself. (See
/messages/by-id/52D00D2D.6030307@vmware.com)
Whatever works, really. I can't say that the performance implications
of acquiring that hwlock are at the forefront of my mind. I never
found that to be a big problem on an 8 core box, relative to vanilla
INSERTs, FWIW - lock contention is not normal, and may be where any
heavyweight lock costs would really be encountered. Besides, the update
path won't have to do this at all.
I don't see how storing the promise token in heap tuples buys us not
having to involve heavyweight locking of tokens. (I think you may have
made a thinko in suggesting otherwise)
Are we happy with the way super-deletion works? Currently, the xmin field is
set to invalid to mark the tuple as super-deleted. That required checks in
HeapTupleSatisfies* functions. One alternative would be to set xmax equal to
xmin, and teach the code that currently calls XactLockTableWait() to check if
xmax=xmin, and not consider the tuple as conflicting.
That couldn't work without further HeapTupleSatisfiesDirty() logic.
Besides, why should one transaction be entitled to insert a
conflicting value tuple just because a still running transaction
deleted it (having also inserted the tuple itself)? Didn't one
prototype version of value locking #2 have that as a bug [1]? In fact,
originally, didn't the "xmin set to invalid" thing come immediately
from realizing that that wasn't workable?
I too think the tqual.c changes are ugly. I can't see a way around
using a token lock, though - I would only consider (and only consider)
refining the interface by which a waiter becomes aware that it must
wait on the outcome of the inserting xact's speculative
insertion/value lock (and in particular, that it should not wait on
xact outcome). We clearly need something to wait on that isn't an
XID...heavyweight locking of a token value is the obvious way of doing
that.
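To make the waiter's side concrete, here is a minimal standalone sketch (not code from the patch) of how a caller might pick what to wait on after a dirty-snapshot check. SketchDirtySnapshot, SketchWaitKind and sketch_choose_wait() are made-up names for illustration. The only things taken from the patch are the convention, described in the snapshot.h comment, that a zero speculativeToken means "wait on the xact", and that a token is only set along with a valid xmin.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: the two fields a dirty snapshot reports back to the caller */
typedef struct SketchDirtySnapshot
{
	uint32_t	xmin;				/* in-progress inserter, 0 if none */
	uint32_t	speculativeToken;	/* 0 means "not a speculative insertion" */
} SketchDirtySnapshot;

typedef enum SketchWaitKind
{
	SKETCH_WAIT_NONE,			/* nothing in progress, no wait needed */
	SKETCH_WAIT_XACT,			/* wait for the inserter's xact to end */
	SKETCH_WAIT_TOKEN			/* wait only for the speculative insertion */
} SketchWaitKind;

SketchWaitKind
sketch_choose_wait(const SketchDirtySnapshot *snap)
{
	/* A token is only ever reported together with a valid xmin */
	assert(snap->xmin != 0 || snap->speculativeToken == 0);

	if (snap->xmin == 0)
		return SKETCH_WAIT_NONE;
	if (snap->speculativeToken != 0)
		return SKETCH_WAIT_TOKEN;
	return SKETCH_WAIT_XACT;
}
```

The point of the third case is that the waiter blocks on something other than the XID, so it can be released as soon as the inserter decides, rather than at end of xact.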
(thinks about it some more for a while)
It seems like what you're really taking issue with - the real issue -
is that we're introducing this whole new idea of making a tuple
visible for a moment, a moment that is potentially only a slice of its
originating transaction's total duration. Setting xmin to
InvalidTransactionId is one way to do that, and may,
counterintuitively, even be the best way, but the fact that we're
playing new games with visibility is the real point of concern. We
could do something like store the token in the heap tuple, allowing us
to make fewer additions to PGPROC, but that seems kind of pointless
(and a waste of disk space). Playing new games with visibility is the
nature of the beast.
We keep talking about mechanism, but what if we *did* have the
infomask bit space to represent that xmin=xmax is a broken promise
tuple (and not a deleted, self-inserted, combocid-requiring tuple that
may or may not have been a promise tuple at some point in time)? I
think that a minor aesthetic improvement is the best we can hope for,
and maybe even that is too much to expect. Maybe we should just own
the fact that we're playing a new sort of game with visibility, and
keep things as they are. You might consider that setting xmin to
InvalidTransactionId is more "honest" than any alternative scheme that
happens to avoid changes to HeapTupleSatisfiesMVCC() and so on.
Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
Maybe he is right...if that can be made to be reliable (always
WAL-logged), it could be marginally better than setting xmin to
InvalidTransactionId. But only marginally; it seems like your
discomfort is basically inherent to playing these new games with
visibility, which is inherent to this design. As I said, I am having a
hard time seeing a way to do anything more than polish what we have
here. That's probably fine, though...it hasn't proven to be too
problematic (exclusion constraints aside).
Not having to change HeapTupleSatisfiesMVCC() and so on (to check if
xmin = InvalidTransactionId) is not obviously a useful goal here,
since ISTM that any alternative would have to *logically* do the same
thing. If I'm off the mark about your thinking here, please correct
me....are you just worried about extra cycles in the
HeapTupleSatisfies* routines?
[1]: /messages/by-id/52D5C74E.6090608@vmware.com
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Feb 13, 2015 at 7:22 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
In patch 1, "sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken."
Doesn't that warrant bumping catversion?
Yes, but traditionally that is left until the last minute - when the
patch is committed. That's why it's called out in the commit message
(it isn't otherwise obvious - it's not a common catversion
necessitating change like an addition to a system catalog).
Patch 2
+ * When killing a speculatively-inserted tuple, we set xmin to invalid and
+if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
When doing this, should we also set the HEAP_XMIN_INVALID hint bit?
<reads more of patch>
Ok, I see we're not doing this because the check for a super deleted tuple
is already cheap. Probably worth mentioning that in the comment...
See my remarks to Heikki on this. I don't think it adds much.
ExecInsert():
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.
I'm a bit confused as to why we're calling BEFORE triggers out here...
hasn't this always been true for both BEFORE *and* AFTER triggers? The
comment makes me wonder if there's some magic that happens with AFTER...
Yes, but the difference here is that the UPDATE path could be taken
(which is sort of like when a before row insert trigger returns NULL).
What I'm calling out is the dependency on firing before row insert
triggers to decide if the alternative path must be taken. Roughly
speaking, we must perform "part" of the INSERT (firing of before row
insert triggers) in order to decide to do or not do an INSERT. This
is, as I said, similar to when those triggers return NULL, and won't
matter with idiomatic patterns of before trigger usage. Still feels
worth calling out, because sometimes users do foolishly write before
triggers with many external side-effects.
ExecLockUpdateTuple():
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
I find this confusing... which row is being locked? The previously inserted
one? Perhaps a better wording would be "satisfied, update. Lock the original
row even if it doesn't satisfy estate's snapshot."
Take a look at the executor README. We're locking the only row that
can be locked when an UPSERT non-conclusively thinks to take the
UPDATE path: the row that was found during our pre-check. We can only
UPDATE when we find such a tuple, and then lock it without finding a
row-level conflict.
infer_unique_index():
+/*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+relation = heap_open(relationObjectId, NoLock);
Seems like there should be an Assert here. Also, the comment should probably
go before the heap_open call.
An Assert() of what? Note that the similar function
get_relation_info() does about the same thing here.
heapam_xlog.h:
+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */
I wish this would mention why it's safe to do this. Also, the comment
mentions xl_heap_delete when the #define is for
XLOG_HEAP_KILLED_SPECULATIVE_TUPLE; presumably that's wrong. Perhaps:
/*
* reuse XLOG_HEAP_LAST_MULTI_INSERT bit for
* XLOG_HEAP_KILLED_SPECULATIVE_TUPLE. This is safe because we never do
* multi-inserts for INSERT ON CONFLICT.
*/
It's safe, as the comment indicates, because the former is only used
for xl_heap_multi_insert records, while the latter is only used for
xl_heap_delete records. There is no ambiguity, because naturally we're
always able to establish what type of record is under consideration
before checking the bit is set.
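Spelled out as a standalone sketch: the same bit value means different things depending on the record type, and the record type is always known first. The SketchRecord type and the function names below are hypothetical, not the real xl_heap_* structs or redo routines; only the (1<<7) value and the aliasing of the two flags are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit, two names, as in heapam_xlog.h in the patch */
#define SKETCH_LAST_MULTI_INSERT		(1 << 7)
#define SKETCH_KILLED_SPECULATIVE_TUPLE	SKETCH_LAST_MULTI_INSERT

/* Hypothetical stand-in for "which xl_heap_* record is this?" */
typedef enum SketchRecType
{
	SKETCH_REC_DELETE,
	SKETCH_REC_MULTI_INSERT
} SketchRecType;

typedef struct SketchRecord
{
	SketchRecType type;
	uint8_t		flags;
} SketchRecord;

/* Only a delete record can describe a super-deletion */
bool
sketch_is_super_deletion(const SketchRecord *rec)
{
	return rec->type == SKETCH_REC_DELETE &&
		(rec->flags & SKETCH_KILLED_SPECULATIVE_TUPLE) != 0;
}

/* Only a multi-insert record can be "the last multi-insert" */
bool
sketch_is_last_multi_insert(const SketchRecord *rec)
{
	return rec->type == SKETCH_REC_MULTI_INSERT &&
		(rec->flags & SKETCH_LAST_MULTI_INSERT) != 0;
}
```

Because each check is gated on the record type, a set bit 7 is never read with the wrong meaning, whether or not multi-inserts are ever used by INSERT ON CONFLICT.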
The XLOG_HEAP_* format is used for other flags there, despite the fact
that other flags (like XLOG_HEAP_PREFIX_FROM_OLD) can only appear in
certain context-appropriate xl_heap_* records. AFAICT, all that I've
done that's new here is rely on that, since a bunch of those bits were
used up in the last year or two, and the need to even consider bit
reuse here is a new problem.
I'll review the remaining patches later.
Thanks.
--
Peter Geoghegan
On 02/16/2015 02:44 AM, Peter Geoghegan wrote:
On Sat, Feb 14, 2015 at 2:06 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I did not solve the potential for livelocks (discussed at
/messages/by-id/CAM3SWZTfTt_fehet3tU3YKCpCYPYnNaUqUZ3Q+NAASnH-60teA@mail.gmail.com).
The patch always super-deletes the already-inserted tuple on conflict, and
then waits for the other inserter. That would be nice to fix, using one of
the ideas from that thread, or something else.
How about we don't super-delete at all with exclusion constraints? I'd
be willing to accept unprincipled deadlocks for exclusion constraints,
because they already exist today for UPSERT/NOSERT type use cases, and
with idiomatic usage seem far less likely for the IGNORE variant
(especially with exclusion constraints).
So INSERT ON CONFLICT IGNORE on a table with an exclusion constraint
might fail. I don't like that. The point of having the command in the
first place is to deal with concurrency issues. If it sometimes doesn't
work, it's broken.
I can see people using ON
CONFLICT UPDATE where they'd almost or actually be better off using a
plain UPDATE - that's quite a different situation. I find livelocks to
be a *very* scary prospect, and I don't think the remediations that
were mentioned are going to fly. It's just too messy, and too much of
a niche use case. TBH I am skeptical of the idea that we can fix
exclusion constraints properly in this way at all, at least not
without the exponential backoff thing, which just seems horrible.
The idea of comparing the TIDs of the tuples as a tie-breaker seems most
promising to me. If the conflicting tuple's TID is smaller than the
inserted tuple's, super-delete and wait. Otherwise, wait without
super-deletion. That's really simple. Do you see a problem with that?
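A minimal standalone sketch of that tie-breaker, to show why it cannot livelock: when two inserters each find the other's tuple, exactly one of them backs off. SketchTid and both function names are invented for illustration (the real type would be ItemPointerData, and the comparison mirrors what ItemPointerCompare() does); nothing here is code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap TID: (block number, offset number) */
typedef struct SketchTid
{
	uint32_t	block;
	uint16_t	offset;
} SketchTid;

/* Returns <0, 0 or >0, ordering by block and then by offset */
int
sketch_tid_compare(SketchTid a, SketchTid b)
{
	if (a.block != b.block)
		return (a.block < b.block) ? -1 : 1;
	if (a.offset != b.offset)
		return (a.offset < b.offset) ? -1 : 1;
	return 0;
}

/*
 * Called by an inserter that has already inserted "mine" and then found
 * "conflicting".  True means: super-delete my own tuple, then wait.
 * False means: just wait, leaving my tuple in place.
 */
bool
sketch_should_super_delete(SketchTid mine, SketchTid conflicting)
{
	/* Two distinct heap tuples never share a TID */
	assert(sketch_tid_compare(mine, conflicting) != 0);

	return sketch_tid_compare(conflicting, mine) < 0;
}
```

Since the comparison is antisymmetric, the two sessions always reach opposite answers, so one tuple survives for the other session to wait on.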
Although you understood what I was on about when I first talked about
unprincipled deadlocks, I think that acceptance of that idea came
much later from other people, because it's damn complicated.
BTW, it would be good to explain somewhere in comments or a README the term
"unprincipled deadlock". It's been a useful concept, and hard to grasp.
If you defined it once, with examples and everything, then you could
just have "See .../README" in other places that need to refer to it.
It's not worth it to add
some weird "Dining philosophers" exponential backoff thing to make
sure that the IGNORE variant when used with exclusion constraints can
never deadlock in an unprincipled way, but if it is worth it then it
seems unreasonable to suppose that this patch needs to solve that
pre-existing problem. No?
The point of solving the existing problem is that it allows us to split
the patch, for easier review.
Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock for
every insertion? That seems bad for performance reasons. Also, are we happy
with adding the new fields to the proc array? Another alternative that was
discussed was storing the speculative insertion token on the heap tuple
itself. (See
/messages/by-id/52D00D2D.6030307@vmware.com)
Whatever works, really. I can't say that the performance implications
of acquiring that hwlock are at the forefront of my mind. I never
found that to be a big problem on an 8 core box, relative to vanilla
INSERTs, FWIW - lock contention is not normal, and may be where any
heavyweight lock costs would really be encountered.
Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-).
I don't see how storing the promise token in heap tuples buys us not
having to involve heavyweight locking of tokens. (I think you may have
made a thinko in suggesting otherwise)
It wouldn't get rid of heavyweight locks, but it would allow getting rid
of the procarray changes. The inserter could generate a token, then
acquire the hw-lock on the token, and lastly insert the heap tuple with
the correct token.
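The ordering matters more than the mechanism, so here is a toy sketch of just the ordering. Everything in it is hypothetical: SketchBackend, SketchHeapTuple and sketch_speculative_insert() are made-up names, the "lock" is a plain field rather than a heavyweight lock, and treating token 0 as "no token" follows the convention in the snapshot.h comment.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-backend state */
typedef struct SketchBackend
{
	uint32_t	lastToken;		/* last token handed out */
	uint32_t	lockedToken;	/* token we hold the "lock" on, 0 if none */
} SketchBackend;

/* Hypothetical heap tuple that carries the token itself */
typedef struct SketchHeapTuple
{
	uint32_t	xmin;
	uint32_t	token;
} SketchHeapTuple;

SketchHeapTuple
sketch_speculative_insert(SketchBackend *me, uint32_t xid)
{
	SketchHeapTuple tup;

	/* 1. Generate a token, skipping 0 since that means "no token" */
	if (++me->lastToken == 0)
		++me->lastToken;

	/* 2. Take the lock on the token before anyone can see the tuple */
	assert(me->lockedToken == 0);
	me->lockedToken = me->lastToken;

	/* 3. Only now form the tuple that carries the token */
	tup.xmin = xid;
	tup.token = me->lockedToken;
	return tup;
}
```

With the lock taken first, a waiter that reads the token off the tuple should always find something to block on, and nothing about the insertion needs to be advertised in the proc array.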
Are we happy with the way super-deletion works? Currently, the xmin field is
set to invalid to mark the tuple as super-deleted. That required checks in
HeapTupleSatisfies* functions. One alternative would be to set xmax equal to
xmin, and teach the code that currently calls XactLockTableWait() to check if
xmax=xmin, and not consider the tuple as conflicting.
That couldn't work without further HeapTupleSatisfiesDirty() logic.
Why not?
Besides, why should one transaction be entitled to insert a
conflicting value tuple just because a still running transaction
deleted it (having also inserted the tuple itself)? Didn't one
prototype version of value locking #2 have that as a bug [1]? In fact,
originally, didn't the "xmin set to invalid" thing come immediately
from realizing that that wasn't workable?
Ah, right. So the problem was that some code might assume that if you
insert a row, delete it in the same transaction, and then insert the
same value again, the 2nd insertion will always succeed because the
previous insertion effectively acquired a value-lock on the key.
Ok, so we can't unconditionally always ignore tuples with xmin==xmax. We
would need an infomask bit to indicate super-deletion, and only ignore
it if the bit is set.
I'm starting to think that we should bite the bullet and consume an
infomask bit for this. The infomask bits are a scarce resource, but we
should use them when it makes sense. It would be good for forensic
purposes too, to leave a trace that a super-deletion happened.
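As a rough sketch of what that would look like: the conflict check ignores a tuple only when the dedicated bit is set, never merely because xmin == xmax. SKETCH_HEAP_SUPER_DELETED is a hypothetical bit (no infomask bit has been allocated for this), and SketchTupleHeader is a toy stand-in for the real tuple header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical infomask bit; no such bit exists in the patch */
#define SKETCH_HEAP_SUPER_DELETED	0x0001

typedef struct SketchTupleHeader
{
	uint32_t	xmin;
	uint32_t	xmax;
	uint16_t	infomask;
} SketchTupleHeader;

/*
 * Should a would-be inserter of the same value ignore this tuple when
 * looking for conflicts?
 */
bool
sketch_ignore_for_conflict(const SketchTupleHeader *tup)
{
	/* An ordinary insert-then-delete by one xact still holds the value */
	if ((tup->infomask & SKETCH_HEAP_SUPER_DELETED) == 0)
		return false;

	/* Under this scheme, super-deletion sets xmax equal to xmin */
	assert(tup->xmin == tup->xmax);
	return true;
}
```

A tuple that was inserted and then deleted normally by the same transaction has xmin == xmax too, but without the bit it keeps acting as a value lock, which is the behaviour the earlier prototype got wrong.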
I too think the tqual.c changes are ugly. I can't see a way around
using a token lock, though - I would only consider (and only consider)
refining the interface by which a waiter becomes aware that it must
wait on the outcome of the inserting xact's speculative
insertion/value lock (and in particular, that it should not wait on
xact outcome). We clearly need something to wait on that isn't an
XID...heavyweight locking of a token value is the obvious way of doing
that.
Yeah.
Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
Maybe he is right...if that can be made to be reliable (always
WAL-logged), it could be marginally better than setting xmin to
InvalidTransactionId.
I'm not a big fan of that. The xmin-invalid bit is currently always just
a hint.
But only marginally; it seems like your
discomfort is basically inherent to playing these new games with
visibility, which is inherent to this design.
No, I get that we're playing games with visibility. I want to find the
best way to implement those games.
As I said, I am having a
hard time seeing a way to do anything more than polish what we have
here. That's probably fine, though...it hasn't proven to be too
problematic (exclusion constraints aside).
Not having to change HeapTupleSatisfiesMVCC() and so on (to check if
xmin = InvalidTransactionId) is not obviously a useful goal here,
since ISTM that any alternative would have to *logically* do the same
thing. If I'm off the mark about your thinking here, please correct
me....are you just worried about extra cycles in the
HeapTupleSatisfies* routines?
Extra cycles yes, but even more importantly, code readability and
maintainability.
- Heikki
On 2015-02-16 10:00:24 +0200, Heikki Linnakangas wrote:
On 02/16/2015 02:44 AM, Peter Geoghegan wrote:
Are we happy with acquiring the SpeculativeInsertLock heavy-weight lock for
every insertion? That seems bad for performance reasons. Also, are we happy
with adding the new fields to the proc array? Another alternative that was
discussed was storing the speculative insertion token on the heap tuple
itself. (See
/messages/by-id/52D00D2D.6030307@vmware.com)
Whatever works, really. I can't say that the performance implications
of acquiring that hwlock are at the forefront of my mind. I never
found that to be a big problem on an 8 core box, relative to vanilla
INSERTs, FWIW - lock contention is not normal, and may be where any
heavyweight lock costs would really be encountered.
Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-).
I don't think it actually uses the fast path, does it? IIRC that's
restricted to LOCKTAG_RELATION when done via LockAcquireExtended + open
coded for the VirtualXactLock table.
I'm not super worried atm either. Worth checking, but probably not worth
obsessing over.
Besides, why should one transaction be entitled to insert a
conflicting value tuple just because a still running transaction
deleted it (having also inserted the tuple itself)? Didn't one
prototype version of value locking #2 have that as a bug [1]? In fact,
originally, didn't the "xmin set to invalid" thing come immediately
from realizing that that wasn't workable?
Ah, right. So the problem was that some code might assume that if you insert
a row, delete it in the same transaction, and then insert the same value
again, the 2nd insertion will always succeed because the previous insertion
effectively acquired a value-lock on the key.
Ok, so we can't unconditionally always ignore tuples with xmin==xmax. We
would need an infomask bit to indicate super-deletion, and only ignore it if
the bit is set.
I'm starting to think that we should bite the bullet and consume an infomask
bit for this. The infomask bits are a scarce resource, but we should use
them when it makes sense. It would be good for forensic purposes too, to
leave a trace that a super-deletion happened.
Well, we IIRC don't have any left right now. We could reuse
MOVED_IN|MOVED_OUT, as that never was set in the past. But it'd
essentially use two infomask bits forever...
Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
Maybe he is right...if that can be made to be reliable (always
WAL-logged), it could be marginally better than setting xmin to
InvalidTransactionId.
I'm not a big fan of that. The xmin-invalid bit is currently always just a
hint.
We already rely on XMIN_INVALID being set correctly for
freezing. C.f. HeapTupleHeaderXminFrozen, HeapTupleHeaderXminInvalid, et
al. So it'd not necessarily end up being that bad. And the super
deletion could easily just set it in the course of its WAL logging.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 16, 2015 at 12:00 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
So INSERT ON CONFLICT IGNORE on a table with an exclusion constraint might
fail. I don't like that. The point of having the command in the first place
is to deal with concurrency issues. If it sometimes doesn't work, it's
broken.
I don't like it either, although I think it wouldn't come up very
often with exclusion constraints - basically, it would rarely be
noticed due to the different use cases. To be honest, in suggesting
this idea I was hedging against us not figuring out a solution to that
problem in time. You didn't like my suggestion of dropping IGNORE
entirely, either. I'll do my best to come up with something, but I'm
uncomfortable that at this late stage, ON CONFLICT IGNORE support for
exclusion constraints seems like a risk to the entire project.
I ask that if push comes to shove you show some flexibility here. I'll
try my best to ensure that you don't have to even consider committing
something with a notable omission. You don't have to give me an answer
to this now.
The idea of comparing the TIDs of the tuples as a tie-breaker seems most
promising to me. If the conflicting tuple's TID is smaller than the inserted
tuple's, super-delete and wait. Otherwise, wait without super-deletion.
That's really simple. Do you see a problem with that?
No. I'll work on that, and see how it stands up to stress testing.
Come to think of it, that does seem most promising.
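The tie-breaker rule quoted above can be sketched in a few lines of C. This is only an illustration under stated assumptions: SketchTid and both function names are hypothetical stand-ins invented here, not the patch's code, and a TID is modeled simply as a (block, offset) pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a heap TID: (block number, line pointer offset). */
typedef struct SketchTid
{
    uint32_t block;
    uint16_t offset;
} SketchTid;

/* Returns <0, 0, >0 as a sorts before, equal to, or after b. */
static int
sketch_tid_compare(SketchTid a, SketchTid b)
{
    if (a.block != b.block)
        return (a.block < b.block) ? -1 : 1;
    if (a.offset != b.offset)
        return (a.offset < b.offset) ? -1 : 1;
    return 0;
}

/*
 * Tie-breaker as proposed in the thread: super-delete our own tuple
 * before waiting only when the conflicting tuple's TID is smaller
 * than the TID of the tuple we inserted. Otherwise just wait.
 */
static bool
sketch_should_super_delete(SketchTid inserted, SketchTid conflicting)
{
    return sketch_tid_compare(conflicting, inserted) < 0;
}
```

The property that matters is asymmetry: when two sessions conflict with each other's tuples, exactly one of them sees a smaller conflicting TID, so only that one super-deletes.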
BTW, it would be good to explain somewhere in comments or a README the term
"unprincipled deadlock". It's been a useful concept, and hard to grasp. If
you defined it once, with examples and everything, then you could just have
"See .../README" in other places that need to refer to it.
Agreed. I have described those in the revised executor README, in case
you missed that. Do you think they ought to have their own section?
Note that hash indexes have "unprincipled deadlock" issues, but no one
has bothered to fix them.
Whatever works, really. I can't say that the performance implications
of acquiring that hwlock are at the forefront of my mind. I never
found that to be a big problem on an 8 core box, relative to vanilla
INSERTs, FWIW - lock contention is not normal, and may be where any
heavyweight lock costs would really be encountered.
Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-).
Seems that way. But even if that wasn't true, it wouldn't matter,
since I don't see that we have a choice.
I don't see how storing the promise token in heap tuples buys us not
having to involve heavyweight locking of tokens. (I think you may have
made a thinko in suggesting otherwise)
It wouldn't get rid of heavyweight locks, but it would allow getting rid of
the procarray changes. The inserter could generate a token, then acquire the
hw-lock on the token, and lastly insert the heap tuple with the correct
token.
Do you really think that's worth the disk overhead? I generally agree
with the "zero overhead" principle, which is that anyone not using the
feature should pay no price for it (or vanishingly close to no
price). Can't say that I have a good sense of the added distributed
cost (if any) to be paid by adding new fields to the PGPROC struct
(since the PGXACT struct was added in 9.2). Is your only concern that
the PGPROC array will now have more fields, making it more
complicated? Surely that's a price worth paying to avoid making these
heap tuples artificially somewhat larger. You're probably left with
tuples that are at least 8 bytes larger, once alignment is taken into
consideration. That doesn't seem great.
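The "at least 8 bytes larger" estimate follows from alignment arithmetic, which can be sketched as below. The inputs used in the note afterwards are assumptions made for illustration: a 23-byte fixed heap tuple header, a 4-byte token, and 8-byte maximum alignment. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (align must be a power of two). */
static size_t
sketch_align_up(size_t len, size_t align)
{
    return (len + (align - 1)) & ~(align - 1);
}

/* Aligned header size after appending `extra` bytes to the fixed header. */
static size_t
sketch_header_size(size_t fixed_header, size_t extra, size_t align)
{
    return sketch_align_up(fixed_header + extra, align);
}
```

Under those assumptions, sketch_header_size(23, 0, 8) gives 24 and sketch_header_size(23, 4, 8) gives 32, so a 4-byte field costs 8 bytes per tuple once padding is counted.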
That couldn't work without further HeapTupleSatisfiesDirty() logic.
Why not?
Just meant that it wasn't sufficient to check xmin == xmax, while
allowing that something like that could work with extra work (e.g. the
use of infomask bits)...
Ok, so we can't unconditionally always ignore tuples with xmin==xmax. We
would need an infomask bit to indicate super-deletion, and only ignore it if
the bit is set.
...which is what you say here.
I'm starting to think that we should bite the bullet and consume an infomask
bit for this. The infomask bits are a scarce resource, but we should use
them when it makes sense. It would be good for forensic purposes too, to
leave a trace that a super-deletion happened.
I too think the tqual.c changes are ugly. I can't see a way around
using a token lock, though - I would only consider (and only consider)
refining the interface by which a waiter becomes aware that it must
wait on the outcome of the inserting xact's speculative
insertion/value lock (and in particular, that it should not wait on
xact outcome). We clearly need something to wait on that isn't an
XID...heavyweight locking of a token value is the obvious way of doing
that.
Yeah.
Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
Maybe he is right...if that can be made to be reliable (always
WAL-logged), it could be marginally better than setting xmin to
InvalidTransactionId.
I'm not a big fan of that. The xmin-invalid bit is currently always just a
hint.
Well, Andres makes the point that that isn't quite so. TBH, I have a
hard time justifying the use of MOVED_IN|MOVED_OUT...it's not as if
there is a correctness problem with either setting xmin to
InvalidTransactionId, or setting the xmin-invalid hint bit (with
appropriate precautions so that it's not really just a hint). And as
was pointed out, there is something to be said for preserving tuple
header xmin where possible, for forensic reasons.
But only marginally; it seems like your
discomfort is basically inherent to playing these new games with
visibility, which is inherent to this design.
No, I get that we're playing games with visibility. I want to find the best
way to implement those games.
That's useful information. Thanks.
are you just worried about extra cycles in the
HeapTupleSatisfies* routines?
Extra cycles yes, but even more importantly, code readability and
maintainability.
Sure.
--
Peter Geoghegan
On Mon, Feb 16, 2015 at 4:11 PM, Peter Geoghegan <pg@heroku.com> wrote:
Jim Nasby said something about setting the HEAP_XMIN_INVALID hint bit.
Maybe he is right...if that can be made to be reliable (always
WAL-logged), it could be marginally better than setting xmin to
InvalidTransactionId.
I'm not a big fan of that. The xmin-invalid bit is currently always just a
hint.
Well, Andres makes the point that that isn't quite so.
Hmm. So the tqual.c routines actually check "if
(HeapTupleHeaderXminInvalid(tuple))". Which is:
#define HeapTupleHeaderXminInvalid(tup) \
( \
((tup)->t_infomask & (HEAP_XMIN_COMMITTED|HEAP_XMIN_INVALID)) == \
HEAP_XMIN_INVALID \
)
What appears to happen if I try to only use the HEAP_XMIN_INVALID bit
(and not explicitly set xmin to InvalidTransactionId, and not
explicitly check that xmin isn't InvalidTransactionId in each
HeapTupleSatisfies* routine) is that after a little while, Jeff Janes'
tool shows up spurious unique violations, as if some tuple has become
visible when it shouldn't have. I guess this is because the
HEAP_XMIN_COMMITTED hint bit can still be set, which in effect
invalidates the HEAP_XMIN_INVALID hint bit.
It takes about 2 minutes before this happens for the first time when
fsync = off, following a fresh initdb, which is unacceptable. I'm not
sure if it's worth trying to figure out how HEAP_XMIN_COMMITTED gets
set. Not that I'm 100% sure that that's what this really is; it just
seems very likely.
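The suspected failure mode, where a set HEAP_XMIN_COMMITTED bit defeats HEAP_XMIN_INVALID, can be reproduced in isolation. The sketch below copies the mask test from the macro quoted above into a standalone function. The SK_-prefixed names are hypothetical, and the two bit values are written from memory of htup_details.h, so treat them as assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror HEAP_XMIN_COMMITTED / HEAP_XMIN_INVALID in htup_details.h. */
#define SK_HEAP_XMIN_COMMITTED 0x0100
#define SK_HEAP_XMIN_INVALID   0x0200

/*
 * Same test as the quoted HeapTupleHeaderXminInvalid() macro: true only
 * when the invalid bit is set AND the committed bit is clear.
 */
static bool
sketch_xmin_invalid(uint16_t infomask)
{
    return (infomask & (SK_HEAP_XMIN_COMMITTED | SK_HEAP_XMIN_INVALID))
        == SK_HEAP_XMIN_INVALID;
}
```

With only the invalid bit set the test is true. Once the committed bit is also set it becomes false, which would be consistent with a super-deleted tuple reappearing as visible if something sets that hint later.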
Attached broken patch (broken_visible.patch) shows the changes made to
reveal this problem. Only the changes to tqual.c and heap_delete()
should matter here, since I did not test recovery.
I also thought about unifying the check for "if
(TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))" with the conventional
HeapTupleHeaderXminInvalid() macro, and leaving everything else as-is.
This is no good either, and for similar reasons - control often won't
reach the macro, which is behind a check of "if
(!HeapTupleHeaderXminCommitted(tuple))".
The best I think we can hope for is having a dedicated "if
(TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))" macro HeapTupleHeaderSuperDeleted() to do the
work each time, which does not need to be checked so often. A second
attached patch (compact_tqual_works.patch - which is non-broken,
AFAICT) shows how this is possible, while also moving the check
further down within each tqual.c routine (which seems more in keeping
with the fact that super deletion is a relatively obscure concept). I
haven't been able to break this variant using my existing test suite,
so this seems promising as a way of reducing tqual.c clutter. However,
as I said the other day, this is basically just polish.
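A toy model of the control flow may help show what moving the check "further down" means. Everything here is a hypothetical simplification: SketchTuple, the inserter_committed field (standing in for a commit-status lookup), and sketch_satisfies() are invented for illustration, and the real tqual.c routines handle many more cases (in-progress inserters, xmax, command IDs).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_INVALID_XID    ((uint32_t) 0)  /* assumed to mirror InvalidTransactionId */
#define SK_XMIN_COMMITTED 0x0100          /* assumed hint-bit values */
#define SK_XMIN_INVALID   0x0200

typedef struct SketchTuple
{
    uint32_t raw_xmin;
    uint16_t infomask;
    bool     inserter_committed;  /* hypothetical: result of a commit-status lookup */
} SketchTuple;

/* Dedicated test on raw xmin, independent of any hint bits. */
static bool
sketch_super_deleted(const SketchTuple *tuple)
{
    return tuple->raw_xmin == SK_INVALID_XID;
}

static bool
sketch_satisfies(const SketchTuple *tuple)
{
    if (!(tuple->infomask & SK_XMIN_COMMITTED))
    {
        /* Hint-bit test lives here, so it is skipped when COMMITTED is set. */
        if (tuple->infomask & SK_XMIN_INVALID)
            return false;
        if (!tuple->inserter_committed)
            return false;
    }

    /* Moved-down check: only tuples that got past the block above reach it. */
    if (sketch_super_deleted(tuple))
        return false;

    return true;  /* xmax handling omitted in this sketch */
}
```

In this model, a tuple with raw xmin zeroed is rejected even when the committed hint is set, because the dedicated test does not sit behind the hint-bit branch.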
--
Peter Geoghegan
Attachments:
broken_visible.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0aa3e57..b777c56 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2899,7 +2899,7 @@ l1:
}
else
{
- HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ HeapTupleHeaderSetXminInvalid(tp.t_data);
}
/* Make sure there is no forward chain link in t_ctid */
@@ -7382,12 +7382,12 @@ heap_xlog_delete(XLogReaderState *record)
htup->t_infomask &= ~(HEAP_XMAX_BITS | HEAP_MOVED);
htup->t_infomask2 &= ~HEAP_KEYS_UPDATED;
HeapTupleHeaderClearHotUpdated(htup);
+ if (xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE)
+ HeapTupleHeaderSetXminInvalid(htup);
+
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
- else
- HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 99bb417..fd857b1 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -170,13 +170,6 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
- /*
- * Never return "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
-
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -735,15 +728,6 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
snapshot->speculativeToken = 0;
- /*
- * Never return "super-deleted" tuples
- *
- * XXX: Comment this code out and you'll get conflicts within
- * ExecLockUpdateTuple(), which result in an infinite loop.
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -960,13 +944,6 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
- /*
- * Never return "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
-
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1171,13 +1148,6 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
- * Immediately VACUUM "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return HEAPTUPLE_DEAD;
-
- /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
compact_tqual_works.patch (text/x-patch; charset=US-ASCII)
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 99bb417..aed0eeb 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -170,13 +170,6 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
- /*
- * Never return "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
-
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -269,6 +262,9 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/* by here, the inserting transaction has committed */
if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or aborted */
@@ -735,16 +731,6 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
snapshot->speculativeToken = 0;
- /*
- * Never return "super-deleted" tuples
- *
- * XXX: Comment this code out and you'll get conflicts within
- * ExecLockUpdateTuple(), which result in an infinite loop.
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
-
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -861,6 +847,9 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/* by here, the inserting transaction has committed */
if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or aborted */
@@ -960,13 +949,6 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
- /*
- * Never return "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return false;
-
if (!HeapTupleHeaderXminCommitted(tuple))
{
if (HeapTupleHeaderXminInvalid(tuple))
@@ -1067,6 +1049,9 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/*
* By here, the inserting transaction has committed - have to check
* when...
@@ -1171,13 +1156,6 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
Assert(htup->t_tableOid != InvalidOid);
/*
- * Immediately VACUUM "super-deleted" tuples
- */
- if (TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
- InvalidTransactionId))
- return HEAPTUPLE_DEAD;
-
- /*
* Has inserting transaction committed?
*
* If the inserting transaction aborted, then the tuple was never visible
@@ -1270,6 +1248,9 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
*/
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return HEAPTUPLE_DEAD;
+
/*
* Okay, the inserter committed, so it was good at some point. Now what
* about the deleting transaction?
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d2ad910..5906df4 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -305,6 +305,15 @@ struct HeapTupleHeaderData
)
/*
+ * Was tuple "super deleted" following unsuccessful speculative insertion (i.e.
+ * conflict was detected at insertion time)?
+ */
+#define HeapTupleHeaderSuperDeleted(tup) \
+( \
+ (!TransactionIdIsValid(HeapTupleHeaderGetRawXmin(tup))) \
+)
+
+/*
* HeapTupleHeaderGetRawXmax gets you the raw Xmax field. To find out the Xid
* that updated a tuple, you might need to resolve the MultiXactId if certain
* bits are set. HeapTupleHeaderGetUpdateXid checks those bits and takes care
On Mon, Feb 16, 2015 at 6:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
The best I think we can hope for is having a dedicated "if
(TransactionIdEquals(HeapTupleHeaderGetRawXmin(tuple),
InvalidTransactionId))" macro HeapTupleHeaderSuperDeleted() to do the
work each time, which does not need to be checked so often. A second
attached patch (compact_tqual_works.patch - which is non-broken,
AFAICT) shows how this is possible, while also moving the check
further down within each tqual.c routine (which seems more in keeping
with the fact that super deletion is a relatively obscure concept).
I attach the patch series of V2.3. Highlights:
* Overhaul of speculative insertion related changes within tqual.c.
Refactored for readability as outlined in my earlier comments quoted
above. Assertions added, serving to show exactly where super deleted
tuples are and are not expected.
* Formally forbid INSERT ... ON CONFLICT into system catalogs. If
nothing else, this obviates the need for historic snapshots to care
about super deleted tuples.
* Minor setrefs.c tweaks. Minor ExecInitModifyTable() tweaks, too.
* Fix for minor bitrot against master branch.
* Further comments on the speculativeInsertionToken per-backend variable.
* Livelock insurance for exclusion constraints.
Importantly, Heikki wanted us to break out the patch to fix the
current problem of theoretical deadlock risks [1] ahead of committing
ON CONFLICT UPDATE/IGNORE. Heikki acknowledged that there were still
theoretical livelock risks in his reworked minimal patch. After
careful consideration, I have not broken out the changes to do this
incrementally along the lines that Heikki suggested.
Heikki seemed to think that the deadlock problems were not really
worth fixing independently of ON CONFLICT UPDATE support, but rather
represented a useful way of committing code incrementally. Do I have
that right? Certainly, anyone would agree that unprincipled deadlocks
(for regular inserters with exclusion constraints) are better than
livelocks. Heikki did not address the livelock risks with his minimal
reworked patch, which I've done here for ON CONFLICT.
The way I chose to break the livelock (what I call "livelock
insurance") does indeed involve comparing XIDs, which Heikki thought
most promising. But it also involves having the oldest XID wait on
another session's speculative token in the second phase, which
ordinarily does not occur in the second
phase/check_exclusion_or_unique_constraint() call. The idea is that
one session is guaranteed to be the waiter that has a second iteration
within its second-phase check_exclusion_or_unique_constraint() call,
where (following the super deletion of conflict TIDs by the other
conflicting session or sessions) it reliably finds that it can proceed
(those other sessions are denied the opportunity to reach their second
phase by our second phase waiter's still-not-super-deleted tuple).
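The XID-based decision described above might be sketched as follows. This is a guess at the shape of the rule and not the patch's code: the SketchAction enum and both function names are hypothetical, and the comparison is a modulo-2^32 test in the style used for normal transaction IDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when XID a is older than XID b, allowing for counter wraparound. */
static bool
sketch_xid_precedes(uint32_t a, uint32_t b)
{
    return (int32_t) (a - b) < 0;
}

typedef enum
{
    SK_WAIT_ON_TOKEN,            /* keep our tuple, wait on the other's token */
    SK_SUPER_DELETE_AND_RESTART  /* back out our tuple and start over */
} SketchAction;

/*
 * Hypothetical second-phase rule: the session with the oldest XID waits
 * on the other session's speculative token; the other super-deletes.
 */
static SketchAction
sketch_second_phase_action(uint32_t my_xid, uint32_t other_xid)
{
    return sketch_xid_precedes(my_xid, other_xid)
        ? SK_WAIT_ON_TOKEN
        : SK_SUPER_DELETE_AND_RESTART;
}
```

Because the comparison is antisymmetric for distinct XIDs, two conflicting sessions never both choose to wait, and never both choose to restart.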
However, it's difficult to see how to map this on to general exclusion
constraint insertion + enforcement. In Heikki's recent sketch of this
[1]: /messages/by-id/54DFC6F8.5050108@vmware.com
deferred until a later patch, and therefore my scheme here cannot work
(recall that for plain inserts with exclusion constraints, we insert
first and check last). I have a hard time justifying adding the
pre-check for plain exclusion constraint inserters given the total
lack of complaints from the field regarding unprincipled deadlocks,
and given the fact that it would probably make the code more
complicated than it needs to be. It is critical that there be a
pre-check to prevent livelock, though, because otherwise the
restarting sessions can go straight to their "second" phase
(check_exclusion_or_unique_constraint() call), without ever realizing
that they should not do so. Therefore, as I said, I have not broken
out the code in line with Heikki's suggestion.
It's possible that I have it wrong here - I was wrong to dismiss
Heikki's contention that the livelock hazards were fixable without too
much pain - but I don't think so.
It seems like the livelock insurance is pretty close to or actually
free, and doesn't imply that a "second phase wait for token" needs to
happen too often. With heavy contention on a small number of possible
tuples (100), and 8 clients deleting and inserting, I can see it
happening only once every couple of hundred milliseconds on my laptop.
It's not hard to imagine why the code didn't obviously appear to be
broken before now, as the window for an actual livelock must have been
small. Also, deadlocks bring about more deadlocks (since the deadlock
detector kicks in), whereas livelocks do not bring about more
livelocks.
I haven't been able to reproduce earlier apparent bugs with exclusion
constraints [2] recently. I can only speculate that they were fixed.
Does anyone with a big server care to run the procedure outlined for
exclusion constraints in the jjanes_upsert tool [3]? It would be nice
to have additional confidence that the exclusion constraint stuff is
robust.
[1]: /messages/by-id/54DFC6F8.5050108@vmware.com
[2]: /messages/by-id/CAM3SWZTkHOwyA5A9ib=uVf0Vs326yoCBdpp_NYkDjM2_-ScxFA@mail.gmail.com
[3]: https://github.com/petergeoghegan/jjanes_upsert
--
Peter Geoghegan
Attachments:
0006-User-visible-documentation-for-INSERT-.-ON-CONFLICT-.patch (text/x-patch; charset=US-ASCII)
From f8021860735c488be5eb8ef09f7460a0e0a7ee98 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Fri, 26 Sep 2014 20:59:04 -0700
Subject: [PATCH 6/6] User-visible documentation for INSERT ... ON CONFLICT
{UPDATE | IGNORE}
INSERT ... ON CONFLICT {UPDATE | IGNORE} is documented as a new clause
of the INSERT command. Some potentially surprising interactions with
triggers are noted -- BEFORE INSERT per-row triggers must fire without
the INSERT path necessarily being taken, for example.
All the existing features that INSERT ... ON CONFLICT {UPDATE | IGNORE}
interacts with have these interactions noted. This includes
postgres_fdw, updatable views, table inheritance, RLS and partial unique
indexes.
Finally, a user-level description of the new "MVCC violation" that the
ON CONFLICT UPDATE variant sometimes requires has been added to "Chapter
13 - Concurrency Control", beside existing commentary on READ COMMITTED
mode's special handling of concurrent updates. The new "MVCC violation"
introduced seems somewhat distinct from the existing one (i.e. READ
COMMITTED's handling of when an UPDATE affects a concurrently
updated/deleted tuple, which internally uses a mechanism called
EvalPlanQual()), because in READ COMMITTED mode it is no longer
necessary for any row version to be conventionally visible to the
command's MVCC snapshot for an UPDATE of the row to occur (or for the
row to be locked, should the UPDATE's WHERE clause not be satisfied).
---
doc/src/sgml/ddl.sgml | 23 +++
doc/src/sgml/fdwhandler.sgml | 8 +
doc/src/sgml/keywords.sgml | 7 +
doc/src/sgml/mvcc.sgml | 24 +++
doc/src/sgml/plpgsql.sgml | 14 +-
doc/src/sgml/postgres-fdw.sgml | 8 +
doc/src/sgml/protocol.sgml | 13 +-
doc/src/sgml/ref/alter_policy.sgml | 7 +-
doc/src/sgml/ref/create_policy.sgml | 37 +++-
doc/src/sgml/ref/create_rule.sgml | 7 +-
doc/src/sgml/ref/create_table.sgml | 5 +-
doc/src/sgml/ref/create_trigger.sgml | 5 +-
doc/src/sgml/ref/create_view.sgml | 33 ++-
doc/src/sgml/ref/insert.sgml | 375 ++++++++++++++++++++++++++++++++--
doc/src/sgml/ref/set_constraints.sgml | 6 +-
doc/src/sgml/trigger.sgml | 49 ++++-
16 files changed, 570 insertions(+), 51 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 570a003..7b43a10 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -2428,9 +2428,27 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
<para>
+ There is limited inheritance support for <command>INSERT</command>
+ commands with <literal>ON CONFLICT</> clauses. Tables with
+ children are not generally accepted as targets. One notable
+ exception is that such tables are accepted as targets for
+ <command>INSERT</command> commands with <literal>ON CONFLICT
+ IGNORE</> clauses, provided a unique index inference clause was
+ omitted (which implies that there is no concern about
+ <emphasis>which</> unique index any would-be conflict might arise
+ from). However, tables that happen to be inheritance children are
+ accepted as targets for all variants of <command>INSERT</command>
+ with <literal>ON CONFLICT</>.
+ </para>
+
+ <para>
All check constraints and not-null constraints on a parent table are
automatically inherited by its children. Other types of constraints
(unique, primary key, and foreign key constraints) are not inherited.
+ Therefore, <command>INSERT</command> with <literal>ON CONFLICT</>
+ unique index inference considers only unique constraints/indexes
+ directly associated with the child
+ table.
</para>
<para>
@@ -2515,6 +2533,11 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
+ <literal>INSERT</literal> with an <literal>ON CONFLICT
+ UPDATE</literal> clause does not support the
+ <literal>ONLY</literal> notation, and so in effect tables with
+ inheritance children are not supported for the <literal>ON
+ CONFLICT</literal> variant.
Commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
typically only work on individual, physical tables and do not
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index c1daa4b..0c3dcb5 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -1014,6 +1014,14 @@ GetForeignServerByName(const char *name, bool missing_ok);
source provides.
</para>
+ <para>
+ <command>INSERT</> with an <literal>ON CONFLICT</> clause is not supported
+ with a unique index inference specification (this implies that <literal>ON
+ CONFLICT UPDATE</> is never supported, since the specification is
+ mandatory there). When planning an <command>INSERT</>,
+ <function>PlanForeignModify</> should reject these cases.
+ </para>
+
</sect1>
</chapter>
diff --git a/doc/src/sgml/keywords.sgml b/doc/src/sgml/keywords.sgml
index b0dfd5f..ea58211 100644
--- a/doc/src/sgml/keywords.sgml
+++ b/doc/src/sgml/keywords.sgml
@@ -854,6 +854,13 @@
<entry></entry>
</row>
<row>
+ <entry><token>CONFLICT</token></entry>
+ <entry>non-reserved</entry>
+ <entry></entry>
+ <entry></entry>
+ <entry></entry>
+ </row>
+ <row>
<entry><token>CONNECT</token></entry>
<entry></entry>
<entry>reserved</entry>
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index a0d6867..5e310d7 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -326,6 +326,30 @@
</para>
<para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</> clause is
+ another special case. In Read Committed mode, the implementation will
+ either insert or update each row proposed for insertion, with either one of
+ those two outcomes guaranteed. This is a useful guarantee for many
+ use-cases, but it implies that further liberties must be taken with
+ snapshot isolation. Should a conflict originate in another transaction
+ whose effects are not visible to the <command>INSERT</command>, the
+ <command>UPDATE</command> may affect that row, even though it may be the
+ case that <emphasis>no</> version of that row is conventionally visible to
+ the command. In the same vein, if the secondary search condition of the
+ command (an explicit <literal>WHERE</> clause) is supplied, it is only
+ evaluated on the most recent row version, which is not necessarily the
+ version conventionally visible to the command (if indeed there is a row
+ version conventionally visible to the command at all).
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT IGNORE</> clause may
+ have insertion not proceed for a row due to the outcome of another
+ transaction whose effects are not visible to the <command>INSERT</command>
+ snapshot. Again, this is only the case in Read Committed mode.
+ </para>
+
+ <para>
Because of the above rule, it is possible for an updating command to see an
inconsistent snapshot: it can see the effects of concurrent updating
commands on the same rows it is trying to update, but it
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index 69a0885..59a5945 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -2607,7 +2607,11 @@ END;
<para>
This example uses exception handling to perform either
- <command>UPDATE</> or <command>INSERT</>, as appropriate:
+ <command>UPDATE</> or <command>INSERT</>, as appropriate. It is
+ recommended that applications use <command>INSERT</> with
+ <literal>ON CONFLICT UPDATE</> rather than actually emulating this
+ pattern. This example serves only to illustrate use of
+ <application>PL/pgSQL</application> control flow structures:
<programlisting>
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);
@@ -3771,9 +3775,11 @@ RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id;
<command>INSERT</> and <command>UPDATE</> operations, the return value
should be <varname>NEW</>, which the trigger function may modify to
support <command>INSERT RETURNING</> and <command>UPDATE RETURNING</>
- (this will also affect the row value passed to any subsequent triggers).
- For <command>DELETE</> operations, the return value should be
- <varname>OLD</>.
+ (this will also affect the row value passed to any subsequent triggers,
+ or passed to a special <varname>EXCLUDED</> alias reference within
+ an <command>INSERT</> statement with an <literal>ON CONFLICT UPDATE</>
+ clause). For <command>DELETE</> operations, the return
+ value should be <varname>OLD</>.
</para>
<para>
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 43adb61..fa39661 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -69,6 +69,14 @@
</para>
<para>
+ Note that <filename>postgres_fdw</> currently lacks support for
+ <command>INSERT</command> statements with an <literal>ON CONFLICT
+ UPDATE</> clause. However, the <literal>ON CONFLICT IGNORE</>
+ clause is supported, provided a unique index inference specification
+ is omitted.
+ </para>
+
+ <para>
It is generally recommended that the columns of a foreign table be declared
with exactly the same data types, and collations if applicable, as the
referenced columns of the remote table. Although <filename>postgres_fdw</>
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 3a753a0..ac13d32 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2998,9 +2998,16 @@ CommandComplete (B)
<literal>INSERT <replaceable>oid</replaceable>
<replaceable>rows</replaceable></literal>, where
<replaceable>rows</replaceable> is the number of rows
- inserted. <replaceable>oid</replaceable> is the object ID
- of the inserted row if <replaceable>rows</replaceable> is 1
- and the target table has OIDs;
+ inserted. However, if and only if <literal>ON CONFLICT
+ UPDATE</> is specified, then the tag is <literal>UPSERT
+ <replaceable>oid</replaceable>
+ <replaceable>rows</replaceable></literal>, where
+ <replaceable>rows</replaceable> is the number of rows inserted
+ <emphasis>or updated</emphasis>.
+ <replaceable>oid</replaceable> is the object ID of the
+ inserted row if <replaceable>rows</replaceable> is 1 and the
+ target table has OIDs, and (for the <literal>UPSERT</literal>
+ tag), the row was actually inserted rather than updated;
otherwise <replaceable>oid</replaceable> is 0.
</para>
diff --git a/doc/src/sgml/ref/alter_policy.sgml b/doc/src/sgml/ref/alter_policy.sgml
index 6d03db5..65cd85c 100644
--- a/doc/src/sgml/ref/alter_policy.sgml
+++ b/doc/src/sgml/ref/alter_policy.sgml
@@ -93,8 +93,11 @@ ALTER POLICY <replaceable class="parameter">name</replaceable> ON <replaceable c
The USING expression for the policy. This expression will be added as a
security-barrier qualification to queries which use the table
automatically. If multiple policies are being applied for a given
- table then they are all combined and added using OR. The USING
- expression applies to records which are being retrieved from the table.
+ table then they are all combined and added using OR (except as noted in
+ the <xref linkend="sql-createpolicy"> documentation for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</literal>).
+ The USING expression applies to records which are being retrieved from the
+ table.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_policy.sgml b/doc/src/sgml/ref/create_policy.sgml
index 868a6c1..f17192e 100644
--- a/doc/src/sgml/ref/create_policy.sgml
+++ b/doc/src/sgml/ref/create_policy.sgml
@@ -70,11 +70,12 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
Policies can be applied for specific commands or for specific roles. The
default for newly created policies is that they apply for all commands and
roles, unless otherwise specified. If multiple policies apply to a given
- query, they will be combined using OR. Further, for commands which can have
- both USING and WITH CHECK policies (ALL and UPDATE), if no WITH CHECK policy
- is defined then the USING policy will be used for both what rows are visible
- (normal USING case) and which rows will be allowed to be added (WITH CHECK
- case).
+ query, they will be combined using OR (except as noted for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</literal>).
+ Further, for commands which can have both USING and WITH CHECK policies (ALL
+ and UPDATE), if no WITH CHECK policy is defined then the USING policy will
+ be used for both what rows are visible (normal USING case) and which rows
+ will be allowed to be added (WITH CHECK case).
</para>
<para>
@@ -255,6 +256,19 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
as it only ever applies in cases where records are being added to the
relation.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>INSERT</literal> policy WITH
+ CHECK expression also pass for both any existing tuple in the target
+ table that necessitates that the <literal>UPDATE</literal> path be
+ taken, and the final tuple added back into the relation.
+ <literal>INSERT</literal> policies are separately combined using
+ <literal>OR</literal>, and this distinct set of policy expressions must
+ always pass, regardless of whether any or all <literal>UPDATE</literal>
+ policies also pass (in the same tuple check). However, successfully
+ inserted tuples are not subject to <literal>UPDATE</literal> policy
+ enforcement.
+ </para>
</listitem>
</varlistentry>
@@ -263,7 +277,9 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
<listitem>
<para>
Using <literal>UPDATE</literal> for a policy means that it will apply
- to <literal>UPDATE</literal> commands. As <literal>UPDATE</literal>
+ to <literal>UPDATE</literal> commands (or auxiliary <literal>ON
+ CONFLICT UPDATE</literal> clauses of <literal>INSERT</literal>
+ commands). As <literal>UPDATE</literal>
involves pulling an existing record and then making changes to some
portion (but possibly not all) of the record, the
<literal>UPDATE</literal> policy accepts both a USING expression and
@@ -279,6 +295,15 @@ CREATE POLICY <replaceable class="parameter">name</replaceable> ON <replaceable
used for both <literal>USING</literal> and
<literal>WITH CHECK</literal> cases.
</para>
+ <para>
+ Note that <literal>INSERT</literal> with <literal>ON CONFLICT
+ UPDATE</literal> requires that an <literal>UPDATE</literal> policy
+ USING expression always be treated as a WITH CHECK
+ expression. This <literal>UPDATE</literal> policy must
+ always pass, regardless of whether any
+ <literal>INSERT</literal> policy also passes in the same
+ tuple check.
+ </para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_rule.sgml b/doc/src/sgml/ref/create_rule.sgml
index 677766a..34a4ae1 100644
--- a/doc/src/sgml/ref/create_rule.sgml
+++ b/doc/src/sgml/ref/create_rule.sgml
@@ -136,7 +136,12 @@ CREATE [ OR REPLACE ] RULE <replaceable class="parameter">name</replaceable> AS
<para>
The event is one of <literal>SELECT</literal>,
<literal>INSERT</literal>, <literal>UPDATE</literal>, or
- <literal>DELETE</literal>.
+ <literal>DELETE</literal>. Note that an
+ <command>INSERT</command> containing an <literal>ON
+ CONFLICT</literal> clause cannot be used on tables that have
+ either <literal>INSERT</literal> or <literal>UPDATE</literal>
+ rules. Consider using an updatable view instead, which has
+ limited support for <literal>ON CONFLICT IGNORE</literal> only.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 299cce8..a9c1124 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -708,7 +708,10 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
<literal>EXCLUDE</>, and
<literal>REFERENCES</> (foreign key) constraints accept this
clause. <literal>NOT NULL</> and <literal>CHECK</> constraints are not
- deferrable.
+ deferrable. Note that deferrable constraints cannot be
+ used as arbiters of whether or not to take the
+ alternative path with an <command>INSERT</command> statement
+ that includes an <literal>ON CONFLICT UPDATE</> clause.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index aae0b41..1b75b1a 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -76,7 +76,10 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
executes once for any given operation, regardless of how many rows
it modifies (in particular, an operation that modifies zero rows
will still result in the execution of any applicable <literal>FOR
- EACH STATEMENT</literal> triggers).
+ EACH STATEMENT</literal> triggers). Note that since
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is considered an <command>INSERT</command> statement, no
+ <command>UPDATE</command> statement-level trigger will be fired.
</para>
<para>
diff --git a/doc/src/sgml/ref/create_view.sgml b/doc/src/sgml/ref/create_view.sgml
index 5dadab1..599c1cb 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -286,8 +286,9 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
<para>
Simple views are automatically updatable: the system will allow
<command>INSERT</>, <command>UPDATE</> and <command>DELETE</> statements
- to be used on the view in the same way as on a regular table. A view is
- automatically updatable if it satisfies all of the following conditions:
+ to be used on the view in the same way as on a regular table (aside from
+ the limitations on ON CONFLICT noted below). A view is automatically
+ updatable if it satisfies all of the following conditions:
<itemizedlist>
<listitem>
@@ -383,6 +384,34 @@ CREATE VIEW vista AS SELECT text 'Hello World' AS hello;
not need any permissions on the underlying base relations (see
<xref linkend="rules-privileges">).
</para>
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT</> clause
+ is only supported on updatable views under specific circumstances.
+ If a unique index inference specification has been provided
+ (that is, a set of columns/expressions with which to infer the
+ unique index that arbitrates whether the statement takes the
+ alternative <command>UPDATE</command> or <literal>IGNORE</>
+ path), then the statement is not supported on updatable
+ views.
+ Since this specification is already mandatory for
+ <command>INSERT</command> with <literal>ON CONFLICT UPDATE</>,
+ this implies that only the <literal>ON CONFLICT IGNORE</> variant
+ is supported, and only when there is no such specification. For
+ example:
+ </para>
+ <para>
+<programlisting>
+-- Unsupported:
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'foo') ON CONFLICT (key)
+ UPDATE SET val = EXCLUDED.val;
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'bar') ON CONFLICT (key)
+ IGNORE;
+
+-- Supported (note the omission of "key" column):
+INSERT INTO my_updatable_view(key, val) VALUES(1, 'baz') ON CONFLICT
+ IGNORE;
+</programlisting>
+ </para>
</refsect2>
</refsect1>
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index a3cccb9..a53b0bf 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -24,6 +24,14 @@ PostgreSQL documentation
[ WITH [ RECURSIVE ] <replaceable class="parameter">with_query</replaceable> [, ...] ]
INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] ) [, ...] | <replaceable class="PARAMETER">query</replaceable> }
+ [ ON CONFLICT [ ( { <replaceable class="parameter">column_name_index</replaceable> | ( <replaceable class="parameter">expression_index</replaceable> ) } [, ...] [ WHERE <replaceable class="PARAMETER">index_condition</replaceable> ] ) ]
+ { IGNORE | UPDATE
+ SET { <replaceable class="PARAMETER">column_name</replaceable> = { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } |
+ ( <replaceable class="PARAMETER">column_name</replaceable> [, ...] ) = ( { <replaceable class="PARAMETER">expression</replaceable> | DEFAULT } [, ...] )
+ } [, ...]
+ [ WHERE <replaceable class="PARAMETER">condition</replaceable> ]
+ }
+ ]
[ RETURNING * | <replaceable class="parameter">output_expression</replaceable> [ [ AS ] <replaceable class="parameter">output_name</replaceable> ] [, ...] ]
</synopsis>
</refsynopsisdiv>
@@ -32,9 +40,15 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<title>Description</title>
<para>
- <command>INSERT</command> inserts new rows into a table.
- One can insert one or more rows specified by value expressions,
- or zero or more rows resulting from a query.
+ <command>INSERT</command> inserts new rows into a table. One can
+ insert one or more rows specified by value expressions, or zero or
+ more rows resulting from a query. An alternative path
+ (<literal>IGNORE</literal> or <literal>UPDATE</literal>) can
+ optionally be specified, to be taken in the event of detecting that
+ proceeding with insertion would result in a conflict (i.e. a
+ conflicting tuple already exists). The alternative path is
+ considered individually for each row proposed for insertion, and is
+ taken (or not taken) once per row.
</para>
<para>
@@ -59,25 +73,216 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</para>
<para>
+ The optional <literal>ON CONFLICT</> clause specifies a path to
+ take as an alternative to raising a conflict-related error.
+ <literal>ON CONFLICT IGNORE</> simply avoids inserting any
+ individual row when it is determined that a conflict-related error
+ would otherwise need to be raised. <literal>ON CONFLICT UPDATE</>
+ has the system take an <command>UPDATE</command> path in respect of
+ such rows instead. <literal>ON CONFLICT UPDATE</> guarantees an
+ atomic <command>INSERT</command> or <command>UPDATE</command>
+ outcome - provided there is no incidental error, one of those two
+ outcomes is guaranteed, even under high concurrency.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> optionally accepts a
+ <literal>WHERE</> clause <replaceable>condition</>. When provided,
+ the statement only proceeds with updating if the
+ <replaceable>condition</> is satisfied. Otherwise, unlike a
+ conventional <command>UPDATE</command>, the row is still locked for
+ update. Note that the <replaceable>condition</> is evaluated last,
+ after a conflict has been identified as a candidate to update.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> is effectively an auxiliary query of
+ its parent <command>INSERT</command>. Two special aliases are
+ visible when <literal>ON CONFLICT UPDATE</> is specified -
+ <varname>TARGET</> and <varname>EXCLUDED</>. The first alias is a
+ standard, generic alias for the target relation, while the second
+ alias refers to rows originally proposed for insertion. Both
+ aliases can be used in the auxiliary query targetlist and
+ <literal>WHERE</> clause, while the <varname>TARGET</> alias can be
+ used anywhere within the entire statement (e.g., within the
+ <literal>RETURNING</> clause). This allows expressions (in
+ particular, assignments) to reference rows originally proposed for
+ insertion. Note that the effects of all per-row <literal>BEFORE
+ INSERT</> triggers are carried forward. This is particularly
+ useful for multi-insert <literal>ON CONFLICT UPDATE</> statements;
+ when inserting or updating multiple rows, constants or parameter
+ values need only appear once.
+ </para>
+
+ <para>
+ There are several restrictions on the <literal>ON CONFLICT
+ UPDATE</> clause that do not apply to <command>UPDATE</command>
+ statements. Subqueries may not appear in either the
+ <command>UPDATE</command> targetlist or its <literal>WHERE</>
+ clause (although simple multi-assignment expressions are
+ supported). <literal>WHERE CURRENT OF</> cannot be used. In
+ general, only columns in the target table, and excluded values
+ originally proposed for insertion may be referenced. Operators and
+ functions may be used freely, though.
+ </para>
+
+ <para>
+ <command>INSERT</command> with an <literal>ON CONFLICT UPDATE</>
+ clause is a <quote>deterministic</quote> statement. This means
+ that the command will not be allowed to affect any single existing
+ row more than once; a cardinality violation error will be raised
+ when this situation arises. Rows proposed for insertion should not
+ duplicate each other in terms of attributes constrained by the
+ conflict-arbitrating unique index. Note that the ordinary rules
+ for unique indexes with regard to null apply analogously to whether
+ or not an arbitrating unique index indicates if the alternative
+ path should be taken. This means that when a null value appears in
+ any uniquely constrained tuple's attribute in an
+ <command>INSERT</command> statement with <literal>ON CONFLICT
+ UPDATE</literal>, rows proposed for insertion will never take the
+ alternative path (provided that a <literal>BEFORE ROW
+ INSERT</literal> trigger does not make null values non-null before
+ insertion); the statement will always insert, assuming there is no
+ unrelated error. Note that merely locking a row (by having it not
+ satisfy the <literal>WHERE</> clause <replaceable>condition</>)
+ does not count towards whether or not the row has been affected
+ multiple times (and whether or not a cardinality violation error is
+ raised). However, the implementation checks for cardinality
+ violations after locking the row, and before updating (or
+ considering updating). A cardinality violation may therefore
+ be raised even though the row would not otherwise have gone
+ on to be updated, but only if the existing row was already
+ updated at least once by the same <literal>ON CONFLICT
+ UPDATE</literal> command.
+ </para>
+
+ <para>
+ <literal>ON CONFLICT UPDATE</> requires a <emphasis>unique index
+ inference</emphasis> specification, which consists of one or more
+ <replaceable class="PARAMETER">column_name_index</replaceable>
+ columns and/or <replaceable
+ class="PARAMETER">expression_index</replaceable> expressions on
+ columns, appearing between parentheses. These are used to infer the
+ single unique index to which pre-checking for conflicts is limited (if no
+ appropriate index is available, an error is raised). A subset of
+ the table to limit the check for conflicts to can optionally also
+ be specified using <replaceable
+ class="PARAMETER">index_condition</replaceable>. Note that any
+ available unique index need only cover at least that subset in
+ order to arbitrate taking the alternative path; it need not
+ match exactly, and so a non-partial unique index that otherwise
+ matches is applicable. <literal>ON CONFLICT IGNORE</> makes an
+ inference specification optional; omitting the specification
+ indicates a total indifference to where any conflict could occur,
+ which isn't always appropriate. At times, it may be desirable for
+ <literal>ON CONFLICT IGNORE</> to <emphasis>not</emphasis> suppress
+ a conflict-related error associated with an index where that isn't
+ explicitly anticipated. Note that <literal>ON CONFLICT UPDATE</>
+ assignment may result in a uniqueness violation, just as with a
+ conventional <command>UPDATE</command>.
+ </para>
+
+ <para>
+ Columns and/or expressions appearing in a unique index inference
+ specification must match all the columns/expressions of some
+ existing unique index on <replaceable
+ class="PARAMETER">table_name</replaceable> - there can be no
+ columns/expressions from the unique index that do not appear in the
+ inference specification, nor can there be any columns/expressions
+ appearing in the inference specification that do not appear in the
+ unique index definition. However, the order of the
+ columns/expressions in the index definition, or whether or not the
+ index definition specified <literal>NULLS FIRST</> or
+ <literal>NULLS LAST</>, or the internal sort order of each column
+ (whether <literal>DESC</> or <literal>ASC</> were specified) are
+ all irrelevant. Deferred unique constraints are not supported as
+ arbiters of whether an alternative <literal>ON CONFLICT</> path
+ should be taken.
+ </para>
+
+ <para>
+ The definition of a conflict for the purposes of <literal>ON
+ CONFLICT</> is somewhat subtle, although the exact definition is
+ seldom of great interest. A conflict is either a unique violation
+ from a unique constraint (or unique index), or an exclusion
+ violation from an exclusion constraint. Only unique indexes can be
+ inferred with a unique index inference specification, which is
+ required for the <command>UPDATE</command> variant, so in effect
+ only unique constraints (and unique indexes) are supported by the
+ <command>UPDATE</command> variant. In contrast to the rules around
+ certain other SQL clauses, like the <literal>DISTINCT</literal>
+ clause, the definition of a duplicate (a conflict) is based on
+ whatever unique indexes happen to be defined on columns on the
+ table. This means that if a user-defined type has multiple sort
+ orders, and the "equals" operator of any of those available sort
+ orders happens to be inconsistent (which goes against an unenforced
+ convention of <productname>PostgreSQL</productname>), the exact
+ behavior depends on the choice of operator class when the unique
+ index was created initially, and not any other consideration such
+ as the default operator class for the type of each indexed column.
+ If there are multiple unique indexes available that seem like
+ equally suitable candidates, but with inconsistent definitions of
+ "equals", then the system chooses whatever it estimates to be the
+ cheapest one to use as an arbiter of taking the alternative
+ <command>UPDATE</command>/<literal>IGNORE</literal> path.
+ </para>
+
+ <para>
+ The optional <replaceable
+ class="PARAMETER">index_condition</replaceable> can be used to
+ allow the inference specification to infer that a partial unique
+ index can be used. Any unique index that otherwise satisfies the
+ inference specification, while also covering at least all the rows
+ in the table covered by <replaceable
+ class="PARAMETER">index_condition</replaceable> may be used. It is
+ recommended that the partial index predicate of the unique index
+ intended to be used as the arbiter of taking the alternative path
+ be matched exactly, but this is not required. Note that an error
+ will be raised if an arbiter unique index is chosen that does not
+ cover the tuple or tuples ultimately proposed for insertion.
+ However, an overly specific <replaceable
+ class="PARAMETER">index_condition</replaceable> does not imply that
+ arbitrating conflicts will be limited to the subset of rows covered
+ by the inferred unique index corresponding to <replaceable
+ class="PARAMETER">index_condition</replaceable>.
+ </para>
+
+ <para>
The optional <literal>RETURNING</> clause causes <command>INSERT</>
- to compute and return value(s) based on each row actually inserted.
+ to compute and return value(s) based on each row actually inserted
+ (or updated, if an <literal>ON CONFLICT UPDATE</> clause was used).
This is primarily useful for obtaining values that were supplied by
defaults, such as a serial sequence number. However, any expression
using the table's columns is allowed. The syntax of the
<literal>RETURNING</> list is identical to that of the output list
- of <command>SELECT</>.
+ of <command>SELECT</>. Only rows that were successfully inserted
+ or updated will be returned. If a row was locked but not updated
+ because an <literal>ON CONFLICT UPDATE</> <literal>WHERE</> clause
+ did not pass, the row will not be returned. Since
+ <literal>RETURNING</> is not part of the <command>UPDATE</>
+ auxiliary query, the special <literal>ON CONFLICT UPDATE</> aliases
+ (<varname>TARGET</> and <varname>EXCLUDED</>) may not be
+ referenced; only the row as it exists after updating (or
+ inserting) is returned.
</para>
<para>
You must have <literal>INSERT</literal> privilege on a table in
- order to insert into it. If a column list is specified, you only
- need <literal>INSERT</literal> privilege on the listed columns.
- Use of the <literal>RETURNING</> clause requires <literal>SELECT</>
- privilege on all columns mentioned in <literal>RETURNING</>.
- If you use the <replaceable
- class="PARAMETER">query</replaceable> clause to insert rows from a
- query, you of course need to have <literal>SELECT</literal> privilege on
- any table or column used in the query.
+ order to insert into it, as well as <literal>UPDATE</literal>
+ privilege if and only if <literal>ON CONFLICT UPDATE</>
+ is specified. If a column list is specified, you only need
+ <literal>INSERT</literal> privilege on the listed columns.
+ Similarly, when <literal>ON CONFLICT UPDATE</> is specified, you
+ only need <literal>UPDATE</> privilege on the column(s) that are
+ listed to be updated, as well as <literal>SELECT</> privilege on any column
+ whose values are read in the <literal>ON CONFLICT UPDATE</>
+ expressions or <replaceable>condition</>. Use of the
+ <literal>RETURNING</> clause requires <literal>SELECT</> privilege
+ on all columns mentioned in <literal>RETURNING</>. If you use the
+ <replaceable class="PARAMETER">query</replaceable> clause to insert
+ rows from a query, you of course need to have
+ <literal>SELECT</literal> privilege on any table or column used in
+ the query.
</para>
</refsect1>
@@ -121,7 +326,54 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
The name of a column in the table named by <replaceable class="PARAMETER">table_name</replaceable>.
The column name can be qualified with a subfield name or array
subscript, if needed. (Inserting into only some fields of a
- composite column leaves the other fields null.)
+ composite column leaves the other fields null.) When
+ referencing a column with <literal>ON CONFLICT UPDATE</>, do not
+ include the table's name in the specification of a target
+ column. For example, <literal>INSERT ... ON CONFLICT UPDATE tab
+ SET TARGET.col = 1</> is invalid.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">column_name_index</replaceable></term>
+ <listitem>
+ <para>
+ The name of a <replaceable
+ class="PARAMETER">table_name</replaceable> column (with several
+ columns potentially named). These are used to infer a
+ particular unique index defined on <replaceable
+ class="PARAMETER">table_name</replaceable>. This requires
+ <literal>ON CONFLICT UPDATE</> and <literal>ON CONFLICT
+ IGNORE</> to assume that all expected sources of uniqueness
+ violations originate within the columns/rows constrained by the
+ unique index. When this is omitted (which is forbidden with
+ the <literal>ON CONFLICT UPDATE</> variant), the system checks
+ for sources of uniqueness violations ahead of time in all unique
+ indexes. Otherwise, only a single specified unique index is
+ checked ahead of time, and uniqueness violation errors can
+ appear for conflicts originating in any other unique index. If
+ a unique index cannot be inferred, an error is raised.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">expression_index</replaceable></term>
+ <listitem>
+ <para>
+ Equivalent to <replaceable
+ class="PARAMETER">column_name_index</replaceable>, but used to
+ infer a particular expression index instead.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><replaceable class="PARAMETER">index_condition</replaceable></term>
+ <listitem>
+ <para>
+ Used to allow inference of partial unique indexes.
</para>
</listitem>
</varlistentry>
@@ -167,12 +419,25 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
</varlistentry>
<varlistentry>
+ <term><replaceable class="PARAMETER">condition</replaceable></term>
+ <listitem>
+ <para>
+ An expression that returns a value of type <type>boolean</type>.
+ Only rows for which this expression returns <literal>true</>
+ will be updated, although all rows will be locked when the
+ <literal>ON CONFLICT UPDATE</> path is taken.
+ </para>
+ </listitem>
+ </varlistentry>
+ <varlistentry>
+
<term><replaceable class="PARAMETER">output_expression</replaceable></term>
<listitem>
<para>
An expression to be computed and returned by the <command>INSERT</>
- command after each row is inserted. The expression can use any
- column names of the table named by <replaceable class="PARAMETER">table_name</replaceable>.
+ command after each row is inserted or updated. The
+ expression can use any column names of the table named by
+ <replaceable class="PARAMETER">table_name</replaceable>.
Write <literal>*</> to return all columns of the inserted row(s).
</para>
</listitem>
@@ -198,20 +463,29 @@ INSERT INTO <replaceable class="PARAMETER">table_name</replaceable> [ ( <replace
<screen>
INSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
</screen>
+ However, in the event of an <literal>ON CONFLICT UPDATE</> clause
+ (but <emphasis>not</emphasis> in the event of an <literal>ON
+ CONFLICT IGNORE</> clause), the command tag reports the number of
+ rows inserted or updated together, of the form
+<screen>
+UPSERT <replaceable>oid</replaceable> <replaceable class="parameter">count</replaceable>
+</screen>
The <replaceable class="parameter">count</replaceable> is the number
of rows inserted. If <replaceable class="parameter">count</replaceable>
is exactly one, and the target table has OIDs, then
<replaceable class="parameter">oid</replaceable> is the
- <acronym>OID</acronym> assigned to the inserted row. Otherwise
- <replaceable class="parameter">oid</replaceable> is zero.
+ <acronym>OID</acronym>
+ assigned to the inserted row (but not if there is only a single
+ updated row). Otherwise <replaceable
+ class="parameter">oid</replaceable> is zero.
</para>
<para>
If the <command>INSERT</> command contains a <literal>RETURNING</>
clause, the result will be similar to that of a <command>SELECT</>
statement containing the columns and values defined in the
- <literal>RETURNING</> list, computed over the row(s) inserted by the
- command.
+ <literal>RETURNING</> list, computed over the row(s) inserted or
+ updated by the command.
</para>
</refsect1>
@@ -311,7 +585,63 @@ WITH upd AS (
RETURNING *
)
INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
-</programlisting></para>
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Assumes a unique
+ index has been defined that constrains values appearing in the
+ <literal>did</literal> column. Note that an <varname>EXCLUDED</>
+ expression is used to reference values originally proposed for
+ insertion:
+<programlisting>
+ INSERT INTO distributors (did, dname)
+ VALUES (5, 'Gizmo transglobal'), (6, 'Associated Computing, inc')
+ ON CONFLICT (did) UPDATE SET dname = EXCLUDED.dname;
+</programlisting>
+ </para>
+ <para>
+ Insert a distributor, or do nothing for rows proposed for insertion
+ when an existing, excluded row (a row with a matching constrained
+ column or columns after any before-row insert triggers fire) exists.
+ Example assumes a unique index has been defined that constrains
+ values appearing in the <literal>did</literal> column (although
+ since the <literal>IGNORE</> variant was used, the specification of
+ columns to infer a unique index from is not mandatory):
+<programlisting>
+ INSERT INTO distributors (did, dname) VALUES (7, 'Redline GmbH')
+ ON CONFLICT (did) IGNORE;
+</programlisting>
+ </para>
+ <para>
+ Insert or update new distributors as appropriate. Example assumes
+ a unique index has been defined that constrains values appearing in
+ the <literal>did</literal> column. <literal>WHERE</> clause is
+ used to limit the rows actually updated (any existing row not
+ updated will still be locked, though):
+<programlisting>
+ -- Don't update existing distributors based in a certain ZIP code
+ INSERT INTO distributors (did, dname) VALUES (8, 'Anvil Distribution')
+ ON CONFLICT (did) UPDATE
+ SET dname = EXCLUDED.dname || ' (formerly ' || TARGET.dname || ')'
+ WHERE TARGET.zipcode != '21201';
+</programlisting>
+ </para>
+ <para>
+ Insert new distributor if possible; otherwise
+ <literal>IGNORE</literal>. Example assumes a unique index has been
+ defined that constrains values appearing in the
+ <literal>did</literal> column on a subset of rows where the
+ <literal>is_active</literal> boolean column evaluates to
+ <literal>true</literal>:
+<programlisting>
+ -- This statement could infer a partial unique index on did
+ -- with a predicate of WHERE is_active, but it could also
+ -- just use a regular unique constraint on did if that was
+ -- all that was available.
+ INSERT INTO distributors (did, dname) VALUES (9, 'Antwerp Design')
+ ON CONFLICT (did WHERE is_active) IGNORE;
+</programlisting>
+ </para>
</refsect1>
<refsect1>
@@ -321,7 +651,8 @@ INSERT INTO employees_log SELECT *, current_timestamp FROM upd;
<command>INSERT</command> conforms to the SQL standard, except that
the <literal>RETURNING</> clause is a
<productname>PostgreSQL</productname> extension, as is the ability
- to use <literal>WITH</> with <command>INSERT</>.
+ to use <literal>WITH</> with <command>INSERT</>, and the ability to
+ specify an alternative path with <literal>ON CONFLICT</>.
Also, the case in
which a column name list is omitted, but not all the columns are
filled from the <literal>VALUES</> clause or <replaceable>query</>,
diff --git a/doc/src/sgml/ref/set_constraints.sgml b/doc/src/sgml/ref/set_constraints.sgml
index 7c31871..1e0a2f8 100644
--- a/doc/src/sgml/ref/set_constraints.sgml
+++ b/doc/src/sgml/ref/set_constraints.sgml
@@ -69,7 +69,11 @@ SET CONSTRAINTS { ALL | <replaceable class="parameter">name</replaceable> [, ...
<para>
Currently, only <literal>UNIQUE</>, <literal>PRIMARY KEY</>,
<literal>REFERENCES</> (foreign key), and <literal>EXCLUDE</>
- constraints are affected by this setting.
+ constraints are affected by this setting. Note that deferrable
+ constraints cannot be used as arbiters of
+ whether or not to take the alternative path with an
+ <command>INSERT</command> statement that includes an <literal>ON
+ CONFLICT UPDATE</> clause.
<literal>NOT NULL</> and <literal>CHECK</> constraints are
always checked immediately when a row is inserted or modified
(<emphasis>not</> at the end of the statement).
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f94aea1..5141690 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -40,14 +40,17 @@
On tables and foreign tables, triggers can be defined to execute either
before or after any <command>INSERT</command>, <command>UPDATE</command>,
or <command>DELETE</command> operation, either once per modified row,
- or once per <acronym>SQL</acronym> statement.
- <command>UPDATE</command> triggers can moreover be set to fire only if
- certain columns are mentioned in the <literal>SET</literal> clause of the
- <command>UPDATE</command> statement.
- Triggers can also fire for <command>TRUNCATE</command> statements.
- If a trigger event occurs, the trigger's function is called at the
- appropriate time to handle the event. Foreign tables do not support the
- TRUNCATE statement at all.
+ or once per <acronym>SQL</acronym> statement. If an
+ <command>INSERT</command> contains an <literal>ON CONFLICT UPDATE</>
+ clause, it is possible that the effects of a BEFORE insert trigger and
+ a BEFORE update trigger can both be applied to the same row, if a
+ reference to an <varname>EXCLUDED</> column appears. <command>UPDATE</command>
+ triggers can moreover be set to fire only if certain columns are
+ mentioned in the <literal>SET</literal> clause of the
+ <command>UPDATE</command> statement. Triggers can also fire for
+ <command>TRUNCATE</command> statements. If a trigger event occurs,
+ the trigger's function is called at the appropriate time to handle the
+ event. Foreign tables do not support the TRUNCATE statement at all.
</para>
<para>
@@ -119,6 +122,36 @@
</para>
<para>
+ If an <command>INSERT</command> contains an <literal>ON CONFLICT
+ UPDATE</> clause, it is possible that the effects of all row-level
+ <literal>BEFORE</> <command>INSERT</command> triggers and all
+ row-level BEFORE <command>UPDATE</command> triggers can both be
+ applied in a way that is apparent from the final state of the updated
+ row, if an <varname>EXCLUDED</> column is referenced. There need not
+ be an <varname>EXCLUDED</> column reference for both sets of BEFORE
+ row-level triggers to execute, though. The possibility of surprising
+ outcomes should be considered when there are both <literal>BEFORE</>
+ <command>INSERT</command> and <literal>BEFORE</>
+ <command>UPDATE</command> row-level triggers that both affect a row
+ being inserted/updated (this can still be problematic even if the
+ modifications are more or less equivalent, if they're not also
+ idempotent). Note that statement-level <command>UPDATE</command>
+ triggers are executed when <literal>ON CONFLICT UPDATE</> is
+ specified, regardless of whether or not any rows were affected by
+ the <command>UPDATE</command>. An <command>INSERT</command> with
+ an <literal>ON CONFLICT UPDATE</> clause will execute
+ statement-level <literal>BEFORE</> <command>INSERT</command>
+ triggers first, then statement-level <literal>BEFORE</>
+ <command>UPDATE</command> triggers, followed by statement-level
+ <literal>AFTER</> <command>UPDATE</command> triggers and finally
+ statement-level <literal>AFTER</> <command>INSERT</command>
+ triggers. <literal>ON CONFLICT UPDATE</> is not supported on
+ views (only <literal>ON CONFLICT IGNORE</> is supported on
+ updatable views); therefore, unpredictable interactions with
+ <literal>INSTEAD OF</> triggers are not possible.
+ </para>
+
+ <para>
Trigger functions invoked by per-statement triggers should always
return <symbol>NULL</symbol>. Trigger functions invoked by per-row
triggers can return a table row (a value of
--
1.9.1
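
As a rough illustration of the row-level trigger interaction documented in the trigger.sgml hunk above, here is a toy Python model (not PostgreSQL code). The two trigger functions, the dict-based table and the name upsert_with_triggers are all hypothetical and exist only for this sketch. The assumption being modelled is that the BEFORE INSERT trigger runs on the proposed row before the conflict is known, so its effects reach the updated row through EXCLUDED, and the BEFORE UPDATE trigger is then applied on top.

```python
def before_insert(row):
    """Hypothetical BEFORE INSERT row trigger: tags the proposed row."""
    return {**row, "name": row["name"] + " [ins]"}


def before_update(row):
    """Hypothetical BEFORE UPDATE row trigger: tags the row being updated."""
    return {**row, "name": row["name"] + " [upd]"}


def upsert_with_triggers(table, proposed):
    """Toy model of INSERT ... ON CONFLICT (id) UPDATE SET name = EXCLUDED.name.

    `table` is a plain dict keyed by id, standing in for a table with a
    unique index on id.
    """
    # The BEFORE INSERT trigger fires before we know whether there is a
    # conflict, so EXCLUDED already carries its modifications.
    excluded = before_insert(proposed)
    existing = table.get(excluded["id"])
    if existing is None:
        table[excluded["id"]] = excluded
        return "inserted"
    # Alternative path: SET name = EXCLUDED.name, then BEFORE UPDATE fires.
    updated = {**existing, "name": excluded["name"]}
    table[excluded["id"]] = before_update(updated)
    return "updated"
```

With an empty table, a first call stores a row whose name carries only the insert trigger's tag. A second call with the same id leaves a row whose name carries both tags, which is the "apparent from the final state of the updated row" case the documentation warns about when the triggers are not idempotent.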
0005-Internal-documentation-for-INSERT-.-ON-CONFLICT-UPDA.patch (text/x-patch; charset=US-ASCII)
From 80c9126bc316cbf17e628dd58abaa1f8f646eb58 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:16:11 -0700
Subject: [PATCH 5/6] Internal documentation for INSERT ... ON CONFLICT {UPDATE
| IGNORE}
Includes documentation for executor README. A high-level description of
approach #2 to value locking also appears there since, in contrast with
approach #1, that is something that lives in the head of the executor.
---
src/backend/executor/README | 128 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 128 insertions(+)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 8afa1e3..b5a5c33 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -200,3 +200,131 @@ is no explicit prohibition on SRFs in UPDATE, but the net effect will be
that only the first result row of an SRF counts, because all subsequent
rows will result in attempts to re-update an already updated target row.
This is historical behavior and seems not worth changing.)
+
+Speculative insertion
+---------------------
+
+Speculative insertion is a process that the executor manages for the benefit of
+INSERT...ON CONFLICT UPDATE/IGNORE. Supported indexes include nbtree unique
+indexes (nbtree is currently the only amcanunique index access method), and
+exclusion constraint indexes (exclusion constraints are considered a
+generalization of unique constraints). Only ON CONFLICT IGNORE is supported
+with exclusion constraints.
+
+The primary user-visible goal for INSERT...ON CONFLICT UPDATE is to guarantee
+either an insert or update under normal operating conditions in READ COMMITTED
+mode (where serialization failures are just as unacceptable as they are with
+regular UPDATEs). A would-be conflict (and the associated index) are the
+arbiters of whether or not the alternative (UPDATE/IGNORE) path is taken. The
+implementation more or less tries to update or insert until one or the other of
+those two outcomes occurs successfully. There are some non-obvious hazards
+involved that are carefully avoided. These hazards relate to concurrent
+activity causing conflicts for the implementation, which must be handled.
+
+The index is the authoritative source of truth for whether there is or is not a
+conflict, for unique index enforcement in general, and for speculative
+insertion in particular. The heap must still be considered, though, not least
+since it alone has authoritative visibility information. Through looping, we
+hope to overcome the disconnect between the heap and the arbiter index. We
+must lock the row, and then verify that there is no conflict. Only then do we
+UPDATE. Theoretically, some individual session could loop forever, although
+even under high concurrency at least one session always makes progress.
+
+There are 2 sources of conflicts for ON CONFLICT UPDATE:
+
+1. Conflicts from going to update (having found a conflict during the
+pre-check), and finding the tuple changed (which may or may not involve new,
+distinct constrained values in later tuple versions -- for simplicity, we don't
+bother with considering that). This is not a conflict that the IGNORE variant
+considers.
+
+2. Conflicts from inserting a tuple (having not found a conflict during the
+pre-check), and only then finding a conflict at insertion time (when inserting
+index tuples, and finding a conflicting one when a buffer lock is held on an
+index page in the ordinary course of insertion). This can happen if a
+concurrent insertion occurs after the pre-check, but before physical index
+tuple insertion.
+
+The first step in the loop is to perform a pre-check. The indexes are scanned
+for existing conflicting values. At this point, we may have to wait until the
+end of another xact (or xact's promise token -- more on that later), iff it
+isn't immediately conclusive that there is or is not a conflict (when we finish
+the pre-check, there is a preliminary conclusion about there either being or
+not being a conflict -- but the conclusion only holds if there are no
+subsequent concurrent conflicts). If a conclusively committed conflict tuple
+is detected during the first step, the executor goes to lock and update the row
+(for ON CONFLICT UPDATE -- otherwise, for ON CONFLICT IGNORE, we're done). The
+TID to lock (and potentially UPDATE) can only be determined during the first
+step. If locking the row finds a concurrent conflict (which may be from a
+concurrent UPDATE that hasn't even physically inspected the arbiter index yet)
+then we restart the loop from the very beginning. We restart from scratch
+because all bets are off; it's possible that the process will find no conflict
+the second time around, and will successfully insert, or will UPDATE another
+tuple that is not even part of the same UPDATE chain as the first time around.
+
+The second step (skipped when a conflict is found) is to insert a heap tuple
+and related index tuples opportunistically. This uses the same mechanism as
+deferred unique indexes, and so we never wait for a possibly conflicting xact
+to commit or abort (unlike with conventional unique index insertion) -- we
+simply detect a possible conflict.
+
+When opportunistically inserting during the second step, we are not logically
+inserting a tuple as such. Rather, the process is somewhat similar to the
+conventional unique index insertion steps taken within the nbtree AM, where we
+must briefly lock the *value* being inserted: in that codepath, the value
+proposed for insertion is for an instant locked *in the abstract*, by way of a
+buffer lock on "the first leaf page the value could be on". Then, having
+established the right to physically insert, we do so (or throw an error). For
+speculative insertion, if no conflict occurs during the insertion (which is
+usually the case, since it was just determined in the first step that there was
+no conflict), then we're done. Otherwise, we must restart (and likely find the
+same conflict tuple during the first step of the new iteration). But a
+counter-intuitive step must be taken first (which is what makes this whole
+dance similar to conventional nbtree "value locking").
+
+We must "super delete" the tuple when the opportunistic insertion finds a
+conflict. This means that it immediately becomes invisible to all snapshot
+types, and immediately becomes reclaimable by VACUUM. Other backends
+(speculative inserters or ordinary inserters) know to not wait on our
+transaction end when they encounter an opportunistically inserted "promise tuple".
+Rather, they wait on a corresponding promise token lock, which we hold only for
+as long as opportunistically inserting. We release the lock when done
+opportunistically inserting (and after "super deleting", if that proved
+necessary), releasing our waiters (who will ordinarily re-find our promise
+tuple as a bona fide tuple, or occasionally will find that they can insert
+after all). It's important that other xacts not wait on the end of our xact
+until we've established that we've successfully and conclusively inserted
+logically (or established that there was an insertion conflict, and cleaned up
+after it by "super deleting"). Otherwise, concurrent speculative inserters
+could be involved in "unprincipled deadlocks": deadlocks where there is no
+user-visible mutual dependency, and yet an implementation-related mutual
+dependency is unexpectedly introduced. The user might be left with no
+reasonable way of avoiding these deadlocks, which would not be okay.
+
+Speculative insertion and EvalPlanQual()
+----------------------------------------
+
+Updating the tuple involves locking it first (to establish a definitive tuple
+to consider evaluating the additional UPDATE qual against). The EvalPlanQual()
+mechanism (or, rather, some associated infrastructure) is reused for the
+benefit of auxiliary UPDATE expression evaluation.
+
+Locking first deviates from how conventional UPDATEs work, but allows the
+implementation to consider the possibility of conflicts first, and then, having
+reached a definitive conclusion, separately evaluate.
+
+ExecLockUpdateTuple() is somewhat similar to EvalPlanQual(), except it locks
+the TID reported as conflicting, and upon successfully locking, installs that
+into the UPDATE's EPQ slot. There is no UPDATE chain to walk -- rather, new
+tuples to check the qual against come from continuous attempts at locking a
+tuple conclusively (avoiding conflicts). The qual (if any) is then evaluated.
+Note that at READ COMMITTED, it's possible that *no* version of the tuple is
+visible, and yet it may still be updated. Similarly, since we do not walk the
+UPDATE chain, concurrent READ COMMITTED INSERT ... ON CONFLICT UPDATE sessions
+always attempt to lock the conclusively visible tuple, without regard to any
+other tuple version (repeatable read isolation level and up must consider MVCC
+visibility, though). A further implication of this is that the
+MVCC-snapshot-visible row version is denied the opportunity to prevent the
+UPDATE from taking place, should it not pass our qual (while a later version
+does pass it). This is fundamentally similar to updating a tuple when no
+version is visible, though.
--
1.9.1
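
The loop described in the README text above can be sketched as a small single-threaded Python model. This is only an illustration under stated assumptions, not the executor code: ToyTable, before_insert_hook and upsert are hypothetical names, the "arbiter index" is a plain dict, and a concurrent session is simulated by a one-shot hook that fires between the pre-check and the insertion. Row locking, promise tokens and the first source of conflict (the tuple changing while we go to lock it) are not modelled.

```python
class ToyTable:
    """Toy stand-in for a heap plus one unique arbiter index (key -> value)."""

    def __init__(self):
        self.rows = {}
        self.before_insert_hook = None  # simulates a concurrent session

    def precheck(self, key):
        """Step 1: scan the arbiter index for an existing conflicting key."""
        return key in self.rows

    def speculative_insert(self, key, value):
        """Step 2: insert opportunistically; report a late conflict."""
        if self.before_insert_hook is not None:
            hook, self.before_insert_hook = self.before_insert_hook, None
            hook(self)  # a rival insert lands after our pre-check
        if key in self.rows:
            return False  # conflict found at index insertion time
        self.rows[key] = value
        return True


def upsert(table, key, value, update):
    """Retry until either the insert path or the update path succeeds."""
    attempts = 0
    while True:
        attempts += 1
        if table.precheck(key):
            # Conflict found: the real executor locks the row first and
            # restarts if that fails; this toy updates directly.
            table.rows[key] = update(table.rows[key], value)
            return "updated", attempts
        if table.speculative_insert(key, value):
            return "inserted", attempts
        # Late conflict: the real executor "super deletes" its promise
        # tuple here. The toy never stored one, so it simply restarts.
```

Setting before_insert_hook to a function that inserts the same key makes the first iteration fail at insertion time, and the second iteration then finds the rival row during the pre-check and takes the update path, which mirrors the "restart from scratch" behaviour described above.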
0004-Tests-for-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 3a793993eb04b8589a67fe09df12aaabf3bb7b3c Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:11:15 -0700
Subject: [PATCH 4/6] Tests for INSERT ... ON CONFLICT {UPDATE | IGNORE}
Add dedicated isolation tests for both UPDATE and IGNORE variants,
illustrating the "MVCC violation" that allows a READ COMMITTED
transaction's UPDATE to succeed in updating a tuple with no version
visible to its command's MVCC snapshot. Add regression tests, which for
the most part are intended to exercise interactions with other features
(e.g. updatable views, inheritance, triggers, RLS).
Add a few general purpose smoke tests too, testing everything from
EXPLAIN output to unique index inference (expression indexes, partial
indexes, etc).
---
contrib/postgres_fdw/expected/postgres_fdw.out | 7 +
contrib/postgres_fdw/sql/postgres_fdw.sql | 3 +
.../isolation/expected/insert-conflict-ignore.out | 23 ++
.../expected/insert-conflict-update-2.out | 23 ++
.../expected/insert-conflict-update-3.out | 26 +++
.../isolation/expected/insert-conflict-update.out | 23 ++
src/test/isolation/isolation_schedule | 4 +
.../isolation/specs/insert-conflict-ignore.spec | 41 ++++
.../isolation/specs/insert-conflict-update-2.spec | 41 ++++
.../isolation/specs/insert-conflict-update-3.spec | 69 ++++++
.../isolation/specs/insert-conflict-update.spec | 40 ++++
src/test/regress/expected/insert_conflict.out | 241 +++++++++++++++++++++
src/test/regress/expected/privileges.out | 7 +-
src/test/regress/expected/rowsecurity.out | 90 ++++++++
src/test/regress/expected/rules.out | 21 ++
src/test/regress/expected/subselect.out | 22 ++
src/test/regress/expected/triggers.out | 102 ++++++++-
src/test/regress/expected/updatable_views.out | 4 +
src/test/regress/expected/update.out | 27 +++
src/test/regress/expected/with.out | 74 +++++++
src/test/regress/input/constraints.source | 5 +
src/test/regress/output/constraints.source | 15 +-
src/test/regress/parallel_schedule | 1 +
src/test/regress/serial_schedule | 1 +
src/test/regress/sql/insert_conflict.sql | 192 ++++++++++++++++
src/test/regress/sql/privileges.sql | 5 +-
src/test/regress/sql/rowsecurity.sql | 73 +++++++
src/test/regress/sql/rules.sql | 14 ++
src/test/regress/sql/subselect.sql | 14 ++
src/test/regress/sql/triggers.sql | 69 +++++-
src/test/regress/sql/updatable_views.sql | 2 +
src/test/regress/sql/update.sql | 14 ++
src/test/regress/sql/with.sql | 37 ++++
33 files changed, 1322 insertions(+), 8 deletions(-)
create mode 100644 src/test/isolation/expected/insert-conflict-ignore.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-2.out
create mode 100644 src/test/isolation/expected/insert-conflict-update-3.out
create mode 100644 src/test/isolation/expected/insert-conflict-update.out
create mode 100644 src/test/isolation/specs/insert-conflict-ignore.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-2.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update-3.spec
create mode 100644 src/test/isolation/specs/insert-conflict-update.spec
create mode 100644 src/test/regress/expected/insert_conflict.out
create mode 100644 src/test/regress/sql/insert_conflict.sql
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 583cce7..5133386 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -2327,6 +2327,13 @@ INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
ERROR: duplicate key value violates unique constraint "t1_pkey"
DETAIL: Key ("C 1")=(11) already exists.
CONTEXT: Remote SQL command: INSERT INTO "S 1"."T 1"("C 1", c2, c3, c4, c5, c6, c7, c8) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
+ERROR: relation "ft1" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
ERROR: new row for relation "T 1" violates check constraint "c2positive"
DETAIL: Failing row contains (1111, -2, null, null, null, null, ft1 , null).
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql
index 83e8fa7..e01d34e 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -372,6 +372,9 @@ UPDATE ft2 SET c2 = c2 + 600 WHERE c1 % 10 = 8 AND c1 < 1200 RETURNING *;
ALTER TABLE "S 1"."T 1" ADD CONSTRAINT c2positive CHECK (c2 >= 0);
INSERT INTO ft1(c1, c2) VALUES(11, 12); -- duplicate key
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT IGNORE; -- works
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) IGNORE; -- unsupported
+INSERT INTO ft1(c1, c2) VALUES(11, 12) ON CONFLICT (c1, c2) UPDATE SET c3 = 'ffg'; -- unsupported
INSERT INTO ft1(c1, c2) VALUES(1111, -2); -- c2positive
UPDATE ft1 SET c2 = -c2 WHERE c1 = 1; -- c2positive
diff --git a/src/test/isolation/expected/insert-conflict-ignore.out b/src/test/isolation/expected/insert-conflict-ignore.out
new file mode 100644
index 0000000..e6cc2a1
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-ignore.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: ignore1 ignore2 c1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step c1: COMMIT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore1
+step c2: COMMIT;
+
+starting permutation: ignore1 ignore2 a1 select2 c2
+step ignore1: INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE;
+step ignore2: INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; <waiting ...>
+step a1: ABORT;
+step ignore2: <... completed>
+step select2: SELECT * FROM ints;
+key val
+
+1 ignore2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-2.out b/src/test/isolation/expected/insert-conflict-update-2.out
new file mode 100644
index 0000000..6a5ddfe
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-2.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key payload
+
+FOOFOO insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update-3.out b/src/test/isolation/expected/insert-conflict-update-3.out
new file mode 100644
index 0000000..29dd8b0
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update-3.out
@@ -0,0 +1,26 @@
+Parsed test spec with 2 sessions
+
+starting permutation: update2 insert1 c2 select1surprise c1
+step update2: UPDATE colors SET is_active = true WHERE key = 1;
+step insert1:
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key; <waiting ...>
+step c2: COMMIT;
+step insert1: <... completed>
+key color is_active
+
+1 Red f
+2 Green f
+3 Blue f
+step select1surprise: SELECT * FROM colors ORDER BY key;
+key color is_active
+
+1 Brown t
+2 Green f
+3 Blue f
+step c1: COMMIT;
diff --git a/src/test/isolation/expected/insert-conflict-update.out b/src/test/isolation/expected/insert-conflict-update.out
new file mode 100644
index 0000000..6976124
--- /dev/null
+++ b/src/test/isolation/expected/insert-conflict-update.out
@@ -0,0 +1,23 @@
+Parsed test spec with 2 sessions
+
+starting permutation: insert1 insert2 c1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step c1: COMMIT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert1 updated by insert2
+step c2: COMMIT;
+
+starting permutation: insert1 insert2 a1 select2 c2
+step insert1: INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1';
+step insert2: INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; <waiting ...>
+step a1: ABORT;
+step insert2: <... completed>
+step select2: SELECT * FROM upsert;
+key val
+
+1 insert2
+step c2: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index c055a53..50948a2 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -16,6 +16,10 @@ test: fk-deadlock2
test: eval-plan-qual
test: lock-update-delete
test: lock-update-traversal
+test: insert-conflict-ignore
+test: insert-conflict-update
+test: insert-conflict-update-2
+test: insert-conflict-update-3
test: delete-abort-savept
test: delete-abort-savept-2
test: aborted-keyrevoke
diff --git a/src/test/isolation/specs/insert-conflict-ignore.spec b/src/test/isolation/specs/insert-conflict-ignore.spec
new file mode 100644
index 0000000..fde43b3
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-ignore.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT IGNORE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions during INSERT...ON CONFLICT IGNORE.
+#
+# The convention here is that session 1 always inserts first, and session 2
+# ends up ignoring unless session 1 aborts, in which case session 2 inserts.
+
+setup
+{
+ CREATE TABLE ints (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE ints;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore1" { INSERT INTO ints(key, val) VALUES(1, 'ignore1') ON CONFLICT IGNORE; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "ignore2" { INSERT INTO ints(key, val) VALUES(1, 'ignore2') ON CONFLICT IGNORE; }
+step "select2" { SELECT * FROM ints; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# Regular case where one session block-waits on another to determine if it
+# should proceed with an insert or ignore.
+permutation "ignore1" "ignore2" "c1" "select2" "c2"
+permutation "ignore1" "ignore2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-2.spec b/src/test/isolation/specs/insert-conflict-update-2.spec
new file mode 100644
index 0000000..3e6e944
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-2.spec
@@ -0,0 +1,41 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test shows a plausible scenario in which the user might wish to UPDATE a
+# value that is also constrained by the unique index that is the arbiter of
+# whether the alternative path should be taken.
+
+setup
+{
+ CREATE TABLE upsert (key text not null, payload text);
+ CREATE UNIQUE INDEX ON upsert(lower(key));
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, payload) VALUES('FooFoo', 'insert1') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, payload) VALUES('FOOFOO', 'insert2') ON CONFLICT (lower(key)) UPDATE set key = EXCLUDED.key, payload = TARGET.payload || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. The user can still usefully UPDATE
+# a column constrained by a unique index, as the example illustrates.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/isolation/specs/insert-conflict-update-3.spec b/src/test/isolation/specs/insert-conflict-update-3.spec
new file mode 100644
index 0000000..94ae3df
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update-3.spec
@@ -0,0 +1,69 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# Other INSERT...ON CONFLICT UPDATE isolation tests illustrate the "MVCC
+# violation" added to facilitate the feature, whereby a
+# not-visible-to-our-snapshot tuple can be updated by our command all the same.
+# This is generally needed to provide a guarantee of a successful INSERT or
+# UPDATE in READ COMMITTED mode. This MVCC violation is quite distinct from
+# the putative "MVCC violation" that has existed in PostgreSQL for many years,
+# the EvalPlanQual() mechanism, because that mechanism always starts from a
+# tuple that is visible to the command's MVCC snapshot. This test illustrates
+# a slightly distinct user-visible consequence of the same MVCC violation
+# generally associated with INSERT...ON CONFLICT UPDATE. The impact of the
+# MVCC violation goes a little beyond updating MVCC-invisible tuples.
+#
+# With INSERT...ON CONFLICT UPDATE, the UPDATE predicate is only evaluated
+# once, on this conclusively-locked tuple, and not any other version of the
+# same tuple. It is therefore possible (in READ COMMITTED mode) that the
+# predicate "fails to be satisfied" according to the command's MVCC snapshot.
+# It might simply be that there is no row version visible, but it's also
+# possible that there is some row version visible, but only as a version that
+# doesn't satisfy the predicate. If, however, the conclusively-locked version
+# satisfies the predicate, that's good enough, and the tuple is updated. The
+# MVCC-snapshot-visible row version is denied the opportunity to prevent the
+# UPDATE from taking place, because we don't walk the UPDATE chain in the usual
+# way.
+
+setup
+{
+ CREATE TABLE colors (key int4 PRIMARY KEY, color text, is_active boolean);
+ INSERT INTO colors (key, color, is_active) VALUES(1, 'Red', false);
+ INSERT INTO colors (key, color, is_active) VALUES(2, 'Green', false);
+ INSERT INTO colors (key, color, is_active) VALUES(3, 'Blue', false);
+}
+
+teardown
+{
+ DROP TABLE colors;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" {
+ WITH t AS (
+ INSERT INTO colors(key, color, is_active)
+ VALUES(1, 'Brown', true), (2, 'Gray', true)
+ ON CONFLICT (key) UPDATE
+ SET color = EXCLUDED.color
+ WHERE TARGET.is_active)
+ SELECT * FROM colors ORDER BY key;}
+step "select1surprise" { SELECT * FROM colors ORDER BY key; }
+step "c1" { COMMIT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "update2" { UPDATE colors SET is_active = true WHERE key = 1; }
+step "c2" { COMMIT; }
+
+# Perhaps surprisingly, the session 1 MVCC-snapshot-visible tuple (the tuple
+# with the pre-populated color 'Red') is denied the opportunity to prevent the
+# UPDATE from taking place -- only the conclusively-locked tuple version
+# matters, and so the tuple with key value 1 was updated to 'Brown' (but not
+# the tuple with key value 2, since nothing changed there):
+permutation "update2" "insert1" "c2" "select1surprise" "c1"
diff --git a/src/test/isolation/specs/insert-conflict-update.spec b/src/test/isolation/specs/insert-conflict-update.spec
new file mode 100644
index 0000000..6529a0c
--- /dev/null
+++ b/src/test/isolation/specs/insert-conflict-update.spec
@@ -0,0 +1,40 @@
+# INSERT...ON CONFLICT UPDATE test
+#
+# This test tries to expose problems with the interaction between concurrent
+# sessions.
+
+setup
+{
+ CREATE TABLE upsert (key int primary key, val text);
+}
+
+teardown
+{
+ DROP TABLE upsert;
+}
+
+session "s1"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert1" { INSERT INTO upsert(key, val) VALUES(1, 'insert1') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert1'; }
+step "c1" { COMMIT; }
+step "a1" { ABORT; }
+
+session "s2"
+setup
+{
+ BEGIN ISOLATION LEVEL READ COMMITTED;
+}
+step "insert2" { INSERT INTO upsert(key, val) VALUES(1, 'insert2') ON CONFLICT (key) UPDATE set val = TARGET.val || ' updated by insert2'; }
+step "select2" { SELECT * FROM upsert; }
+step "c2" { COMMIT; }
+step "a2" { ABORT; }
+
+# One session (session 2) block-waits on another (session 1) to determine if it
+# should proceed with an insert or update. Notably, this entails updating a
+# tuple while there is no version of that tuple visible to the updating
+# session's snapshot. This is permitted only in READ COMMITTED mode.
+permutation "insert1" "insert2" "c1" "select2" "c2"
+permutation "insert1" "insert2" "a1" "select2" "c2"
diff --git a/src/test/regress/expected/insert_conflict.out b/src/test/regress/expected/insert_conflict.out
new file mode 100644
index 0000000..c192bd3
--- /dev/null
+++ b/src/test/regress/expected/insert_conflict.out
@@ -0,0 +1,241 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+ QUERY PLAN
+----------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+(3 rows)
+
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+ QUERY PLAN
+----------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+ Filter: (fruit <> 'Cawesh'::text)
+(4 rows)
+
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+ QUERY PLAN
+----------------------------------------------------------
+ Insert on insertconflicttest target
+ -> Result
+ -> Conflict Update on insertconflicttest target
+ Filter: ((excluded.fruit) <> 'Elderberry'::text)
+(4 rows)
+
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+ QUERY PLAN
+--------------------------------------------------
+ [ +
+ { +
+ "Plan": { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Insert", +
+ "Relation Name": "insertconflicttest", +
+ "Alias": "target", +
+ "Arbiter Index": "key_index", +
+ "Plans": [ +
+ { +
+ "Node Type": "Result", +
+ "Parent Relationship": "Member" +
+ }, +
+ { +
+ "Node Type": "ModifyTable", +
+ "Operation": "Conflict Update", +
+ "Parent Relationship": "Member", +
+ "Relation Name": "insertconflicttest",+
+ "Alias": "target", +
+ "Filter": "(fruit <> 'Lime'::text)" +
+ } +
+ ] +
+ } +
+ } +
+ ]
+(1 row)
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+ERROR: ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from
+LINE 1: ...nsert into insertconflicttest values (1, 'Apple') on conflic...
+ ^
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+ERROR: invalid reference to FROM-clause entry for table "insertconflicttest"
+LINE 1: ...(1, 'Apple') on conflict (key) update set fruit = insertconf...
+ ^
+HINT: Perhaps you meant to reference the table alias "excluded".
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index key_index;
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index comp_key_index;
+--
+-- Partial index tests, no inference predicate specified
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_key_index;
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+drop index expr_comp_key_index;
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('Peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+ERROR: duplicate key value violates unique constraint "fruit_index"
+DETAIL: Key (fruit)=(Peach) already exists.
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+drop index key_index;
+drop index fruit_index;
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+ERROR: could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+ERROR: partial arbiter unique index has predicate that does not cover tuple proposed for insertion
+DETAIL: ON CONFLICT inference clause implies that the tuple proposed for insertion must be covered by predicate for partial index "partial_key_index".
+drop index partial_key_index;
+-- Cleanup
+drop table insertconflicttest;
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+create table capitals (
+ state char(2)
+) inherits (cities);
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+-- Tests proper for inheritance:
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+ERROR: relation "cities" has inheritance children
+HINT: Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.
+-- Succeeds:
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+-- clean up
+drop table capitals;
+drop table cities;
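(Reviewer aid, not part of the patch.) The inference results in insert_conflict.out are consistent with a simple rule: the set of columns/expressions given must equal the index's key set (order and duplicates ignored), and a partial index only qualifies when its predicate is covered by the inference predicate. Below is a minimal sketch of that rule. It is an assumption-laden toy, not the planner's logic: infer_arbiter is a hypothetical name, expressions are compared as plain strings, and predicate implication is reduced to "every index conjunct appears verbatim in the inference predicate".

```python
def infer_arbiter(columns, predicate, indexes):
    """Return the name of the first matching unique index, or None.

    columns:   column names / expression strings from ON CONFLICT (...)
    predicate: conjunct strings from the inference WHERE clause ([] if none)
    indexes:   list of (name, key_elements, index_predicate_conjuncts)
    """
    wanted = frozenset(columns)          # ordering and duplicates do not matter
    for name, keys, index_pred in indexes:
        if frozenset(keys) != wanted:
            continue                     # key set must match exactly
        # Toy stand-in for predicate implication: each conjunct of the
        # partial index predicate must be present in the inference predicate.
        if not set(index_pred) <= set(predicate):
            continue
        return name
    return None
```

For example, ["key", "key", "key"] matches a unique index on (key), while ["key"] alone does not match a composite index on (key, fruit).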
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 74b0450..bc44c45 100644
--- a/src/test/regress/expected/privileges.out
+++ b/src/test/regress/expected/privileges.out
@@ -269,7 +269,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
ERROR: permission denied for relation atest2
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -367,6 +367,11 @@ UPDATE atest5 SET one = 8; -- fail
ERROR: permission denied for relation atest5
UPDATE atest5 SET three = 5, one = 2; -- fail
ERROR: permission denied for relation atest5
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+ERROR: permission denied for relation atest5
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
+ERROR: permission denied for relation atest5
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
GRANT SELECT (one,two,blue) ON atest6 TO regressuser4;
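(Reviewer aid, not part of the patch.) The three new privileges.out cases fit a model where the statement needs INSERT privilege on every column in the INSERT column list and UPDATE privilege on every column assigned in the SET list. The sketch below assumes both checks happen up front, whichever path is later taken; that is my reading and the tests here don't prove it. check_upsert_privileges and the grants dict are hypothetical, and privileges on columns that are merely read are ignored.

```python
def check_upsert_privileges(grants, insert_cols, update_cols):
    """grants maps a privilege name ("INSERT", "UPDATE") to a set of columns."""
    # Columns named in the INSERT column list need INSERT privilege.
    if not set(insert_cols) <= grants.get("INSERT", set()):
        return "permission denied (INSERT)"
    # Columns assigned in ON CONFLICT ... UPDATE SET need UPDATE privilege.
    if not set(update_cols) <= grants.get("UPDATE", set()):
        return "permission denied (UPDATE)"
    return "ok"
```

With regressuser4's grants (INSERT (two), UPDATE (three)), this returns the same verdicts as the three statements above.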
diff --git a/src/test/regress/expected/rowsecurity.out b/src/test/regress/expected/rowsecurity.out
index 21817d8..07cb54f 100644
--- a/src/test/regress/expected/rowsecurity.out
+++ b/src/test/regress/expected/rowsecurity.out
@@ -1179,6 +1179,96 @@ NOTICE: f_leak => yyyyyy
(3 rows)
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- INSERT path is taken here, so UPDATE targetlist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------
+ 33 | 22 | 1 | rls_regress_user1 | okay science fiction
+(1 row)
+
+-- UPDATE path will now be taken for same query, so UPDATE targetlist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+SET SESSION AUTHORIZATION rls_regress_user1;
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------
+ 2 | 11 | 2 | rls_regress_user1 | my first novel
+(1 row)
+
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+------------------
+ 78 | 11 | 1 | rls_regress_user1 | some other novel
+(1 row)
+
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+ERROR: new row violates WITH CHECK OPTION for "document"
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+ did | cid | dlevel | dauthor | dtitle
+-----+-----+--------+-------------------+----------------------------------
+ 88 | 33 | 1 | rls_regress_user1 | technology book, can only insert
+(1 row)
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index d50b103..c634579 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1123,6 +1123,10 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
SELECT * FROM shoelace_obsolete WHERE sl_avail = 0;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
+ERROR: INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
sl_name | sl_avail | sl_color | sl_len | sl_unit | sl_len_cm
------------+----------+------------+--------+----------+-----------
@@ -2351,6 +2355,23 @@ DETAIL: Key (id3a, id3c)=(1, 13) is not present in table "rule_and_refint_t2".
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+ERROR: insert or update on table "rule_and_refint_t3" violates foreign key constraint "rule_and_refint_t3_id3a_fkey"
+DETAIL: Key (id3a, id3b)=(1, 13) is not present in table "rule_and_refint_t1".
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
+ERROR: relation "shoelace" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
where (((rule_and_refint_t3.id3a = new.id3a)
diff --git a/src/test/regress/expected/subselect.out b/src/test/regress/expected/subselect.out
index b14410f..9ba3a44 100644
--- a/src/test/regress/expected/subselect.out
+++ b/src/test/regress/expected/subselect.out
@@ -639,6 +639,28 @@ from
(0 rows)
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 1: ...conflict (key) update set val = 'unsupported ' || (select f1...
+ ^
+select * from upsert;
+ key | val
+-----+-----
+ 1 | val
+(1 row)
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: on conflict (key) update set val = (select u from aa)
+ ^
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
create temp table outer_7597 (f1 int4, f2 int4);
diff --git a/src/test/regress/expected/triggers.out b/src/test/regress/expected/triggers.out
index f1a5fde..77dfa06 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -274,7 +274,7 @@ drop sequence ttdummy_seq;
-- tests for per-statement triggers
--
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
BEGIN
@@ -291,6 +291,14 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
--
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
+NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+NOTICE: trigger_func(after_ins_stmt) called: action = INSERT, when = AFTER, level = STATEMENT
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
INSERT INTO main_table DEFAULT VALUES;
@@ -305,6 +313,8 @@ NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, lev
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
NOTICE: trigger_func(after_upd_stmt) called: action = UPDATE, when = AFTER, level = STATEMENT
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
NOTICE: trigger_func(before_ins_stmt) called: action = INSERT, when = BEFORE, level = STATEMENT
@@ -1731,3 +1741,93 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', old.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (1,black)
+WARNING: after insert (new): (1,black)
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (2,red)
+WARNING: before insert (new, modified): (3,"red trig modified")
+WARNING: after insert (new): (3,"red trig modified")
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (3,orange)
+WARNING: before update (old): (3,"red trig modified")
+WARNING: before update (new): (3,"updated red trig modified")
+WARNING: after update (old): (3,"red trig modified")
+WARNING: after update (new): (3,"updated red trig modified")
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (4,green)
+WARNING: before insert (new, modified): (5,"green trig modified")
+WARNING: after insert (new): (5,"green trig modified")
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (5,purple)
+WARNING: before update (old): (5,"green trig modified")
+WARNING: before update (new): (5,"updated green trig modified")
+WARNING: after update (old): (5,"green trig modified")
+WARNING: after update (new): (5,"updated green trig modified")
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (6,white)
+WARNING: before insert (new, modified): (7,"white trig modified")
+WARNING: after insert (new): (7,"white trig modified")
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (7,pink)
+WARNING: before update (old): (7,"white trig modified")
+WARNING: before update (new): (7,"updated white trig modified")
+WARNING: after update (old): (7,"white trig modified")
+WARNING: after update (new): (7,"updated white trig modified")
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+WARNING: before insert (new): (8,yellow)
+WARNING: before insert (new, modified): (9,"yellow trig modified")
+WARNING: after insert (new): (9,"yellow trig modified")
+select * from upsert;
+ key | color
+-----+-----------------------------
+ 1 | black
+ 3 | updated red trig modified
+ 5 | updated green trig modified
+ 7 | updated white trig modified
+ 9 | yellow trig modified
+(5 rows)
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
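(Reviewer aid, not part of the patch.) The trigger test above hinges on the BEFORE INSERT row trigger firing before the conflict check, so the key the trigger rewrites is the one that arbitrates. A small sketch that replays the eight statements under that reading follows. The function names and the dict standing in for the table are mine, and only the key/color bookkeeping is modelled (no WARNING output, no after triggers).

```python
def before_insert_trigger(key, color):
    """Mirror of upsert_before_func()'s INSERT branch: bump even keys."""
    if key % 2 == 0:
        return key + 1, color + " trig modified"
    return key, color

def upsert(table, key, color):
    # The BEFORE INSERT trigger runs first, on the tuple proposed for insertion.
    key, color = before_insert_trigger(key, color)
    if key in table:
        # Conflict on the (possibly modified) key: take the UPDATE path,
        # i.e. color = 'updated ' || target.color
        table[key] = "updated " + table[key]
    else:
        table[key] = color

def run():
    table = {}
    statements = [(1, "black"), (2, "red"), (3, "orange"), (4, "green"),
                  (5, "purple"), (6, "white"), (7, "pink"), (8, "yellow")]
    for key, color in statements:
        upsert(table, key, color)
    return table
```

run() ends with the same five rows as the final "select * from upsert" in the expected output.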
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index 80c5706..22b5bc1 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -215,6 +215,10 @@ INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
DETAIL: View columns that are not columns of their base relation are not updatable.
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
+ERROR: relation "rw_view15" is not an ordinary table
+HINT: Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
ERROR: cannot insert into column "upper" of view "rw_view15"
diff --git a/src/test/regress/expected/update.out b/src/test/regress/expected/update.out
index 1de2a86..58714ac 100644
--- a/src/test/regress/expected/update.out
+++ b/src/test/regress/expected/update.out
@@ -147,4 +147,31 @@ SELECT a, b, char_length(c) FROM update_test;
42 | 12 | 10000
(4 rows)
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+-- fail, update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a NOT IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE EXISTS(SELECT b FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a IN (SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ALL(SELECT a FROM update_test);
+ ^
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 2: WHERE a = ANY(SELECT a FROM update_test);
+ ^
DROP TABLE update_test;
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index 06b372b..81d664e 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -1806,6 +1806,80 @@ SELECT * FROM y;
-400
(22 rows)
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+ k | v | a
+---+--------+---
+ 0 | insert | 0
+ 0 | insert | 0
+(2 rows)
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+ k | v
+----+------------------
+ 0 | insert
+ 1 | 1 v, now update
+ 2 | insert
+ 3 | insert
+ 4 | 4 v, now update
+ 5 | insert
+ 6 | insert
+ 7 | 7 v, now update
+ 8 | insert
+ 9 | insert
+ 10 | 10 v, now update
+ 11 | insert
+ 12 | insert
+ 13 | 13 v, now update
+ 14 | insert
+ 15 | insert
+ 16 | 16 v, now update
+(17 rows)
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ...ICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a ...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+ERROR: cannot use subquery in ON CONFLICT UPDATE
+LINE 3: ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM a...
+ ^
+DROP TABLE z;
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
INSERT INTO y SELECT generate_series(1, 3);
diff --git a/src/test/regress/input/constraints.source b/src/test/regress/input/constraints.source
index 8ec0054..46bce36 100644
--- a/src/test/regress/input/constraints.source
+++ b/src/test/regress/input/constraints.source
@@ -292,6 +292,11 @@ INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+
SELECT '' AS five, * FROM UNIQUE_TBL;
DROP TABLE UNIQUE_TBL;
diff --git a/src/test/regress/output/constraints.source b/src/test/regress/output/constraints.source
index 0d32a9eab..add3f0c 100644
--- a/src/test/regress/output/constraints.source
+++ b/src/test/regress/output/constraints.source
@@ -421,16 +421,23 @@ INSERT INTO UNIQUE_TBL VALUES (4, 'four');
INSERT INTO UNIQUE_TBL VALUES (5, 'one');
INSERT INTO UNIQUE_TBL (t) VALUES ('six');
INSERT INTO UNIQUE_TBL (t) VALUES ('seven');
+INSERT INTO UNIQUE_TBL VALUES (5, 'five-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'five-upsert-update';
+INSERT INTO UNIQUE_TBL VALUES (6, 'six-upsert-insert') ON CONFLICT (i) UPDATE SET t = 'six-upsert-update';
+-- should fail
+INSERT INTO UNIQUE_TBL VALUES (1, 'a'), (2, 'b'), (2, 'b') ON CONFLICT (i) UPDATE SET t = 'fails';
+ERROR: ON CONFLICT UPDATE command could not lock/update self-inserted tuple
+HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
SELECT '' AS five, * FROM UNIQUE_TBL;
- five | i | t
-------+---+-------
+ five | i | t
+------+---+--------------------
| 1 | one
| 2 | two
| 4 | four
- | 5 | one
| | six
| | seven
-(6 rows)
+ | 5 | five-upsert-update
+ | 6 | six-upsert-insert
+(7 rows)
DROP TABLE UNIQUE_TBL;
CREATE TABLE UNIQUE_TBL (i int, t text,
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index e0ae2f2..528d3b7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -36,6 +36,7 @@ test: geometry horology regex oidjoins type_sanity opr_sanity
# These four each depend on the previous one
# ----------
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/serial_schedule b/src/test/regress/serial_schedule
index 7f762bd..b7c8f53 100644
--- a/src/test/regress/serial_schedule
+++ b/src/test/regress/serial_schedule
@@ -50,6 +50,7 @@ test: oidjoins
test: type_sanity
test: opr_sanity
test: insert
+test: insert_conflict
test: create_function_1
test: create_type
test: create_table
diff --git a/src/test/regress/sql/insert_conflict.sql b/src/test/regress/sql/insert_conflict.sql
new file mode 100644
index 0000000..472d4ab
--- /dev/null
+++ b/src/test/regress/sql/insert_conflict.sql
@@ -0,0 +1,192 @@
+--
+-- insert...on conflict update unique index inference
+--
+create table insertconflicttest(key int4, fruit text);
+
+--
+-- Single key tests
+--
+create unique index key_index on insertconflicttest(key);
+
+--
+-- Explain tests
+--
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit;
+-- Should display qual actually attributable to internal sequential scan:
+explain (costs off) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Cawesh';
+-- With EXCLUDED.* expression in scan node:
+explain (costs off) insert into insertconflicttest values(0, 'Crowberry') on conflict (key) update set fruit = excluded.fruit where excluded.fruit != 'Elderberry';
+-- Does the same, but JSON format shows "Arbiter Index":
+explain (costs off, format json) insert into insertconflicttest values (0, 'Bilberry') on conflict (key) update set fruit = excluded.fruit where target.fruit != 'Lime' returning *;
+
+-- Fails (no unique index inference specification, required for update variant):
+insert into insertconflicttest values (1, 'Apple') on conflict update set fruit = excluded.fruit;
+
+-- inference succeeds:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (2, 'Orange') on conflict (key, key, key) update set fruit = excluded.fruit;
+
+-- Succeed, since multi-assignment does not involve subquery:
+INSERT INTO insertconflicttest
+VALUES (1, 'Apple'), (2, 'Orange')
+ON CONFLICT (key) UPDATE SET (fruit, key) = (EXCLUDED.fruit, EXCLUDED.key);
+-- Don't accept original table name -- only TARGET.* alias:
+insert into insertconflicttest values (1, 'Apple') on conflict (key) update set fruit = insertconflicttest.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (3, 'Kiwi') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (4, 'Mango') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (5, 'Lemon') on conflict (fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (6, 'Passionfruit') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+
+drop index key_index;
+
+--
+-- Composite key tests
+--
+create unique index comp_key_index on insertconflicttest(key, fruit);
+
+-- inference succeeds:
+insert into insertconflicttest values (7, 'Raspberry') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (8, 'Lime') on conflict (fruit, key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (9, 'Banana') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (10, 'Blueberry') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (11, 'Cherry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (12, 'Date') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index comp_key_index;
+
+--
+-- Partial index tests, no inference predicate specificied
+--
+create unique index part_comp_key_index on insertconflicttest(key, fruit) where key < 5;
+create unique index expr_part_comp_key_index on insertconflicttest(key, lower(fruit)) where key < 5;
+
+-- inference fails:
+insert into insertconflicttest values (13, 'Grape') on conflict (key, fruit) update set fruit = excluded.fruit;
+insert into insertconflicttest values (14, 'Raisin') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (15, 'Cranberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (16, 'Melon') on conflict (key, key, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (17, 'Mulberry') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (18, 'Pineapple') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+
+drop index part_comp_key_index;
+drop index expr_part_comp_key_index;
+
+--
+-- Expression index tests
+--
+create unique index expr_key_index on insertconflicttest(lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (20, 'Quince') on conflict (lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (21, 'Pomegranate') on conflict (lower(fruit), lower(fruit)) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (22, 'Apricot') on conflict (upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index expr_key_index;
+
+--
+-- Expression index tests (with regular column)
+--
+create unique index expr_comp_key_index on insertconflicttest(key, lower(fruit));
+
+-- inference succeeds:
+insert into insertconflicttest values (24, 'Plum') on conflict (key, lower(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (25, 'Peach') on conflict (lower(fruit), key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (26, 'Fig') on conflict (lower(fruit), key, lower(fruit), key) update set fruit = excluded.fruit;
+
+-- inference fails:
+insert into insertconflicttest values (27, 'Prune') on conflict (key, upper(fruit)) update set fruit = excluded.fruit;
+insert into insertconflicttest values (28, 'Redcurrant') on conflict (fruit, key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (29, 'Nectarine') on conflict (key) update set fruit = excluded.fruit;
+
+drop index expr_comp_key_index;
+
+--
+-- Non-spurious duplicate violation tests
+--
+create unique index key_index on insertconflicttest(key);
+create unique index fruit_index on insertconflicttest(fruit);
+
+-- succeeds, since UPDATE happens to update "fruit" to existing value:
+insert into insertconflicttest values (26, 'Fig') on conflict (key) update set fruit = excluded.fruit;
+-- fails, since UPDATE is to row with key value 26, and we're updating "fruit"
+-- to a value that happens to exist in another row ('peach'):
+insert into insertconflicttest values (26, 'Peach') on conflict (key) update set fruit = excluded.fruit;
+-- succeeds, since "key" isn't repeated/referenced in UPDATE, and "fruit"
+-- arbitrates that statement updates existing "Fig" row:
+insert into insertconflicttest values (25, 'Fig') on conflict (fruit) update set fruit = excluded.fruit;
+
+drop index key_index;
+drop index fruit_index;
+
+--
+-- Test partial unique index inference
+--
+create unique index partial_key_index on insertconflicttest(key) where fruit like '%berry';
+
+-- Succeeds
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' and fruit = 'inconsequential') ignore;
+
+-- fails
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key) update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (key where fruit like '%berry' or fruit = 'consequential') ignore;
+insert into insertconflicttest values (23, 'Blackberry') on conflict (fruit where fruit like '%berry') update set fruit = excluded.fruit;
+insert into insertconflicttest values (23, 'Uncovered by Index') on conflict (key where fruit like '%berry') ignore;
+
+drop index partial_key_index;
+
+-- Cleanup
+drop table insertconflicttest;
+
+-- ******************************************************************
+-- * *
+-- * Test inheritance (example taken from tutorial) *
+-- * *
+-- ******************************************************************
+create table cities (
+ name text,
+ population float8,
+ altitude int -- (in ft)
+);
+
+create table capitals (
+ state char(2)
+) inherits (cities);
+
+-- Create unique indexes. Due to a general limitation of inheritance,
+-- uniqueness is only enforced per-relation
+create unique index cities_names_unique on cities (name);
+create unique index capitals_names_unique on capitals (name);
+
+-- prepopulate the tables.
+insert into cities values ('San Francisco', 7.24E+5, 63);
+insert into cities values ('Las Vegas', 2.583E+5, 2174);
+insert into cities values ('Mariposa', 1200, 1953);
+
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA');
+insert into capitals values ('Madison', 1.913E+5, 845, 'WI');
+
+-- Tests proper for inheritance:
+
+-- fails:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) update set altitude = excluded.altitude;
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict (name) ignore;
+
+-- Succeeds:
+
+-- There is at least limited support for relations with children:
+insert into cities values ('Las Vegas', 2.583E+5, 2174) on conflict ignore;
+-- No children, and so no restrictions:
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) update set altitude = excluded.altitude;
+insert into capitals values ('Sacramento', 3.694E+5, 30, 'CA') on conflict (name) ignore;
+
+-- clean up
+drop table capitals;
+drop table cities;
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index f97a75a..861eac6 100644
--- a/src/test/regress/sql/privileges.sql
+++ b/src/test/regress/sql/privileges.sql
@@ -194,7 +194,7 @@ SELECT * FROM atestv2; -- fail (even though regressuser2 can access underlying a
-- Test column level permissions
SET SESSION AUTHORIZATION regressuser1;
-CREATE TABLE atest5 (one int, two int, three int);
+CREATE TABLE atest5 (one int, two int unique, three int);
CREATE TABLE atest6 (one int, two int, blue int);
GRANT SELECT (one), INSERT (two), UPDATE (three) ON atest5 TO regressuser4;
GRANT ALL (one) ON atest5 TO regressuser3;
@@ -245,6 +245,9 @@ INSERT INTO atest5 VALUES (5,5,5); -- fail
UPDATE atest5 SET three = 10; -- ok
UPDATE atest5 SET one = 8; -- fail
UPDATE atest5 SET three = 5, one = 2; -- fail
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set three = 10; -- ok
+INSERT INTO atest5(two) VALUES (6) ON CONFLICT (two) UPDATE set one = 8; -- fails (due to UPDATE)
+INSERT INTO atest5(three) VALUES (4) ON CONFLICT (two) UPDATE set three = 10; -- fails (due to INSERT)
SET SESSION AUTHORIZATION regressuser1;
REVOKE ALL (one) ON atest5 FROM regressuser4;
diff --git a/src/test/regress/sql/rowsecurity.sql b/src/test/regress/sql/rowsecurity.sql
index ed7adbf..5c660d5 100644
--- a/src/test/regress/sql/rowsecurity.sql
+++ b/src/test/regress/sql/rowsecurity.sql
@@ -436,6 +436,79 @@ DELETE FROM only t1 WHERE f_leak(b) RETURNING oid, *, t1;
DELETE FROM t1 WHERE f_leak(b) RETURNING oid, *, t1;
--
+-- INSERT ... ON CONFLICT UPDATE and Row-level security
+--
+
+-- Would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Can't insert new violating tuple, either:
+INSERT INTO document VALUES (22, 11, 2, 'rls_regress_user2', 'mediocre novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- INSERT path is taken here, so UPDATE targelist doesn't matter:
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+-- Update path will now taken for same query, so UPDATE targelist now matters
+-- (this is the same query as the last, but now fails):
+INSERT INTO document VALUES (33, 22, 1, 'rls_regress_user1', 'okay science fiction')
+ ON CONFLICT (did) UPDATE SET dauthor = 'rls_regress_user3' RETURNING *;
+
+SET SESSION AUTHORIZATION rls_regress_user0;
+DROP POLICY p1 ON document;
+
+CREATE POLICY p1 ON document FOR SELECT USING (true);
+CREATE POLICY p2 ON document FOR INSERT WITH CHECK (dauthor = current_user);
+CREATE POLICY p3 ON document FOR UPDATE
+ USING (cid = (SELECT cid from category WHERE cname = 'novel'))
+ WITH CHECK (dauthor = current_user);
+
+SET SESSION AUTHORIZATION rls_regress_user1;
+
+-- Again, would fail with unique violation, but with ON CONFLICT fails as row is
+-- locked for update (notably, existing/target row is not leaked, which is what
+-- failed to satisfy WITH CHECK options - not row proposed for insertion by
+-- user):
+INSERT INTO document VALUES (8, 44, 1, 'rls_regress_user1', 'my fourth manga')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+
+-- Again, can't insert new violating tuple, either (unsuccessfully inserted tuple
+-- values are reported here, though)
+--
+-- Violates actual CHECK OPTION within UPDATE:
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user2', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = EXCLUDED.dauthor;
+
+-- Violates USING qual for UPDATE policy p3, interpreted here as CHECK OPTION.
+--
+-- UPDATE path is taken, but UPDATE fails purely because *existing* row to be
+-- updated is not a "novel"/cid 11 (row is not leaked, even though we have
+-- SELECT privileges sufficient to see the row in this instance):
+INSERT INTO document VALUES (33, 11, 1, 'rls_regress_user1', 'Some novel, replaces sci-fi')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle;
+-- Fine (we UPDATE):
+INSERT INTO document VALUES (2, 11, 1, 'rls_regress_user1', 'my first novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle RETURNING *;
+-- Fine (we INSERT, so "cid = 33" isn't evaluated):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (same query, but we UPDATE, so "cid = 33" is evaluated at end of
+-- UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, cid = 33 RETURNING *;
+-- Fail (we UPDATE, so dauthor assignment is evaluated at end of UPDATE):
+INSERT INTO document VALUES (78, 11, 1, 'rls_regress_user1', 'some other novel')
+ ON CONFLICT (did) UPDATE SET dtitle = EXCLUDED.dtitle, dauthor = 'rls_regress_user2';
+-- Don't fail because INSERT doesn't satisfy WITH CHECK option that originated
+-- as a barrier/USING() qual from the UPDATE. Note that the UPDATE path
+-- *isn't* taken, and so UPDATE-related policy does not apply:
+INSERT INTO document VALUES (88, 33, 1, 'rls_regress_user1', 'technology book, can only insert')
+ ON CONFLICT (did) UPDATE SET dtitle = upper(EXCLUDED.dtitle) RETURNING *;
+
+--
-- ROLE/GROUP
--
SET SESSION AUTHORIZATION rls_regress_user0;
diff --git a/src/test/regress/sql/rules.sql b/src/test/regress/sql/rules.sql
index 1e15f84..7cb5f39 100644
--- a/src/test/regress/sql/rules.sql
+++ b/src/test/regress/sql/rules.sql
@@ -680,6 +680,9 @@ SELECT * FROM shoelace_log ORDER BY sl_name;
insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0);
insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0);
+-- Unsupported (even though a similar updatable view construct is)
+insert into shoelace values ('sl10', 1000, 'magenta', 40.0, 'inch', 0.0)
+ on conflict ignore;
SELECT * FROM shoelace_obsolete ORDER BY sl_len_cm;
SELECT * FROM shoelace_candelete;
@@ -844,6 +847,17 @@ insert into rule_and_refint_t3 values (1, 12, 11, 'row3');
insert into rule_and_refint_t3 values (1, 12, 12, 'row4');
insert into rule_and_refint_t3 values (1, 11, 13, 'row5');
insert into rule_and_refint_t3 values (1, 13, 11, 'row6');
+-- Ordinary table
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict ignore;
+-- rule not fired, so fk violation
+insert into rule_and_refint_t3 values (1, 13, 11, 'row6')
+ on conflict (id3a, id3b, id3c) update
+ set id3b = excluded.id3b;
+-- rule fired, so unsupported (only updatable views have limited support)
+insert into shoelace values ('sl9', 0, 'pink', 35.0, 'inch', 0.0)
+ on conflict (id1a, id1b) update
+ set sl_avail = excluded.sl_avail;
create rule rule_and_refint_t3_ins as on insert to rule_and_refint_t3
where (exists (select 1 from rule_and_refint_t3
diff --git a/src/test/regress/sql/subselect.sql b/src/test/regress/sql/subselect.sql
index 4be2e40..2be9cb7 100644
--- a/src/test/regress/sql/subselect.sql
+++ b/src/test/regress/sql/subselect.sql
@@ -374,6 +374,20 @@ from
int4_tbl i4 on dummy = i4.f1;
--
+-- Test case for subselect within UPDATE of INSERT...ON CONFLICT UPDATE
+--
+create temp table upsert(key int4 primary key, val text);
+insert into upsert values(1, 'val') on conflict (key) update set val = 'not seen';
+insert into upsert values(1, 'val') on conflict (key) update set val = 'unsupported ' || (select f1 from int4_tbl where f1 != 0 limit 1)::text;
+
+select * from upsert;
+
+with aa as (select 'int4_tbl' u from int4_tbl limit 1)
+insert into upsert values (1, 'x'), (999, 'y')
+on conflict (key) update set val = (select u from aa)
+returning *;
+
+--
-- Test case for cross-type partial matching in hashed subplan (bug #7597)
--
diff --git a/src/test/regress/sql/triggers.sql b/src/test/regress/sql/triggers.sql
index 0ea2c31..323ca1a 100644
--- a/src/test/regress/sql/triggers.sql
+++ b/src/test/regress/sql/triggers.sql
@@ -208,7 +208,7 @@ drop sequence ttdummy_seq;
CREATE TABLE log_table (tstamp timestamp default timeofday()::timestamp);
-CREATE TABLE main_table (a int, b int);
+CREATE TABLE main_table (a int unique, b int);
COPY main_table (a,b) FROM stdin;
5 10
@@ -237,6 +237,12 @@ FOR EACH STATEMENT EXECUTE PROCEDURE trigger_func('after_ins_stmt');
CREATE TRIGGER after_upd_stmt_trig AFTER UPDATE ON main_table
EXECUTE PROCEDURE trigger_func('after_upd_stmt');
+-- Both insert and update statement level triggers (before and after) should
+-- fire. Doesn't fire UPDATE before trigger, but only because one isn't
+-- defined.
+INSERT INTO main_table (a, b) VALUES (5, 10) ON CONFLICT (a)
+ UPDATE SET b = EXCLUDED.b;
+
CREATE TRIGGER after_upd_row_trig AFTER UPDATE ON main_table
FOR EACH ROW EXECUTE PROCEDURE trigger_func('after_upd_row');
@@ -246,6 +252,9 @@ UPDATE main_table SET a = a + 1 WHERE b < 30;
-- UPDATE that effects zero rows should still call per-statement trigger
UPDATE main_table SET a = a + 2 WHERE b > 100;
+-- constraint now unneeded
+ALTER TABLE main_table DROP CONSTRAINT main_table_a_key;
+
-- COPY should fire per-row and per-statement INSERT triggers
COPY main_table (a, b) FROM stdin;
30 40
@@ -1173,3 +1182,61 @@ select * from self_ref_trigger;
drop table self_ref_trigger;
drop function self_ref_trigger_ins_func();
drop function self_ref_trigger_del_func();
+
+--
+-- Verify behavior of before and after triggers with INSERT...ON CONFLICT
+-- UPDATE
+--
+create table upsert (key int4 primary key, color text);
+
+create function upsert_before_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'before update (old): %', old.*::text;
+ raise warning 'before update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'before insert (new): %', new.*::text;
+ if new.key % 2 = 0 then
+ new.key := new.key + 1;
+ new.color := new.color || ' trig modified';
+ raise warning 'before insert (new, modified): %', new.*::text;
+ end if;
+ end if;
+ return new;
+end;
+$$;
+create trigger upsert_before_trig before insert or update on upsert
+ for each row execute procedure upsert_before_func();
+
+create function upsert_after_func()
+ returns trigger language plpgsql as
+$$
+begin
+ if (TG_OP = 'UPDATE') then
+ raise warning 'after update (old): %', new.*::text;
+ raise warning 'after update (new): %', new.*::text;
+ elsif (TG_OP = 'INSERT') then
+ raise warning 'after insert (new): %', new.*::text;
+ end if;
+ return null;
+end;
+$$;
+create trigger upsert_after_trig after insert or update on upsert
+ for each row execute procedure upsert_after_func();
+
+insert into upsert values(1, 'black') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(2, 'red') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(3, 'orange') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(4, 'green') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(5, 'purple') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(6, 'white') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(7, 'pink') on conflict (key) update set color = 'updated ' || target.color;
+insert into upsert values(8, 'yellow') on conflict (key) update set color = 'updated ' || target.color;
+
+select * from upsert;
+
+drop table upsert;
+drop function upsert_before_func();
+drop function upsert_after_func();
diff --git a/src/test/regress/sql/updatable_views.sql b/src/test/regress/sql/updatable_views.sql
index 60c7e29..48dd9a9 100644
--- a/src/test/regress/sql/updatable_views.sql
+++ b/src/test/regress/sql/updatable_views.sql
@@ -69,6 +69,8 @@ DELETE FROM rw_view14 WHERE a=3; -- should be OK
-- Partially updatable view
INSERT INTO rw_view15 VALUES (3, 'ROW 3'); -- should fail
INSERT INTO rw_view15 (a) VALUES (3); -- should be OK
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT IGNORE; -- succeeds
+INSERT INTO rw_view15 (a) VALUES (3) ON CONFLICT (a) IGNORE; -- fails, unsupported
ALTER VIEW rw_view15 ALTER COLUMN upper SET DEFAULT 'NOT SET';
INSERT INTO rw_view15 (a) VALUES (4); -- should fail
UPDATE rw_view15 SET upper='ROW 3' WHERE a=3; -- should fail
diff --git a/src/test/regress/sql/update.sql b/src/test/regress/sql/update.sql
index e71128c..903f3fb 100644
--- a/src/test/regress/sql/update.sql
+++ b/src/test/regress/sql/update.sql
@@ -74,4 +74,18 @@ UPDATE update_test AS t SET b = update_test.b + 10 WHERE t.a = 10;
UPDATE update_test SET c = repeat('x', 10000) WHERE c = 'car';
SELECT a, b, char_length(c) FROM update_test;
+ALTER TABLE update_test ADD constraint uuu UNIQUE(a);
+
+-- fail, update predicates are disallowed:
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a NOT IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE EXISTS(SELECT b FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a IN (SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ALL(SELECT a FROM update_test);
+INSERT INTO update_test VALUES(31, 77) ON CONFLICT (a) UPDATE SET b = 16
+WHERE a = ANY(SELECT a FROM update_test);
+
DROP TABLE update_test;
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index c716369..8d49384 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -795,6 +795,43 @@ SELECT * FROM t LIMIT 10;
SELECT * FROM y;
+-- data-modifying WITH containing INSERT...ON CONFLICT UPDATE
+CREATE TABLE z AS SELECT i AS k, (i || ' v')::text v FROM generate_series(1, 16, 3) i;
+ALTER TABLE z ADD UNIQUE (k);
+
+WITH t AS (
+ INSERT INTO z SELECT i, 'insert'
+ FROM generate_series(0, 16) i
+ ON CONFLICT (k) UPDATE SET v = TARGET.v || ', now update'
+ RETURNING *
+)
+SELECT * FROM t JOIN y ON t.k = y.a ORDER BY a, k;
+
+-- New query/snapshot demonstrates side-effects of previous query.
+SELECT * FROM z ORDER BY k;
+
+--
+-- All these cases should fail, due to restrictions imposed upon the UPDATE
+-- portion of the query.
+--
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = ' update' WHERE target.k = (SELECT a FROM aa);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+WITH aa AS (SELECT 'a' a, 'b' b UNION ALL SELECT 'a' a, 'b' b)
+INSERT INTO z VALUES(1, 'insert')
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 'a' LIMIT 1);
+WITH aa AS (SELECT 1 a, 2 b)
+INSERT INTO z VALUES(1, (SELECT b || ' insert' FROM aa WHERE a = 1 ))
+ON CONFLICT (k) UPDATE SET v = (SELECT b || ' update' FROM aa WHERE a = 1 LIMIT 1);
+
+DROP TABLE z;
+
-- check that run to completion happens in proper ordering
TRUNCATE TABLE y;
--
1.9.1
0003-RLS-support-for-ON-CONFLICT-UPDATE.patch (text/x-patch; charset=US-ASCII)
From 633e0dec759f114788599289612e6ba41d9a5652 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 6 Jan 2015 16:32:21 -0800
Subject: [PATCH 3/6] RLS support for ON CONFLICT UPDATE
Row-Level Security policies may apply to UPDATE commands or INSERT
commands only. UPDATE RLS policies can have both USING() security
barrier quals, and CHECK options (INSERT RLS policies may only have
CHECK options, though). It is necessary to carefully consider the
behavior of RLS policies in the context of INSERT with ON CONFLICT
UPDATE, since ON CONFLICT UPDATE is more or less a new top-level
command, conceptually quite different to two separate statements (an
INSERT and an UPDATE).
The approach taken is to "bunch together" both sets of policies, and to
enforce them in 3 different places against three different slots (3
different stages of query processing in the executor).
Note that UPDATE policy USING() barrier quals are always treated as
CHECK options. It is thought that silently failing when USING() barrier
quals are not satisfied is a more surprising outcome, even if it is
closer to the existing behavior of UPDATE statements. This is because
the user's intent to UPDATE one particular row based on simple criteria
is quite clear with ON CONFLICT UPDATE.
The 3 places that RLS policies are enforced are:
* Against row actually inserted, after insertion proceeds successfully
(INSERT-applicable policies only).
* Against row in target table that caused conflict. The implementation
is careful not to leak the contents of that row in diagnostic
messages (INSERT-applicable *and* UPDATE-applicable policies).
* Against the version of the row added to the relation after
ExecUpdate() is called (INSERT-applicable *and* UPDATE-applicable
policies).
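As a rough illustration of the three enforcement points listed above, here
is a minimal standalone sketch. It is not code from the patch: the types
PolicyKind, CheckStage and Policy, the functions policy_applies() and
first_violation(), and the toy "row is a single int" model are all
hypothetical. The only behavior it tries to mirror is the onlyInsert test
added to ExecWithCheckOptions(): UPDATE-applicable policies are skipped for
a successfully inserted row, while the other two stages check both kinds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL definitions. */
typedef enum { POLICY_INSERT, POLICY_UPDATE } PolicyKind;

typedef enum
{
    STAGE_INSERTED_ROW,     /* row actually inserted */
    STAGE_CONFLICT_ROW,     /* existing row that caused the conflict */
    STAGE_UPDATED_ROW       /* row version produced by the update */
} CheckStage;

typedef struct
{
    PolicyKind kind;
    bool     (*check) (int row);    /* toy model: a row is a single int */
} Policy;

/* UPDATE policies are not enforced against a successfully inserted row. */
bool
policy_applies(PolicyKind kind, CheckStage stage)
{
    if (stage == STAGE_INSERTED_ROW && kind == POLICY_UPDATE)
        return false;
    return true;
}

/* Returns the index of the first violated policy, or -1 if all pass. */
int
first_violation(const Policy *policies, size_t n, int row, CheckStage stage)
{
    assert(policies != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (!policy_applies(policies[i].kind, stage))
            continue;
        if (!policies[i].check(row))
            return (int) i;
    }
    return -1;
}

/* Two toy policy predicates used in the usage note below. */
bool is_positive(int row) { return row > 0; }
bool is_even(int row) { return row % 2 == 0; }
```

With an INSERT policy of is_positive and an UPDATE policy of is_even, the
row 3 passes at STAGE_INSERTED_ROW (the UPDATE policy is skipped) but is
rejected at STAGE_CONFLICT_ROW. In the patch a violation raises an error
rather than returning an index.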
Documentation and tests follow in later commits.
---
src/backend/executor/execMain.c | 25 ++++++---
src/backend/executor/nodeModifyTable.c | 53 ++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 1 +
src/backend/nodes/equalfuncs.c | 1 +
src/backend/nodes/outfuncs.c | 1 +
src/backend/nodes/readfuncs.c | 1 +
src/backend/rewrite/rewriteHandler.c | 2 +
src/backend/rewrite/rowsecurity.c | 94 +++++++++++++++++++++++++++++-----
src/include/executor/executor.h | 3 +-
src/include/nodes/parsenodes.h | 1 +
10 files changed, 158 insertions(+), 24 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 3d7761d..56fa3bd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -1697,7 +1697,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
*/
void
ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate)
+ TupleTableSlot *slot, bool detail,
+ bool onlyInsert, EState *estate)
{
Relation rel = resultRelInfo->ri_RelationDesc;
TupleDesc tupdesc = RelationGetDescr(rel);
@@ -1722,6 +1723,15 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
ExprState *wcoExpr = (ExprState *) lfirst(l2);
/*
+ * INSERT ... ON CONFLICT UPDATE callers may require that not all WITH
+ * CHECK OPTIONs associated with resultRelInfo are enforced at all
+ * stages of query processing. (UPDATE-related policies are not
+ * enforced in respect of a successfully inserted tuple).
+ */
+ if (onlyInsert && wco->commandType == CMD_UPDATE)
+ continue;
+
+ /*
* WITH CHECK OPTION checks are intended to ensure that the new tuple
* is visible (in the case of a view) or that it passes the
* 'with-check' policy (in the case of row security).
@@ -1732,16 +1742,17 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
*/
if (!ExecQual((List *) wcoExpr, econtext, false))
{
- char *val_desc;
+ char *val_desc = NULL;
Bitmapset *modifiedCols;
modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
- val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
- slot,
- tupdesc,
- modifiedCols,
- 64);
+ if (detail)
+ val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
+ slot,
+ tupdesc,
+ modifiedCols,
+ 64);
ereport(ERROR,
(errcode(ERRCODE_WITH_CHECK_OPTION_VIOLATION),
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index ec1ef07..f3f0750 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -462,7 +462,8 @@ vlock:
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, spec == SPEC_INSERT,
+ estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -956,7 +957,7 @@ lreplace:;
/* Check any WITH CHECK OPTION constraints */
if (resultRelInfo->ri_WithCheckOptions != NIL)
- ExecWithCheckOptions(resultRelInfo, slot, estate);
+ ExecWithCheckOptions(resultRelInfo, slot, true, false, estate);
/* Process RETURNING if present */
if (resultRelInfo->ri_projectReturning)
@@ -1138,6 +1139,54 @@ ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+ /*
+ * For RLS with ON CONFLICT UPDATE, security quals are always
+ * treated as WITH CHECK options, even when there were separate
+ * security quals and explicit WITH CHECK options (ordinarily,
+ * security quals are only treated as WITH CHECK options when there
+ * are no explicit WITH CHECK options). Also, CHECK OPTIONs
+ * (originating either explicitly, or implicitly as security quals)
+ * for both UPDATE and INSERT policies (or ALL policies) are
+ * checked (as CHECK OPTIONs) at three different points for three
+ * distinct but related tuples/slots in the context of ON CONFLICT
+ * UPDATE. There are three relevant ExecWithCheckOptions() calls:
+ *
+ * * After successful insertion, within ExecInsert(), against the
+ * inserted tuple. This only includes INSERT-applicable policies.
+ *
+ * * Here, after row locking but before calling ExecUpdate(), on
+ * the existing tuple in the target relation (which we cannot leak
+ * details of). This is conceptually like a security barrier qual
+ * for the purposes of the auxiliary update, although unlike
+ * regular updates that require security barrier quals we prefer to
+ * raise an error (by treating the security barrier quals as CHECK
+ * OPTIONS) rather than silently not affect rows, because the
+ * intent to update seems clear and unambiguous for ON CONFLICT
+ * UPDATE. This includes both INSERT-applicable and
+ * UPDATE-applicable policies.
+ *
+ * * On the final tuple created by the update within ExecUpdate (if
+ * any). This is also subject to INSERT policy enforcement, unlike
+ * conventional ExecUpdate() calls for UPDATE statements -- it
+ * includes both INSERT-applicable and UPDATE-applicable policies.
+ */
+ if (resultRelInfo->ri_WithCheckOptions != NIL)
+ {
+ TupleTableSlot *opts;
+
+ /* Construct temp slot for locked tuple from target */
+ opts = MakeSingleTupleTableSlot(slot->tts_tupleDescriptor);
+ ExecStoreTuple(copyTuple, opts, InvalidBuffer, false);
+
+ /*
+ * Check, but without leaking contents of tuple; user only
+ * supplied one conflicting value or composition of values, and
+ * not the entire tuple.
+ */
+ ExecWithCheckOptions(resultRelInfo, opts, false, false,
+ estate);
+ }
+
if (!TupIsNull(slot))
*returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
planSlot, &onConflict->mt_epqstate,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index ed86b8f..816db90 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2075,6 +2075,7 @@ _copyWithCheckOption(const WithCheckOption *from)
COPY_STRING_FIELD(viewname);
COPY_NODE_FIELD(qual);
+ COPY_SCALAR_FIELD(commandType);
COPY_SCALAR_FIELD(cascaded);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 24e58fa..4057c27 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2384,6 +2384,7 @@ _equalWithCheckOption(const WithCheckOption *a, const WithCheckOption *b)
{
COMPARE_STRING_FIELD(viewname);
COMPARE_NODE_FIELD(qual);
+ COMPARE_SCALAR_FIELD(commandType);
COMPARE_SCALAR_FIELD(cascaded);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index a4ddb9c..93f0442 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2337,6 +2337,7 @@ _outWithCheckOption(StringInfo str, const WithCheckOption *node)
WRITE_STRING_FIELD(viewname);
WRITE_NODE_FIELD(qual);
+ WRITE_ENUM_FIELD(commandType, CmdType);
WRITE_BOOL_FIELD(cascaded);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 48a7206..9f3e0c8 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -272,6 +272,7 @@ _readWithCheckOption(void)
READ_STRING_FIELD(viewname);
READ_NODE_FIELD(qual);
+ READ_ENUM_FIELD(commandType, CmdType);
READ_BOOL_FIELD(cascaded);
READ_DONE();
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 12b0b06..8f544e1 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1767,6 +1767,7 @@ fireRIRrules(Query *parsetree, List *activeRIRs, bool forUpdatePushedDown)
List *quals = NIL;
wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->commandType = parsetree->commandType;
quals = lcons(wco->qual, quals);
activeRIRs = lcons_oid(RelationGetRelid(rel), activeRIRs);
@@ -2935,6 +2936,7 @@ rewriteTargetView(Query *parsetree, Relation view)
wco->viewname = pstrdup(RelationGetRelationName(view));
wco->qual = NULL;
wco->cascaded = cascaded;
+ wco->commandType = viewquery->commandType;
parsetree->withCheckOptions = lcons(wco,
parsetree->withCheckOptions);
diff --git a/src/backend/rewrite/rowsecurity.c b/src/backend/rewrite/rowsecurity.c
index 7669130..09f1ac3 100644
--- a/src/backend/rewrite/rowsecurity.c
+++ b/src/backend/rewrite/rowsecurity.c
@@ -56,12 +56,14 @@
#include "utils/syscache.h"
#include "tcop/utility.h"
-static List *pull_row_security_policies(CmdType cmd, Relation relation,
- Oid user_id);
+static List *pull_row_security_policies(CmdType cmd, bool onConflict,
+ Relation relation, Oid user_id);
static void process_policies(List *policies, int rt_index,
Expr **final_qual,
Expr **final_with_check_qual,
- bool *hassublinks);
+ bool *hassublinks,
+ Expr **spec_with_check_eval,
+ bool onConflict);
static bool check_role_for_policy(ArrayType *policy_roles, Oid user_id);
/*
@@ -88,6 +90,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
Expr *rowsec_with_check_expr = NULL;
Expr *hook_expr = NULL;
Expr *hook_with_check_expr = NULL;
+ Expr *hook_spec_with_check_expr = NULL;
List *rowsec_policies;
List *hook_policies = NIL;
@@ -149,8 +152,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Grab the built-in policies which should be applied to this relation. */
rel = heap_open(rte->relid, NoLock);
- rowsec_policies = pull_row_security_policies(root->commandType, rel,
- user_id);
+ rowsec_policies = pull_row_security_policies(root->commandType,
+ root->specClause == SPEC_INSERT,
+ rel, user_id);
/*
* Check if this is only the default-deny policy.
@@ -168,7 +172,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Now that we have our policies, build the expressions from them. */
process_policies(rowsec_policies, rt_index, &rowsec_expr,
- &rowsec_with_check_expr, &hassublinks);
+ &rowsec_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
/*
* Also, allow extensions to add their own policies.
@@ -198,7 +204,9 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
/* Build the expression from any policies returned. */
process_policies(hook_policies, rt_index, &hook_expr,
- &hook_with_check_expr, &hassublinks);
+ &hook_with_check_expr, &hassublinks,
+ &hook_spec_with_check_expr,
+ root->specClause == SPEC_INSERT);
}
/*
@@ -230,6 +238,7 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) rowsec_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
@@ -244,6 +253,23 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
wco->viewname = RelationGetRelationName(rel);
wco->qual = (Node *) hook_with_check_expr;
wco->cascaded = false;
+ wco->commandType = root->commandType;
+ root->withCheckOptions = lcons(wco, root->withCheckOptions);
+ }
+
+ /*
+ * Also add the expression, if any, returned from the extension that
+ * applies to auxiliary UPDATE within ON CONFLICT UPDATE.
+ */
+ if (hook_spec_with_check_expr)
+ {
+ WithCheckOption *wco;
+
+ wco = (WithCheckOption *) makeNode(WithCheckOption);
+ wco->viewname = RelationGetRelationName(rel);
+ wco->qual = (Node *) hook_spec_with_check_expr;
+ wco->cascaded = false;
+ wco->commandType = CMD_UPDATE;
root->withCheckOptions = lcons(wco, root->withCheckOptions);
}
}
@@ -288,7 +314,8 @@ prepend_row_security_policies(Query* root, RangeTblEntry* rte, int rt_index)
*
*/
static List *
-pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
+pull_row_security_policies(CmdType cmd, bool onConflict, Relation relation,
+ Oid user_id)
{
List *policies = NIL;
ListCell *item;
@@ -322,7 +349,9 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
if (policy->polcmd == ACL_INSERT_CHR
&& check_role_for_policy(policy->roles, user_id))
policies = lcons(policy, policies);
- break;
+ if (!onConflict)
+ break;
+ /* FALL THRU */
case CMD_UPDATE:
if (policy->polcmd == ACL_UPDATE_CHR
&& check_role_for_policy(policy->roles, user_id))
@@ -384,26 +413,41 @@ pull_row_security_policies(CmdType cmd, Relation relation, Oid user_id)
*/
static void
process_policies(List *policies, int rt_index, Expr **qual_eval,
- Expr **with_check_eval, bool *hassublinks)
+ Expr **with_check_eval, bool *hassublinks,
+ Expr **spec_with_check_eval, bool onConflict)
{
ListCell *item;
List *quals = NIL;
List *with_check_quals = NIL;
+ List *conflict_update_quals = NIL;
/*
* Extract the USING and WITH CHECK quals from each of the policies
- * and add them to our lists.
+ * and add them to our lists. CONFLICT UPDATE quals are always treated
+ * as CHECK OPTIONS.
*/
foreach(item, policies)
{
RowSecurityPolicy *policy = (RowSecurityPolicy *) lfirst(item);
if (policy->qual != NULL)
- quals = lcons(copyObject(policy->qual), quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ quals = lcons(copyObject(policy->qual), quals);
+ else
+ conflict_update_quals = lcons(copyObject(policy->qual), conflict_update_quals);
+ }
if (policy->with_check_qual != NULL)
- with_check_quals = lcons(copyObject(policy->with_check_qual),
- with_check_quals);
+ {
+ if (!onConflict || policy->polcmd != ACL_UPDATE_CHR)
+ with_check_quals = lcons(copyObject(policy->with_check_qual),
+ with_check_quals);
+ else
+ conflict_update_quals =
+ lcons(copyObject(policy->with_check_qual),
+ conflict_update_quals);
+ }
if (policy->hassublinks)
*hassublinks = true;
@@ -420,6 +464,10 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
/*
* If we end up with only USING quals, then use those as
* WITH CHECK quals also.
+ *
+ * For the INSERT with ON CONFLICT UPDATE case, we always enforce that the
+ * UPDATE's USING quals are treated like WITH CHECK quals, enforced against
+ * the target relation's tuple in multiple places.
*/
if (with_check_quals == NIL)
with_check_quals = copyObject(quals);
@@ -453,6 +501,24 @@ process_policies(List *policies, int rt_index, Expr **qual_eval,
else
*with_check_eval = (Expr*) linitial(with_check_quals);
+ /*
+ * For INSERT with ON CONFLICT UPDATE, *both* sets of WITH CHECK options
+ * (from any INSERT policy and any UPDATE policy) are enforced.
+ *
+ * These are handled separately because enforcement of each type of WITH
+ * CHECK option is based on the point in query processing of INSERT ... ON
+ * CONFLICT UPDATE. The INSERT path does not enforce UPDATE related CHECK
+ * OPTIONs.
+ */
+ if (conflict_update_quals != NIL)
+ {
+ if (list_length(conflict_update_quals) > 1)
+ *spec_with_check_eval = makeBoolExpr(AND_EXPR,
+ conflict_update_quals, -1);
+ else
+ *spec_with_check_eval = (Expr*) linitial(conflict_update_quals);
+ }
+
return;
}
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index accdc83..a59e857 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -195,7 +195,8 @@ extern bool ExecContextForcesOids(PlanState *planstate, bool *hasoids);
extern void ExecConstraints(ResultRelInfo *resultRelInfo,
TupleTableSlot *slot, EState *estate);
extern void ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
- TupleTableSlot *slot, EState *estate);
+ TupleTableSlot *slot, bool detail, bool onlyInsert,
+ EState *estate);
extern ExecRowMark *ExecFindRowMark(EState *estate, Index rti);
extern ExecAuxRowMark *ExecBuildAuxRowMark(ExecRowMark *erm, List *targetlist);
extern TupleTableSlot *EvalPlanQual(EState *estate, EPQState *epqstate,
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index c03c9ca..19d2484 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -868,6 +868,7 @@ typedef struct WithCheckOption
NodeTag type;
char *viewname; /* name of view that specified the WCO */
Node *qual; /* constraint qual to check */
+ CmdType commandType; /* select|insert|update|delete */
bool cascaded; /* true = WITH CASCADED CHECK OPTION */
} WithCheckOption;
--
1.9.1
0002-Support-INSERT-.-ON-CONFLICT-UPDATE-IGNORE.patch (text/x-patch; charset=US-ASCII)
From 6ccbe3d95dc85322ac32e74779ab9b2b828568db Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Wed, 27 Aug 2014 15:01:32 -0700
Subject: [PATCH 2/6] Support INSERT ... ON CONFLICT {UPDATE | IGNORE}
This non-standard INSERT clause allows DML statement authors to specify
that, in the event of any tuple proposed for insertion duplicating an
existing tuple in terms of a value or set of values constrained by a
unique index, an alternative path may be taken. The statement may
either IGNORE the tuple being inserted without raising an error, or go
to UPDATE the existing tuple whose value is duplicated by a value
within one single tuple proposed for insertion.
The implementation loops until either an insert or an UPDATE/IGNORE
occurs. No existing tuple may be affected more than once per INSERT.
This is implemented using a new infrastructure called "speculative
insertion". (The approach to "Value locking" presented here follows
design #2, as described on the value locking Postgres Wiki page).
Alternatively, we may go to UPDATE, using the EvalPlanQual() mechanism
to execute a special auxiliary plan.
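The "loop until either an insert or an UPDATE/IGNORE occurs" control flow
can be sketched with a single-threaded toy model. Everything here is
hypothetical and unrelated to the real executor code: Table, Row,
upsert(), and the LockHook callback are invented for illustration, and the
hook merely simulates the case where the conflicting row disappears before
it can be locked, which forces a retry. Speculative insertion, value
locking and EvalPlanQual() are not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROWS 8

typedef enum { ON_CONFLICT_IGNORE, ON_CONFLICT_UPDATE } ConflictAction;
typedef enum { DID_INSERT, DID_UPDATE, DID_IGNORE, TABLE_FULL } Outcome;

typedef struct { int key; int val; } Row;
typedef struct { Row rows[MAX_ROWS]; size_t nrows; } Table;

/* Hypothetical hook: tries to lock rows[idx]; false means the row is gone. */
typedef bool (*LockHook) (Table *t, size_t idx);

/* Returns the index of the row with this key, or -1 if there is none. */
int
find_key(const Table *t, int key)
{
    for (size_t i = 0; i < t->nrows; i++)
        if (t->rows[i].key == key)
            return (int) i;
    return -1;
}

/* Loop until an insert, an update, or an ignore has definitely happened. */
Outcome
upsert(Table *t, int key, int val, ConflictAction action, LockHook lock)
{
    assert(t != NULL && lock != NULL);

    for (;;)
    {
        int idx = find_key(t, key);

        if (idx < 0)
        {
            /* No conflict: take the ordinary insert path. */
            if (t->nrows == MAX_ROWS)
                return TABLE_FULL;
            t->rows[t->nrows].key = key;
            t->rows[t->nrows].val = val;
            t->nrows++;
            return DID_INSERT;
        }

        if (action == ON_CONFLICT_IGNORE)
            return DID_IGNORE;      /* conflict seen, nothing locked */

        if (!lock(t, (size_t) idx))
            continue;               /* row vanished: start over */

        t->rows[idx].val = val;     /* update the locked row */
        return DID_UPDATE;
    }
}

/* Hook that always succeeds. */
bool
lock_always(Table *t, size_t idx)
{
    (void) t;
    (void) idx;
    return true;
}

/* Hook that removes the row and fails on its first call only. */
bool
lock_vanishes_once(Table *t, size_t idx)
{
    static bool fired = false;

    if (fired)
        return true;
    fired = true;
    t->rows[idx] = t->rows[t->nrows - 1];
    t->nrows--;
    return false;
}
```

With lock_vanishes_once, an upsert against an existing key ends up taking
the insert path on its second iteration, which is the kind of outcome the
retry loop exists to handle.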
READ COMMITTED isolation level is permitted to UPDATE a tuple even where
no version is visible to the command's MVCC snapshot. Similarly, any
query predicate associated with the UPDATE portion of the new statement
need only satisfy an already locked, conclusively committed and visible
conflict tuple. When the predicate isn't satisfied, the tuple is still
locked, which implies that at READ COMMITTED, a tuple may be locked
without any version being visible to the command's MVCC snapshot.
Users specify a single unique index to take the alternative path on,
which is inferred from a set of user-supplied column names (or
expressions). This is mandatory for the ON CONFLICT UPDATE variant,
which should address concerns about spuriously taking an incorrect
alternative ON CONFLICT path (i.e. the wrong unique index is used for
arbitration of whether or not to take the alternative path) due to there
being more than one would-be unique violation. Previous revisions of
the patch didn't mandate this. However, we may still IGNORE based on
the first would-be unique violation detected, on the assumption that it
doesn't particularly matter where it originated from for that variant
(iff the user didn't make a point of indicating his or her intent).
The auxiliary ModifyTable plan used by the UPDATE portion of the new
statement is not formally a subplan of its parent INSERT ModifyTable
plan. Rather, it's an independently planned subquery, whose execution
is tightly driven by its parent. Special auxiliary state pertaining to
the auxiliary UPDATE is tracked by its parent through all stages of
query execution.
The implementation imposes some restrictions on child auxiliary UPDATE
plans, which make the plans comport with their parent to the extent
required during the executor stage. One user-visible consequence of
this is that the special auxiliary UPDATE query cannot have subselects
within its targetlist or WHERE clause. UPDATEs may not reference any
other table, and UPDATE FROM is disallowed. INSERT's RETURNING clause
projects tuples successfully inserted and updated. If an ON CONFLICT
UPDATE's WHERE clause is not satisfied in respect of some slot/tuple,
the post-update tuple is not projected (since an UPDATE didn't occur
- although the row is still locked).
Note that pg_stat_statements does not fingerprint ExcludedExpr, because
it cannot appear in the post-parse-analysis, pre-rewrite Query tree.
(pg_stat_statements does not fingerprint every primnode anyway, mostly
because some are only expected in utility statements). Other existing
Node handling sites that don't expect to see primnodes that appear only
after rewriting (ExcludedExpr may be in its own subcategory here in that
it is the only such non-utility related Node) do not have an
ExcludedExpr case added either.
---
contrib/pg_stat_statements/pg_stat_statements.c | 5 +
contrib/postgres_fdw/deparse.c | 7 +-
contrib/postgres_fdw/postgres_fdw.c | 16 +-
contrib/postgres_fdw/postgres_fdw.h | 2 +-
src/backend/access/heap/heapam.c | 97 ++++-
src/backend/access/nbtree/nbtinsert.c | 32 +-
src/backend/catalog/index.c | 59 ++-
src/backend/catalog/indexing.c | 2 +-
src/backend/commands/constraint.c | 7 +-
src/backend/commands/copy.c | 7 +-
src/backend/commands/explain.c | 93 ++++-
src/backend/executor/execMain.c | 14 +-
src/backend/executor/execQual.c | 54 +++
src/backend/executor/execUtils.c | 321 ++++++++++++++--
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeModifyTable.c | 462 +++++++++++++++++++++++-
src/backend/nodes/copyfuncs.c | 55 +++
src/backend/nodes/equalfuncs.c | 43 +++
src/backend/nodes/nodeFuncs.c | 74 ++++
src/backend/nodes/outfuncs.c | 18 +
src/backend/nodes/readfuncs.c | 19 +
src/backend/optimizer/path/indxpath.c | 57 +++
src/backend/optimizer/path/tidpath.c | 8 +-
src/backend/optimizer/plan/createplan.c | 16 +-
src/backend/optimizer/plan/planner.c | 53 +++
src/backend/optimizer/plan/setrefs.c | 32 +-
src/backend/optimizer/plan/subselect.c | 6 +
src/backend/optimizer/util/plancat.c | 222 +++++++++++-
src/backend/parser/analyze.c | 86 ++++-
src/backend/parser/gram.y | 75 +++-
src/backend/parser/parse_clause.c | 262 ++++++++++++--
src/backend/parser/parse_expr.c | 6 +-
src/backend/parser/parse_node.c | 8 +-
src/backend/rewrite/rewriteHandler.c | 127 ++++++-
src/backend/storage/ipc/procarray.c | 109 ++++++
src/backend/storage/lmgr/lmgr.c | 80 ++++
src/backend/tcop/pquery.c | 16 +-
src/backend/utils/adt/lockfuncs.c | 1 +
src/backend/utils/adt/ruleutils.c | 39 ++
src/backend/utils/time/tqual.c | 46 ++-
src/bin/psql/common.c | 5 +-
src/include/access/heapam.h | 3 +-
src/include/access/heapam_xlog.h | 2 +
src/include/access/htup_details.h | 12 +
src/include/catalog/index.h | 2 +
src/include/executor/executor.h | 21 +-
src/include/nodes/execnodes.h | 19 +
src/include/nodes/nodes.h | 18 +
src/include/nodes/parsenodes.h | 40 +-
src/include/nodes/plannodes.h | 3 +
src/include/nodes/primnodes.h | 47 +++
src/include/optimizer/paths.h | 1 +
src/include/optimizer/plancat.h | 2 +
src/include/optimizer/planmain.h | 3 +-
src/include/parser/kwlist.h | 2 +
src/include/parser/parse_clause.h | 2 +
src/include/parser/parse_node.h | 1 +
src/include/storage/lmgr.h | 5 +
src/include/storage/lock.h | 10 +
src/include/storage/proc.h | 13 +
src/include/storage/procarray.h | 7 +
src/include/utils/snapshot.h | 11 +
62 files changed, 2704 insertions(+), 170 deletions(-)
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 95616b3..414ec83 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -2198,6 +2198,11 @@ JumbleQuery(pgssJumbleState *jstate, Query *query)
JumbleRangeTable(jstate, query->rtable);
JumbleExpr(jstate, (Node *) query->jointree);
JumbleExpr(jstate, (Node *) query->targetList);
+ APP_JUMB(query->specClause);
+ JumbleExpr(jstate, (Node *) query->arbiterExpr);
+ JumbleExpr(jstate, query->arbiterWhere);
+ if (query->onConflict)
+ JumbleQuery(jstate, (Query *) query->onConflict);
JumbleExpr(jstate, (Node *) query->returningList);
JumbleExpr(jstate, (Node *) query->groupClause);
JumbleExpr(jstate, query->havingQual);
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 59cb053..ca51586 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -847,8 +847,8 @@ appendWhereClause(StringInfo buf,
void
deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
- List **retrieved_attrs)
+ List *targetAttrs, bool ignore,
+ List *returningList, List **retrieved_attrs)
{
AttrNumber pindex;
bool first;
@@ -892,6 +892,9 @@ deparseInsertSql(StringInfo buf, PlannerInfo *root,
else
appendStringInfoString(buf, " DEFAULT VALUES");
+ if (ignore)
+ appendStringInfoString(buf, " ON CONFLICT IGNORE");
+
deparseReturningList(buf, root, rtindex, rel,
rel->trigdesc && rel->trigdesc->trig_insert_after_row,
returningList, retrieved_attrs);
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index d76e739..1539899 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -1167,6 +1167,7 @@ postgresPlanForeignModify(PlannerInfo *root,
List *targetAttrs = NIL;
List *returningList = NIL;
List *retrieved_attrs = NIL;
+ bool ignore = false;
initStringInfo(&sql);
@@ -1201,7 +1202,7 @@ postgresPlanForeignModify(PlannerInfo *root,
int col;
col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
+ while ((col = bms_next_member(rte->updatedCols, col)) >= 0)
{
/* bit numbers are offset by FirstLowInvalidHeapAttributeNumber */
AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
@@ -1218,6 +1219,17 @@ postgresPlanForeignModify(PlannerInfo *root,
if (plan->returningLists)
returningList = (List *) list_nth(plan->returningLists, subplan_index);
+ if (root->parse->arbiterExpr)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT unique index inference")));
+ else if (plan->spec == SPEC_INSERT)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("postgres_fdw does not support ON CONFLICT UPDATE")));
+ else if (plan->spec == SPEC_IGNORE)
+ ignore = true;
+
/*
* Construct the SQL command string.
*/
@@ -1225,7 +1237,7 @@ postgresPlanForeignModify(PlannerInfo *root,
{
case CMD_INSERT:
deparseInsertSql(&sql, root, resultRelation, rel,
- targetAttrs, returningList,
+ targetAttrs, ignore, returningList,
&retrieved_attrs);
break;
case CMD_UPDATE:
diff --git a/contrib/postgres_fdw/postgres_fdw.h b/contrib/postgres_fdw/postgres_fdw.h
index 950c6f7..3763a57 100644
--- a/contrib/postgres_fdw/postgres_fdw.h
+++ b/contrib/postgres_fdw/postgres_fdw.h
@@ -60,7 +60,7 @@ extern void appendWhereClause(StringInfo buf,
List **params);
extern void deparseInsertSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
- List *targetAttrs, List *returningList,
+ List *targetAttrs, bool ignore, List *returningList,
List **retrieved_attrs);
extern void deparseUpdateSql(StringInfo buf, PlannerInfo *root,
Index rtindex, Relation rel,
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 46060bc..0aa3e57 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -2048,6 +2048,9 @@ FreeBulkInsertState(BulkInsertState bistate)
* This causes rows to be frozen, which is an MVCC violation and
* requires explicit options chosen by user.
*
+ * If HEAP_INSERT_SPECULATIVE is specified, the MyProc->specInsert fields
+ * are filled.
+ *
* Note that these options will be applied when inserting into the heap's
* TOAST table, too, if the tuple requires any out-of-line data.
*
@@ -2196,6 +2199,13 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
END_CRIT_SECTION();
+ /*
+ * Let others know that we speculatively inserted this tuple, before
+ * releasing the buffer lock.
+ */
+ if (options & HEAP_INSERT_SPECULATIVE)
+ SetSpeculativeInsertionTid(relation->rd_node, &heaptup->t_self);
+
UnlockReleaseBuffer(buffer);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
@@ -2616,11 +2626,17 @@ xmax_infomask_changed(uint16 new_infomask, uint16 old_infomask)
* (the last only for HeapTupleSelfUpdated, since we
* cannot obtain cmax from a combocid generated by another transaction).
* See comments for struct HeapUpdateFailureData for additional info.
+ *
+ * If 'killspeculative' is true, caller requires that we "super-delete" a tuple
+ * we just inserted in the same command. Instead of the normal visibility
+ * checks, we check that the tuple was inserted by the current transaction and
+ * given command id. Also, instead of setting its xmax, we set xmin to
+ * invalid, making it immediately appear as dead to everyone.
*/
HTSU_Result
heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd)
+ HeapUpdateFailureData *hufd, bool killspeculative)
{
HTSU_Result result;
TransactionId xid = GetCurrentTransactionId();
@@ -2678,7 +2694,18 @@ heap_delete(Relation relation, ItemPointer tid,
tp.t_self = *tid;
l1:
- result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ if (!killspeculative)
+ {
+ result = HeapTupleSatisfiesUpdate(&tp, cid, buffer);
+ }
+ else
+ {
+ if (tp.t_data->t_choice.t_heap.t_xmin != xid ||
+ tp.t_data->t_choice.t_heap.t_field3.t_cid != cid)
+ elog(ERROR, "attempted to super-delete a tuple from other CID");
+ result = HeapTupleMayBeUpdated;
+ }
+
if (result == HeapTupleInvisible)
{
@@ -2823,12 +2850,15 @@ l1:
* using our own TransactionId below, since some other backend could
* incorporate our XID into a MultiXact immediately afterwards.)
*/
- MultiXactIdSetOldestMember();
+ if (!killspeculative)
+ {
+ MultiXactIdSetOldestMember();
- compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
- tp.t_data->t_infomask, tp.t_data->t_infomask2,
- xid, LockTupleExclusive, true,
- &new_xmax, &new_infomask, &new_infomask2);
+ compute_new_xmax_infomask(HeapTupleHeaderGetRawXmax(tp.t_data),
+ tp.t_data->t_infomask, tp.t_data->t_infomask2,
+ xid, LockTupleExclusive, true,
+ &new_xmax, &new_infomask, &new_infomask2);
+ }
START_CRIT_SECTION();
@@ -2855,8 +2885,23 @@ l1:
tp.t_data->t_infomask |= new_infomask;
tp.t_data->t_infomask2 |= new_infomask2;
HeapTupleHeaderClearHotUpdated(tp.t_data);
- HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
- HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ /*
+ * When killing a speculatively-inserted tuple, we set xmin to invalid
+ * instead of setting xmax, to make the tuple clearly invisible to
+ * everyone. In particular, we want HeapTupleSatisfiesDirty() to regard
+ * the tuple as dead, so that another backend inserting a duplicate key
+ * value won't unnecessarily wait for our transaction to finish.
+ */
+ if (!killspeculative)
+ {
+ HeapTupleHeaderSetXmax(tp.t_data, new_xmax);
+ HeapTupleHeaderSetCmax(tp.t_data, cid, iscombo);
+ }
+ else
+ {
+ HeapTupleHeaderSetXmin(tp.t_data, InvalidTransactionId);
+ }
+
/* Make sure there is no forward chain link in t_ctid */
tp.t_data->t_ctid = tp.t_self;
@@ -2872,7 +2917,11 @@ l1:
if (RelationIsAccessibleInLogicalDecoding(relation))
log_heap_new_cid(relation, &tp);
- xlrec.flags = all_visible_cleared ? XLOG_HEAP_ALL_VISIBLE_CLEARED : 0;
+ xlrec.flags = 0;
+ if (all_visible_cleared)
+ xlrec.flags |= XLOG_HEAP_ALL_VISIBLE_CLEARED;
+ if (killspeculative)
+ xlrec.flags |= XLOG_HEAP_KILLED_SPECULATIVE_TUPLE;
xlrec.infobits_set = compute_infobits(tp.t_data->t_infomask,
tp.t_data->t_infomask2);
xlrec.offnum = ItemPointerGetOffsetNumber(&tp.t_self);
@@ -2977,7 +3026,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
result = heap_delete(relation, tid,
GetCurrentCommandId(true), InvalidSnapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd, false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -4070,14 +4119,16 @@ get_mxact_status_for_lock(LockTupleMode mode, bool is_update)
*
* Function result may be:
* HeapTupleMayBeUpdated: lock was successfully acquired
+ * HeapTupleInvisible: lock failed because tuple instantaneously invisible
* HeapTupleSelfUpdated: lock failed because tuple updated by self
* HeapTupleUpdated: lock failed because tuple updated by other xact
* HeapTupleWouldBlock: lock couldn't be acquired and wait_policy is skip
*
- * In the failure cases, the routine fills *hufd with the tuple's t_ctid,
- * t_xmax (resolving a possible MultiXact, if necessary), and t_cmax
- * (the last only for HeapTupleSelfUpdated, since we
- * cannot obtain cmax from a combocid generated by another transaction).
+ * In the failure cases other than HeapTupleInvisible, the routine fills
+ * *hufd with the tuple's t_ctid, t_xmax (resolving a possible MultiXact,
+ * if necessary), and t_cmax (the last only for HeapTupleSelfUpdated,
+ * since we cannot obtain cmax from a combocid generated by another
+ * transaction).
* See comments for struct HeapUpdateFailureData for additional info.
*
* See README.tuplock for a thorough explanation of this mechanism.
@@ -4115,8 +4166,15 @@ l3:
if (result == HeapTupleInvisible)
{
- UnlockReleaseBuffer(*buffer);
- elog(ERROR, "attempted to lock invisible tuple");
+ LockBuffer(*buffer, BUFFER_LOCK_UNLOCK);
+
+ /*
+ * This is possible, but only when locking a tuple for speculative
+ * insertion. We return this value here rather than throwing an error
+ * in order to give that case the opportunity to throw a more specific
+ * error.
+ */
+ return HeapTupleInvisible;
}
else if (result == HeapTupleBeingUpdated)
{
@@ -7326,7 +7384,10 @@ heap_xlog_delete(XLogReaderState *record)
HeapTupleHeaderClearHotUpdated(htup);
fix_infomask_from_infobits(xlrec->infobits_set,
&htup->t_infomask, &htup->t_infomask2);
- HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ if (!(xlrec->flags & XLOG_HEAP_KILLED_SPECULATIVE_TUPLE))
+ HeapTupleHeaderSetXmax(htup, xlrec->xmax);
+ else
+ HeapTupleHeaderSetXmin(htup, InvalidTransactionId);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Mark the page as a candidate for pruning */
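
Aside on the heap_delete() change above: the point of setting xmin to invalid, rather than setting xmax, can be shown with a small standalone model. This is only a sketch. ToyTuple, toy_delete(), toy_kill_speculative() and toy_xid_to_wait_for() are hypothetical names and are not part of the patch or of PostgreSQL, and the "dirty snapshot" rule is reduced to the single in-progress case that the comment in heap_delete() talks about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT the PostgreSQL definitions. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID ((ToyXid) 0)

typedef struct ToyTuple
{
    ToyXid      xmin;   /* inserting transaction */
    ToyXid      xmax;   /* deleting transaction, or TOY_INVALID_XID */
} ToyTuple;

/* Ordinary delete: stamp xmax, leave the inserter's xmin in place. */
static void
toy_delete(ToyTuple *tup, ToyXid deleter)
{
    tup->xmax = deleter;
}

/* "Super-delete": invalidate xmin so that nobody can regard the tuple as live. */
static void
toy_kill_speculative(ToyTuple *tup)
{
    tup->xmin = TOY_INVALID_XID;
}

/*
 * Simplified dirty-snapshot style check (assumption: only the case where the
 * inserting transaction is still running is modelled).  Returns the xid that
 * a backend inserting a duplicate key would have to wait for, or
 * TOY_INVALID_XID if it does not need to wait at all.
 */
static ToyXid
toy_xid_to_wait_for(const ToyTuple *tup, bool inserter_in_progress)
{
    if (tup->xmin == TOY_INVALID_XID)
        return TOY_INVALID_XID;     /* dead to everyone: no wait */
    if (inserter_in_progress)
        return tup->xmin;           /* outcome unknown: must wait */
    return TOY_INVALID_XID;
}
```

In this model an ordinary delete by the still-running inserter leaves a would-be duplicate inserter waiting on that transaction, while the speculative kill lets it proceed immediately.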
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 932c6f7..1a4e18d 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -51,7 +51,8 @@ static Buffer _bt_newroot(Relation rel, Buffer lbuf, Buffer rbuf);
static TransactionId _bt_check_unique(Relation rel, IndexTuple itup,
Relation heapRel, Buffer buf, OffsetNumber offset,
ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique);
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken);
static void _bt_findinsertloc(Relation rel,
Buffer *bufptr,
OffsetNumber *offsetptr,
@@ -159,17 +160,27 @@ top:
*/
if (checkUnique != UNIQUE_CHECK_NO)
{
- TransactionId xwait;
+ TransactionId xwait;
+ uint32 speculativeToken;
offset = _bt_binsrch(rel, buf, natts, itup_scankey, false);
xwait = _bt_check_unique(rel, itup, heapRel, buf, offset, itup_scankey,
- checkUnique, &is_unique);
+ checkUnique, &is_unique, &speculativeToken);
if (TransactionIdIsValid(xwait))
{
/* Have to wait for the other guy ... */
_bt_relbuf(rel, buf);
- XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+ /*
+ * If it's a speculative insertion, wait for it to finish (i.e.,
+ * to go ahead with the insertion, or kill the tuple). Otherwise
+ * wait for the transaction to finish as usual.
+ */
+ if (speculativeToken)
+ SpeculativeInsertionWait(xwait, speculativeToken);
+ else
+ XactLockTableWait(xwait, rel, &itup->t_tid, XLTW_InsertIndex);
+
/* start over... */
_bt_freestack(stack);
goto top;
@@ -211,9 +222,12 @@ top:
* also point to end-of-page, which means that the first tuple to check
* is the first tuple on the next page.
*
- * Returns InvalidTransactionId if there is no conflict, else an xact ID
- * we must wait for to see if it commits a conflicting tuple. If an actual
- * conflict is detected, no return --- just ereport().
+ * Returns InvalidTransactionId if there is no conflict, else an xact ID we
+ * must wait for to see if it commits a conflicting tuple. If an actual
+ * conflict is detected, no return --- just ereport(). If an xact ID is
+ * returned, and the conflicting tuple still has a speculative insertion in
+ * progress, *speculativeToken is set to non-zero, and the caller can wait for
+ * the verdict on the insertion using SpeculativeInsertionWait().
*
* However, if checkUnique == UNIQUE_CHECK_PARTIAL, we always return
* InvalidTransactionId because we don't want to wait. In this case we
@@ -223,7 +237,8 @@ top:
static TransactionId
_bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
Buffer buf, OffsetNumber offset, ScanKey itup_scankey,
- IndexUniqueCheck checkUnique, bool *is_unique)
+ IndexUniqueCheck checkUnique, bool *is_unique,
+ uint32 *speculativeToken)
{
TupleDesc itupdesc = RelationGetDescr(rel);
int natts = rel->rd_rel->relnatts;
@@ -340,6 +355,7 @@ _bt_check_unique(Relation rel, IndexTuple itup, Relation heapRel,
if (nbuf != InvalidBuffer)
_bt_relbuf(rel, nbuf);
/* Tell _bt_doinsert to wait... */
+ *speculativeToken = SnapshotDirty.speculativeToken;
return xwait;
}
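
Aside on the _bt_doinsert() change above: the decision it makes after _bt_check_unique() returns can be summarised as a three-way choice. The sketch below is standalone; ToyWait and toy_choose_wait() are hypothetical names used only for illustration, and a zero xid stands in for InvalidTransactionId.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins; these are NOT the PostgreSQL definitions. */
typedef uint32_t ToyXid;

typedef enum ToyWait
{
    TOY_WAIT_NONE,          /* no conflicting inserter to wait for */
    TOY_WAIT_SPECULATIVE,   /* wait only for the verdict on the insertion */
    TOY_WAIT_XACT           /* wait for the whole transaction to end */
} ToyWait;

/*
 * Mirror of the branch in the hunk above: a non-zero speculative token means
 * the conflicting tuple is still a speculative insertion, so the waiter only
 * needs to block until that insertion is confirmed or killed.
 */
static ToyWait
toy_choose_wait(ToyXid xwait, uint32_t speculativeToken)
{
    if (xwait == 0)
        return TOY_WAIT_NONE;
    if (speculativeToken != 0)
        return TOY_WAIT_SPECULATIVE;
    return TOY_WAIT_XACT;
}
```

The shorter wait matters because a speculative inserter may back out and take the UPDATE or IGNORE path long before its transaction commits.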
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index f85ed93..e986d7e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -1662,6 +1662,10 @@ BuildIndexInfo(Relation index)
/* other info */
ii->ii_Unique = indexStruct->indisunique;
ii->ii_ReadyForInserts = IndexIsReady(indexStruct);
+ /* assume not doing speculative insertion for now */
+ ii->ii_UniqueOps = NULL;
+ ii->ii_UniqueProcs = NULL;
+ ii->ii_UniqueStrats = NULL;
/* initialize index-build state to default */
ii->ii_Concurrent = false;
@@ -1671,6 +1675,53 @@ BuildIndexInfo(Relation index)
}
/* ----------------
+ * AddUniqueSpeculative
+ * Append extra state to IndexInfo record
+ *
+ * For unique indexes, we usually don't want to add info to the IndexInfo for
+ * checking uniqueness, since the B-Tree AM handles that directly. However, in
+ * the case of speculative insertion, external support is required.
+ *
+ * Do this processing here rather than in BuildIndexInfo() to save the common
+ * non-speculative cases the overhead they'd otherwise incur.
+ * ----------------
+ */
+void
+AddUniqueSpeculative(Relation index, IndexInfo *ii)
+{
+ int ncols = index->rd_rel->relnatts;
+ int i;
+
+ /*
+ * fetch info for checking unique indexes
+ */
+ Assert(ii->ii_Unique);
+
+ if (index->rd_rel->relam != BTREE_AM_OID)
+ elog(ERROR, "unexpected non-btree speculative unique index");
+
+ ii->ii_UniqueOps = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueProcs = (Oid *) palloc(sizeof(Oid) * ncols);
+ ii->ii_UniqueStrats = (uint16 *) palloc(sizeof(uint16) * ncols);
+
+ /*
+ * Unique B-Tree indexes always use the equality strategy. For each
+ * index column, look up the equality operator for the opclass input
+ * type, along with the OID of its underlying function.
+ */
+ for (i = 0; i < ncols; i++)
+ {
+ ii->ii_UniqueStrats[i] = BTEqualStrategyNumber;
+ ii->ii_UniqueOps[i] =
+ get_opfamily_member(index->rd_opfamily[i],
+ index->rd_opcintype[i],
+ index->rd_opcintype[i],
+ ii->ii_UniqueStrats[i]);
+ ii->ii_UniqueProcs[i] = get_opcode(ii->ii_UniqueOps[i]);
+ }
+}
+
+/* ----------------
* FormIndexDatum
* Construct values[] and isnull[] arrays for a new index tuple.
*
@@ -2606,10 +2657,10 @@ IndexCheckExclusion(Relation heapRelation,
/*
* Check that this tuple has no conflicts.
*/
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- &(heapTuple->t_self), values, isnull,
- estate, true, false);
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &(heapTuple->t_self),
+ values, isnull, estate, true,
+ false, true, NULL);
}
heap_endscan(scan);
diff --git a/src/backend/catalog/indexing.c b/src/backend/catalog/indexing.c
index fe123ad..0231084 100644
--- a/src/backend/catalog/indexing.c
+++ b/src/backend/catalog/indexing.c
@@ -46,7 +46,7 @@ CatalogOpenIndexes(Relation heapRel)
resultRelInfo->ri_RelationDesc = heapRel;
resultRelInfo->ri_TrigDesc = NULL; /* we don't fire triggers */
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, false);
return resultRelInfo;
}
diff --git a/src/backend/commands/constraint.c b/src/backend/commands/constraint.c
index 561d8fa..d5ab12f 100644
--- a/src/backend/commands/constraint.c
+++ b/src/backend/commands/constraint.c
@@ -170,9 +170,10 @@ unique_key_recheck(PG_FUNCTION_ARGS)
* For exclusion constraints we just do the normal check, but now it's
* okay to throw error.
*/
- check_exclusion_constraint(trigdata->tg_relation, indexRel, indexInfo,
- &(new_row->t_self), values, isnull,
- estate, false, false);
+ check_exclusion_or_unique_constraint(trigdata->tg_relation, indexRel,
+ indexInfo, &(new_row->t_self),
+ values, isnull, estate, false,
+ false, true, NULL);
}
/*
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index d2996fb..2d45eb3 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2283,7 +2283,7 @@ CopyFrom(CopyState cstate)
1, /* dummy rangetable index */
0);
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, false);
estate->es_result_relations = resultRelInfo;
estate->es_num_result_relations = 1;
@@ -2438,7 +2438,8 @@ CopyFrom(CopyState cstate)
if (resultRelInfo->ri_NumIndices > 0)
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false,
+ InvalidOid);
/* AFTER ROW INSERT Triggers */
ExecARInsertTriggers(estate, resultRelInfo, tuple,
@@ -2552,7 +2553,7 @@ CopyFromInsertBatch(CopyState cstate, EState *estate, CommandId mycid,
ExecStoreTuple(bufferedTuples[i], myslot, InvalidBuffer, false);
recheckIndexes =
ExecInsertIndexTuples(myslot, &(bufferedTuples[i]->t_self),
- estate);
+ estate, false, InvalidOid);
ExecARInsertTriggers(estate, resultRelInfo,
bufferedTuples[i],
recheckIndexes);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a951c55..3f79713 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -103,7 +103,8 @@ static void ExplainIndexScanDetails(Oid indexid, ScanDirection indexorderdir,
static void ExplainScanTarget(Scan *plan, ExplainState *es);
static void ExplainModifyTarget(ModifyTable *plan, ExplainState *es);
static void ExplainTargetRel(Plan *plan, Index rti, ExplainState *es);
-static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es);
+static void show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors);
static void ExplainMemberNodes(List *plans, PlanState **planstates,
List *ancestors, ExplainState *es);
static void ExplainSubPlans(List *plans, List *ancestors,
@@ -762,6 +763,9 @@ ExplainPreScanNode(PlanState *planstate, Bitmapset **rels_used)
ExplainPreScanMemberNodes(((ModifyTable *) plan)->plans,
((ModifyTableState *) planstate)->mt_plans,
rels_used);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainPreScanNode(((ModifyTableState *) planstate)->onConflict,
+ rels_used);
break;
case T_Append:
ExplainPreScanMemberNodes(((Append *) plan)->appendplans,
@@ -863,6 +867,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
const char *custom_name = NULL;
int save_indent = es->indent;
bool haschildren;
+ bool suppresschildren = false;
+ ModifyTable *mtplan;
switch (nodeTag(plan))
{
@@ -871,13 +877,33 @@ ExplainNode(PlanState *planstate, List *ancestors,
break;
case T_ModifyTable:
sname = "ModifyTable";
- switch (((ModifyTable *) plan)->operation)
+ mtplan = (ModifyTable *) plan;
+ switch (mtplan->operation)
{
case CMD_INSERT:
pname = operation = "Insert";
break;
case CMD_UPDATE:
- pname = operation = "Update";
+ if (mtplan->spec == SPEC_NONE)
+ {
+ pname = operation = "Update";
+ }
+ else
+ {
+ Assert(mtplan->spec == SPEC_UPDATE);
+
+ pname = operation = "Conflict Update";
+
+ /*
+ * Do not display child sequential scan/result node.
+ * Quals from child will be directly attributed to
+ * ModifyTable node, since we prefer to avoid
+ * displaying scan node to users, as it is merely an
+ * implementation detail; it is never executed in the
+ * conventional way.
+ */
+ suppresschildren = true;
+ }
break;
case CMD_DELETE:
pname = operation = "Delete";
@@ -1457,7 +1483,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate, es);
break;
case T_ModifyTable:
- show_modifytable_info((ModifyTableState *) planstate, es);
+ show_modifytable_info((ModifyTableState *) planstate, es,
+ ancestors);
break;
case T_Hash:
show_hash_info((HashState *) planstate, es);
@@ -1585,7 +1612,8 @@ ExplainNode(PlanState *planstate, List *ancestors,
planstate->subPlan;
if (haschildren)
{
- ExplainOpenGroup("Plans", "Plans", false, es);
+ if (!suppresschildren)
+ ExplainOpenGroup("Plans", "Plans", false, es);
/* Pass current PlanState as head of ancestors list for children */
ancestors = lcons(planstate, ancestors);
}
@@ -1608,9 +1636,13 @@ ExplainNode(PlanState *planstate, List *ancestors,
switch (nodeTag(plan))
{
case T_ModifyTable:
- ExplainMemberNodes(((ModifyTable *) plan)->plans,
- ((ModifyTableState *) planstate)->mt_plans,
- ancestors, es);
+ if (((ModifyTable *) plan)->spec != SPEC_UPDATE)
+ ExplainMemberNodes(((ModifyTable *) plan)->plans,
+ ((ModifyTableState *) planstate)->mt_plans,
+ ancestors, es);
+ if (((ModifyTable *) plan)->onConflictPlan)
+ ExplainNode(((ModifyTableState *) planstate)->onConflict,
+ ancestors, "Member", NULL, es);
break;
case T_Append:
ExplainMemberNodes(((Append *) plan)->appendplans,
@@ -1648,7 +1680,9 @@ ExplainNode(PlanState *planstate, List *ancestors,
if (haschildren)
{
ancestors = list_delete_first(ancestors);
- ExplainCloseGroup("Plans", "Plans", false, es);
+
+ if (!suppresschildren)
+ ExplainCloseGroup("Plans", "Plans", false, es);
}
/* in text format, undo whatever indentation we added */
@@ -2191,7 +2225,22 @@ ExplainScanTarget(Scan *plan, ExplainState *es)
static void
ExplainModifyTarget(ModifyTable *plan, ExplainState *es)
{
+ /*
+ * We show the name of the first target relation. In multi-target-table
+ * cases this should always be the parent of the inheritance tree.
+ */
+ Assert(plan->resultRelations != NIL);
+
ExplainTargetRel((Plan *) plan, plan->nominalRelation, es);
+
+ if (plan->arbiterIndex != InvalidOid)
+ {
+ char *indexname = get_rel_name(plan->arbiterIndex);
+
+ /* nothing to do for text format explains */
+ if (es->format != EXPLAIN_FORMAT_TEXT && indexname != NULL)
+ ExplainPropertyText("Arbiter Index", indexname, es);
+ }
}
/*
@@ -2227,6 +2276,12 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
if (es->verbose)
namespace = get_namespace_name(get_rel_namespace(rte->relid));
objecttag = "Relation Name";
+
+ /*
+ * ON CONFLICT's "TARGET" alias will not appear in output for
+ * auxiliary ModifyTable as its alias, because target
+ * resultRelation is shared between parent and auxiliary queries
+ */
break;
case T_FunctionScan:
{
@@ -2305,7 +2360,8 @@ ExplainTargetRel(Plan *plan, Index rti, ExplainState *es)
* Show extra information for a ModifyTable node
*/
static void
-show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
+show_modifytable_info(ModifyTableState *mtstate, ExplainState *es,
+ List *ancestors)
{
FdwRoutine *fdwroutine = mtstate->resultRelInfo->ri_FdwRoutine;
@@ -2327,6 +2383,23 @@ show_modifytable_info(ModifyTableState *mtstate, ExplainState *es)
0,
es);
}
+ else if (mtstate->spec == SPEC_UPDATE)
+ {
+ PlanState *ps = (*mtstate->mt_plans);
+
+ /*
+ * Seqscan node is always used, unless optimizer determined that
+ * predicate precludes ever updating, in which case a simple Result
+ * node is possible
+ */
+ Assert(IsA(ps->plan, SeqScan) || IsA(ps->plan, Result));
+
+ /* Attribute child scan node's qual to ModifyTable node */
+ show_scan_qual(ps->plan->qual, "Filter", ps, ancestors, es);
+
+ if (ps->plan->qual)
+ show_instrumentation_count("Rows Removed by Filter", 1, ps, es);
+ }
}
/*
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dbcebb7..3d7761d 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2122,7 +2122,8 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* the latest version of the row was deleted, so we need do
* nothing. (Should be safe to examine xmin without getting
* buffer's content lock, since xmin never changes in an existing
- * tuple.)
+ * non-promise tuple, and there is no reason to lock a promise
+ * tuple until it is clear that it has been fulfilled.)
*/
if (!TransactionIdEquals(HeapTupleHeaderGetXmin(tuple.t_data),
priorXmax))
@@ -2203,11 +2204,12 @@ EvalPlanQualFetch(EState *estate, Relation relation, int lockmode,
* case, so as to avoid the "Halloween problem" of
* repeated update attempts. In the latter case it might
* be sensible to fetch the updated tuple instead, but
- * doing so would require changing heap_lock_tuple as well
- * as heap_update and heap_delete to not complain about
- * updating "invisible" tuples, which seems pretty scary.
- * So for now, treat the tuple as deleted and do not
- * process.
+ * doing so would require changing heap_update and
+ * heap_delete to not complain about updating "invisible"
+ * tuples, which seems pretty scary (heap_lock_tuple will
+ * not complain, but few callers expect HeapTupleInvisible,
+ * and we're not one of them). So for now, treat the tuple
+ * as deleted and do not process.
*/
ReleaseBuffer(buffer);
return NULL;
diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index fec76d4..072e002 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -182,6 +182,9 @@ static Datum ExecEvalArrayCoerceExpr(ArrayCoerceExprState *astate,
bool *isNull, ExprDoneCond *isDone);
static Datum ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
bool *isNull, ExprDoneCond *isDone);
+static Datum ExecEvalExcluded(ExcludedExprState *excludedExpr,
+ ExprContext *econtext, bool *isNull,
+ ExprDoneCond *isDone);
/* ----------------------------------------------------------------
@@ -4331,6 +4334,33 @@ ExecEvalCurrentOfExpr(ExprState *exprstate, ExprContext *econtext,
return 0; /* keep compiler quiet */
}
+/* ----------------------------------------------------------------
+ * ExecEvalExcluded
+ * ----------------------------------------------------------------
+ */
+static Datum
+ExecEvalExcluded(ExcludedExprState *excludedExpr, ExprContext *econtext,
+ bool *isNull, ExprDoneCond *isDone)
+{
+ /*
+ * ExcludedExpr is essentially an expression that adapts its single Var
+ * argument to refer to the expression context inner slot's tuple, which is
+ * reserved for the purpose of referencing EXCLUDED.* tuples within ON
+ * CONFLICT UPDATE auxiliary queries' EPQ expression context (ON CONFLICT
+ * UPDATE makes special use of the EvalPlanQual() mechanism to update).
+ *
+ * nodeModifyTable.c assigns its own table slot in the auxiliary queries'
+ * EPQ expression state (originating in the parent INSERT node) on the
+ * assumption that it may only be used by ExcludedExpr, and on the
+ * assumption that the inner slot is not otherwise useful. This occurs in
+ * advance of the expression evaluation for UPDATE (which calls here are
+ * part of) once per slot proposed for insertion, and works because of
+ * restrictions on the structure of ON CONFLICT UPDATE auxiliary queries.
+ *
+ * Just evaluate nested Var.
+ */
+ return ExecEvalScalarVar(excludedExpr->arg, econtext, isNull, isDone);
+}
/*
* ExecEvalExprSwitchContext
@@ -5058,6 +5088,30 @@ ExecInitExpr(Expr *node, PlanState *parent)
state = (ExprState *) makeNode(ExprState);
state->evalfunc = ExecEvalCurrentOfExpr;
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExprState *cstate = makeNode(ExcludedExprState);
+ Var *contained = (Var *) excludedexpr->arg;
+
+ /*
+ * varno forced to INNER_VAR -- see remarks within
+ * ExecLockUpdateTuple().
+ *
+ * We rely on the assumption that the only place that
+ * ExcludedExpr may appear is where EXCLUDED Var references
+ * originally appeared after parse analysis. The rewriter
+ * replaces these with ExcludedExpr that reference the
+ * corresponding Var within the ON CONFLICT UPDATE target RTE.
+ */
+ Assert(IsA(contained, Var));
+
+ contained->varno = INNER_VAR;
+ cstate->arg = ExecInitExpr((Expr *) contained, parent);
+ state = (ExprState *) cstate;
+ state->evalfunc = (ExprStateEvalFunc) ExecEvalExcluded;
+ }
+ break;
case T_TargetEntry:
{
TargetEntry *tle = (TargetEntry *) node;
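
Aside on the ExcludedExpr handling above: the effect of forcing varno to INNER_VAR is that the wrapped Var reads from the inner slot (the tuple proposed for insertion) instead of the scan slot (the existing, conflicting row). A standalone sketch of that redirection follows. Every Toy* name is hypothetical and the slot layout is deliberately trivial; this is not how the executor represents tuples.

```c
#include <assert.h>

/* Toy stand-ins; these are NOT the PostgreSQL definitions. */
typedef struct ToySlot
{
    const int  *values;         /* one int per attribute */
} ToySlot;

typedef struct ToyExprContext
{
    ToySlot    *scantuple;      /* existing row that the UPDATE targets */
    ToySlot    *innertuple;     /* row proposed for insertion (EXCLUDED.*) */
} ToyExprContext;

typedef enum ToyVarno
{
    TOY_SCAN_VAR,
    TOY_INNER_VAR
} ToyVarno;

typedef struct ToyVar
{
    ToyVarno    varno;
    int         attno;          /* zero-based in this sketch */
} ToyVar;

typedef struct ToyExcluded
{
    ToyVar      arg;            /* the single wrapped Var */
} ToyExcluded;

/* Fetch an attribute from whichever slot the Var points at. */
static int
toy_eval_var(const ToyVar *var, const ToyExprContext *econtext)
{
    const ToySlot *slot = (var->varno == TOY_INNER_VAR)
        ? econtext->innertuple
        : econtext->scantuple;

    return slot->values[var->attno];
}

/* Initialisation step: redirect the wrapped Var to the inner slot. */
static void
toy_init_excluded(ToyExcluded *expr)
{
    expr->arg.varno = TOY_INNER_VAR;
}

/* Evaluation step: just evaluate the nested Var. */
static int
toy_eval_excluded(const ToyExcluded *expr, const ToyExprContext *econtext)
{
    return toy_eval_var(&expr->arg, econtext);
}
```

With an existing row in the scan slot and the proposed row in the inner slot, the same attribute number yields the stored value through a plain Var and the proposed value through the wrapped one.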
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 022041b..f1e7241 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -44,6 +44,7 @@
#include "access/relscan.h"
#include "access/transam.h"
+#include "access/xact.h"
#include "catalog/index.h"
#include "executor/execdebug.h"
#include "nodes/nodeFuncs.h"
@@ -885,7 +886,7 @@ ExecCloseScanRelation(Relation scanrel)
* ----------------------------------------------------------------
*/
void
-ExecOpenIndices(ResultRelInfo *resultRelInfo)
+ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative)
{
Relation resultRelation = resultRelInfo->ri_RelationDesc;
List *indexoidlist;
@@ -938,6 +939,13 @@ ExecOpenIndices(ResultRelInfo *resultRelInfo)
/* extract index key information from the index's pg_index info */
ii = BuildIndexInfo(indexDesc);
+ /*
+ * Iff the indexes are to be used for speculative insertion, add extra
+ * information required by unique index entries
+ */
+ if (speculative && ii->ii_Unique)
+ AddUniqueSpeculative(indexDesc, ii);
+
relationDescs[i] = indexDesc;
indexInfoArray[i] = ii;
i++;
@@ -990,7 +998,8 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
*
* This returns a list of index OIDs for any unique or exclusion
* constraints that are deferred and that had
- * potential (unconfirmed) conflicts.
+ * potential (unconfirmed) conflicts. (if noDupErr == true, the
+ * same is done for non-deferred constraints)
*
* CAUTION: this must not be called for a HOT update.
* We can't defend against that here for lack of info.
@@ -1000,7 +1009,9 @@ ExecCloseIndices(ResultRelInfo *resultRelInfo)
List *
ExecInsertIndexTuples(TupleTableSlot *slot,
ItemPointer tupleid,
- EState *estate)
+ EState *estate,
+ bool noDupErr,
+ Oid arbiterIdx)
{
List *result = NIL;
ResultRelInfo *resultRelInfo;
@@ -1070,7 +1081,17 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
/* Skip this index-update if the predicate isn't satisfied */
if (!ExecQual(predicate, econtext, false))
+ {
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ ereport(ERROR,
+ (errcode(ERRCODE_TRIGGERED_ACTION_EXCEPTION),
+ errmsg("partial arbiter unique index has predicate that does not cover tuple proposed for insertion"),
+ errdetail("ON CONFLICT inference clause implies that the tuple proposed for insertion must be covered by predicate for partial index \"%s\".",
+ RelationGetRelationName(indexRelation)),
+ errtableconstraint(heapRelation,
+ RelationGetRelationName(indexRelation))));
continue;
+ }
}
/*
@@ -1092,9 +1113,16 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* For a deferrable unique index, we tell the index AM to just detect
* possible non-uniqueness, and we add the index OID to the result
* list if further checking is needed.
+ *
+ * For a speculative insertion (ON CONFLICT UPDATE/IGNORE), just detect
+ * possible non-uniqueness, and tell the caller if it failed.
*/
if (!indexRelation->rd_index->indisunique)
checkUnique = UNIQUE_CHECK_NO;
+ else if (noDupErr && arbiterIdx == InvalidOid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
+ else if (noDupErr && arbiterIdx == indexRelation->rd_index->indexrelid)
+ checkUnique = UNIQUE_CHECK_PARTIAL;
else if (indexRelation->rd_index->indimmediate)
checkUnique = UNIQUE_CHECK_YES;
else
@@ -1112,8 +1140,11 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* If the index has an associated exclusion constraint, check that.
* This is simpler than the process for uniqueness checks since we
* always insert first and then check. If the constraint is deferred,
- * we check now anyway, but don't throw error on violation; instead
- * we'll queue a recheck event.
+ * we check now anyway, but don't throw error on violation or wait for
+ * a conclusive outcome from a concurrent insertion; instead we'll
+ * queue a recheck event. Similarly, noDupErr callers (speculative
+ * inserters) will recheck later, and wait for a conclusive outcome
+ * then.
*
* An index for an exclusion constraint can't also be UNIQUE (not an
* essential property, we just don't allow it in the grammar), so no
@@ -1121,13 +1152,15 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
*/
if (indexInfo->ii_ExclusionOps != NULL)
{
- bool errorOK = !indexRelation->rd_index->indimmediate;
+ bool violationOK = (!indexRelation->rd_index->indimmediate ||
+ noDupErr);
satisfiesConstraint =
- check_exclusion_constraint(heapRelation,
- indexRelation, indexInfo,
- tupleid, values, isnull,
- estate, false, errorOK);
+ check_exclusion_or_unique_constraint(heapRelation,
+ indexRelation, indexInfo,
+ tupleid, values, isnull,
+ estate, false,
+ violationOK, false, NULL);
}
if ((checkUnique == UNIQUE_CHECK_PARTIAL ||
@@ -1135,7 +1168,7 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
!satisfiesConstraint)
{
/*
- * The tuple potentially violates the uniqueness or exclusion
+ * The tuple potentially violates the unique index or exclusion
* constraint, so make a note of the index so that we can re-check
* it later.
*/
@@ -1146,18 +1179,154 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
return result;
}
+/* ----------------------------------------------------------------
+ * ExecCheckIndexConstraints
+ *
+ * This routine checks if a tuple violates any unique or
+ * exclusion constraints. If no conflict, returns true.
+ * Otherwise returns false, and the TID of the conflicting
+ * tuple is returned in *conflictTid
+ *
+ * Note that this doesn't lock the values in any way, so it's
+ * possible that a conflicting tuple is inserted immediately
+ * after this returns, and a later insert with the same values
+ * still conflicts. But this can be used for a pre-check before
+ * insertion.
+ * ----------------------------------------------------------------
+ */
+bool
+ExecCheckIndexConstraints(TupleTableSlot *slot,
+ EState *estate, ItemPointer conflictTid,
+ Oid arbiterIdx)
+{
+ ResultRelInfo *resultRelInfo;
+ int i;
+ int numIndices;
+ RelationPtr relationDescs;
+ Relation heapRelation;
+ IndexInfo **indexInfoArray;
+ ExprContext *econtext;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ ItemPointerData invalidItemPtr;
+ bool checkedIndex = false;
+
+ ItemPointerSetInvalid(conflictTid);
+ ItemPointerSetInvalid(&invalidItemPtr);
+
+ /*
+ * Get information from the result relation info structure.
+ */
+ resultRelInfo = estate->es_result_relation_info;
+ numIndices = resultRelInfo->ri_NumIndices;
+ relationDescs = resultRelInfo->ri_IndexRelationDescs;
+ indexInfoArray = resultRelInfo->ri_IndexRelationInfo;
+ heapRelation = resultRelInfo->ri_RelationDesc;
+
+ /*
+ * We will use the EState's per-tuple context for evaluating predicates
+ * and index expressions (creating it if it's not already there).
+ */
+ econtext = GetPerTupleExprContext(estate);
+
+ /* Arrange for econtext's scan tuple to be the tuple under test */
+ econtext->ecxt_scantuple = slot;
+
+ /*
+ * for each index, form the index tuple and check it for conflicts
+ */
+ for (i = 0; i < numIndices; i++)
+ {
+ Relation indexRelation = relationDescs[i];
+ IndexInfo *indexInfo;
+ bool satisfiesConstraint;
+
+ if (indexRelation == NULL)
+ continue;
+
+ indexInfo = indexInfoArray[i];
+
+ if (!indexInfo->ii_Unique && !indexInfo->ii_ExclusionOps)
+ continue;
+
+ /* If the index is marked as read-only, ignore it */
+ if (!indexInfo->ii_ReadyForInserts)
+ continue;
+
+ /* When specific arbiter index requested, only examine it */
+ if (arbiterIdx != InvalidOid &&
+ arbiterIdx != indexRelation->rd_index->indexrelid)
+ continue;
+
+ checkedIndex = true;
+
+ /* Check for partial index */
+ if (indexInfo->ii_Predicate != NIL)
+ {
+ List *predicate;
+
+ /*
+ * If predicate state not set up yet, create it (in the estate's
+ * per-query context)
+ */
+ predicate = indexInfo->ii_PredicateState;
+ if (predicate == NIL)
+ {
+ predicate = (List *)
+ ExecPrepareExpr((Expr *) indexInfo->ii_Predicate,
+ estate);
+ indexInfo->ii_PredicateState = predicate;
+ }
+
+ /* Skip this index if the predicate isn't satisfied */
+ if (!ExecQual(predicate, econtext, false))
+ continue;
+ }
+
+ /*
+ * FormIndexDatum fills in its values and isnull parameters with the
+ * appropriate values for the column(s) of the index.
+ */
+ FormIndexDatum(indexInfo,
+ slot,
+ estate,
+ values,
+ isnull);
+
+ satisfiesConstraint =
+ check_exclusion_or_unique_constraint(heapRelation, indexRelation,
+ indexInfo, &invalidItemPtr,
+ values, isnull, estate, false,
+ true, true, conflictTid);
+ if (!satisfiesConstraint)
+ return false;
+
+ /* If this was a user-specified arbiter index, we're done */
+ if (arbiterIdx == indexRelation->rd_index->indexrelid)
+ break;
+ }
+
+ if (arbiterIdx != InvalidOid && !checkedIndex)
+ elog(ERROR, "unexpected failure to find arbiter unique index");
+
+ return true;
+}
+
/*
- * Check for violation of an exclusion constraint
+ * Check for violation of an exclusion or unique constraint
*
* heap: the table containing the new tuple
* index: the index supporting the exclusion constraint
* indexInfo: info about the index, including the exclusion properties
- * tupleid: heap TID of the new tuple we have just inserted
+ * tupleid: heap TID of the new tuple we have just inserted (invalid if we
+ * haven't inserted a new tuple yet)
* values, isnull: the *index* column values computed for the new tuple
* estate: an EState we can do evaluation in
* newIndex: if true, we are trying to build a new index (this affects
* only the wording of error messages)
- * errorOK: if true, don't throw error for violation
+ * violationOK: if true, don't throw error for violation
+ * wait: if true, wait for conflicting transaction to finish, even if violationOK
+ * conflictTid: if not-NULL, the TID of conflicting tuple is returned here.
*
* Returns true if OK, false if actual or potential violation
*
@@ -1167,16 +1336,25 @@ ExecInsertIndexTuples(TupleTableSlot *slot,
* is convenient for deferred exclusion checks; we need not bother queuing
* a deferred event if there is definitely no conflict at insertion time.
*
- * When errorOK is false, we'll throw error on violation, so a false result
+ * When violationOK is false, we'll throw error on violation, so a false result
* is impossible.
+ *
+ * Note: The indexam is normally responsible for checking unique constraints,
+ * so this normally only needs to be used for exclusion constraints. But this
+ * function is also called when doing a "pre-check" for conflicts, for the
+ * benefit of speculative insertion. Caller may request that conflict TID be
+ * set, to take further steps.
*/
bool
-check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
- ItemPointer tupleid, Datum *values, bool *isnull,
- EState *estate, bool newIndex, bool errorOK)
+check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo, ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate, bool newIndex,
+ bool violationOK, bool wait,
+ ItemPointer conflictTid)
{
- Oid *constr_procs = indexInfo->ii_ExclusionProcs;
- uint16 *constr_strats = indexInfo->ii_ExclusionStrats;
+ Oid *constr_procs;
+ uint16 *constr_strats;
Oid *index_collations = index->rd_indcollation;
int index_natts = index->rd_index->indnatts;
IndexScanDesc index_scan;
@@ -1190,6 +1368,17 @@ check_exclusion_constraint(Relation heap, Relation index, IndexInfo *indexInfo,
TupleTableSlot *existing_slot;
TupleTableSlot *save_scantuple;
+ if (indexInfo->ii_ExclusionOps)
+ {
+ constr_procs = indexInfo->ii_ExclusionProcs;
+ constr_strats = indexInfo->ii_ExclusionStrats;
+ }
+ else
+ {
+ constr_procs = indexInfo->ii_UniqueProcs;
+ constr_strats = indexInfo->ii_UniqueStrats;
+ }
+
/*
* If any of the input values are NULL, the constraint check is assumed to
* pass (i.e., we assume the operators are strict).
@@ -1254,7 +1443,8 @@ retry:
/*
* Ignore the entry for the tuple we're trying to check.
*/
- if (ItemPointerEquals(tupleid, &tup->t_self))
+ if (ItemPointerIsValid(tupleid) &&
+ ItemPointerEquals(tupleid, &tup->t_self))
{
if (found_self) /* should not happen */
elog(ERROR, "found self tuple multiple times in index \"%s\"",
@@ -1284,17 +1474,6 @@ retry:
}
/*
- * At this point we have either a conflict or a potential conflict. If
- * we're not supposed to raise error, just return the fact of the
- * potential conflict without waiting to see if it's real.
- */
- if (errorOK)
- {
- conflict = true;
- break;
- }
-
- /*
* If an in-progress transaction is affecting the visibility of this
* tuple, we need to wait for it to complete and then recheck. For
* simplicity we do rechecking by just restarting the whole scan ---
@@ -1305,18 +1484,89 @@ retry:
xwait = TransactionIdIsValid(DirtySnapshot.xmin) ?
DirtySnapshot.xmin : DirtySnapshot.xmax;
+ /*
+ * At this point we have either a conflict or a potential conflict. If
+ * we're not supposed to raise error, just return the fact of the
+ * potential conflict without waiting to see if it's real.
+ */
+ if (violationOK && !wait)
+ {
+ /*
+ * For unique indexes, detecting conflict is coupled with physical
+ * index tuple insertion, so we won't be called for recheck
+ */
+ Assert(!indexInfo->ii_Unique);
+
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+
+ /*
+ * Livelock insurance.
+ *
+ * When doing a speculative insertion pre-check, we cannot have an
+ * "unprincipled deadlock" with another session, fundamentally
+ * because there is no possible mutual dependency, since we only
+ * hold a lock on our token, without attempting to lock anything
+ * else (maybe this is not the first iteration, but no matter;
+ * we'll have super deleted and released insertion token lock if
+ * so, and all locks needed are already held. Also, our XID lock
+ * is irrelevant.)
+ *
+ * In the second phase, where there is a re-check for conflicts, we
+ * can't deadlock either (we never lock another thing, since we
+ * don't wait in that phase). However, a theoretical livelock
+ * hazard exists: Two sessions could each see each other's
+ * conflicting tuple, and each could go and delete, retrying
+ * forever.
+ *
+ * To break the mutual dependency, we may wait on the other xact
+ * here over our caller's request to not do so (in the second
+ * phase). This does not imply the risk of unprincipled deadlocks
+ * either, because if we end up unexpectedly waiting, the other
+ * session will super delete its own tuple *before* releasing its
+ * token lock and freeing us, and without attempting to wait on us
+ * to release our token lock. We'll take another iteration here,
+ * after waiting on the other session's token, not find a conflict
+ * this time, and then proceed (assuming we're the oldest XID).
+ *
+ * N.B.: Unprincipled deadlocks are still theoretically possible
+ * with non-speculative insertion with exclusion constraints, but
+ * this seems inconsequential, since an error was inevitable for
+ * one of the sessions anyway. We only worry about speculative
+ * insertion's problems, since they're likely with idiomatic usage.
+ */
+ if (TransactionIdPrecedes(xwait, GetCurrentTransactionId()))
+ break; /* go and super delete/restart speculative insertion */
+ }
+
if (TransactionIdIsValid(xwait))
{
ctid_wait = tup->t_data->t_ctid;
index_endscan(index_scan);
- XactLockTableWait(xwait, heap, &ctid_wait,
- XLTW_RecheckExclusionConstr);
+ if (DirtySnapshot.speculativeToken)
+ SpeculativeInsertionWait(DirtySnapshot.xmin,
+ DirtySnapshot.speculativeToken);
+ else if (violationOK)
+ XactLockTableWait(xwait, heap, &tup->t_self,
+ XLTW_RecheckExclusionConstr);
+ else
+ XactLockTableWait(xwait, heap, &ctid_wait,
+ XLTW_RecheckExclusionConstr);
goto retry;
}
/*
- * We have a definite conflict. Report it.
+ * We have a definite conflict. Return it to caller, or report it.
*/
+ if (violationOK)
+ {
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+ break;
+ }
+
error_new = BuildIndexValueDescription(index, values, isnull);
error_existing = BuildIndexValueDescription(index, existing_values,
existing_isnull);
@@ -1352,6 +1602,9 @@ retry:
* However, it is possible to define exclusion constraints for which that
* wouldn't be true --- for instance, if the operator is <>. So we no
* longer complain if found_self is still false.
+ *
+ * It would also not be true in the pre-check mode, when we haven't
+ * inserted a tuple yet.
*/
econtext->ecxt_scantuple = save_scantuple;
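Reviewer aid (not part of the patch): the control flow that check_exclusion_or_unique_constraint() follows once a conflicting or potentially conflicting tuple has been found can be summarised as a small decision function. This is only a sketch under stated assumptions. The names ToyXid, ToyAction and toy_conflict_action are hypothetical, XIDs are compared with a plain "<" rather than the wraparound-aware TransactionIdPrecedes(), and the actual waiting, rescanning and error reporting are reduced to an enum result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for TransactionId and InvalidTransactionId. */
typedef uint32_t ToyXid;
#define TOY_INVALID_XID 0u

typedef enum
{
	RETURN_CONFLICT,			/* break out and report conflict to caller */
	WAIT_AND_RETRY,				/* wait on xwait, then restart the index scan */
	RAISE_ERROR					/* ereport() the constraint violation */
} ToyAction;

/*
 * Decide what to do about a (potentially) conflicting tuple.
 *
 * xwait is the in-progress transaction affecting the tuple's visibility, or
 * TOY_INVALID_XID if the conflict is already definite.
 */
ToyAction
toy_conflict_action(bool violationOK, bool wait, ToyXid xwait, ToyXid myXid)
{
	assert(myXid != TOY_INVALID_XID);

	/*
	 * Re-check phase of speculative insertion: normally return at once, but
	 * as livelock insurance fall through and wait when the other transaction
	 * is not older than we are.
	 */
	if (violationOK && !wait && xwait < myXid)
		return RETURN_CONFLICT;

	/* An in-progress transaction must finish before we can be sure. */
	if (xwait != TOY_INVALID_XID)
		return WAIT_AND_RETRY;

	/* Definite conflict: hand it back to the caller, or complain. */
	return violationOK ? RETURN_CONFLICT : RAISE_ERROR;
}
```

For instance, with violationOK = true and wait = false, a conflicting tuple from an older transaction yields RETURN_CONFLICT (the caller super-deletes and restarts), while one from a newer transaction yields WAIT_AND_RETRY.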
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 48107d9..4699060 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -151,10 +151,11 @@ lnext:
* case, so as to avoid the "Halloween problem" of repeated
* update attempts. In the latter case it might be sensible
* to fetch the updated tuple instead, but doing so would
- * require changing heap_lock_tuple as well as heap_update and
- * heap_delete to not complain about updating "invisible"
- * tuples, which seems pretty scary. So for now, treat the
- * tuple as deleted and do not process.
+ * require changing heap_update and heap_delete to not complain
+ * about updating "invisible" tuples, which seems pretty scary
+ * (heap_lock_tuple will not complain, but few callers expect
+ * HeapTupleInvisible, and we're not one of them). So for now,
+ * treat the tuple as deleted and do not process.
*/
goto lnext;
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index f96fb24..ec1ef07 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -46,12 +46,23 @@
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "storage/bufmgr.h"
+#include "storage/lmgr.h"
+#include "storage/procarray.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
#include "utils/tqual.h"
+static bool ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
+ ModifyTableState *onConflict,
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning);
+
/*
* Verify that the tuples to be produced by INSERT or UPDATE match the
* target relation's rowtype
@@ -151,6 +162,36 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
return ExecProject(projectReturning, NULL);
}
+/*
+ * ExecCheckHeapTupleVisible -- verify heap tuple is visible
+ *
+ * It would not be consistent with guarantees of the higher isolation levels to
+ * proceed with avoiding insertion (taking speculative insertion's alternative
+ * IGNORE/UPDATE path) on the basis of another tuple that is not visible.
+ * Check for the need to raise a serialization failure, and do so as necessary.
+ */
+static void
+ExecCheckHeapTupleVisible(EState *estate,
+ ResultRelInfo *relinfo,
+ ItemPointer tid)
+{
+
+ Relation rel = relinfo->ri_RelationDesc;
+ Buffer buffer;
+ HeapTupleData tuple;
+
+ if (!IsolationUsesXactSnapshot())
+ return;
+
+ tuple.t_self = *tid;
+ if (!heap_fetch(rel, estate->es_snapshot, &tuple, &buffer, false, NULL))
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent insert or update dictating alternative ON CONFLICT path")));
+
+ ReleaseBuffer(buffer);
+}
+
/* ----------------------------------------------------------------
* ExecInsert
*
@@ -163,6 +204,9 @@ ExecProcessReturning(ProjectionInfo *projectReturning,
static TupleTableSlot *
ExecInsert(TupleTableSlot *slot,
TupleTableSlot *planSlot,
+ ModifyTableState *onConflict,
+ Oid arbiterIndex,
+ SpecCmd spec,
EState *estate,
bool canSetTag)
{
@@ -246,6 +290,8 @@ ExecInsert(TupleTableSlot *slot,
}
else
{
+ ItemPointerData conflictTid;
+
/*
* Constraints might reference the tableoid column, so initialize
* t_tableOid before evaluating them.
@@ -259,20 +305,147 @@ ExecInsert(TupleTableSlot *slot,
ExecConstraints(resultRelInfo, slot, estate);
/*
+ * If we are performing speculative insertion, do a non-conclusive
+ * check for conflicts.
+ *
+ * Control returns here when there is 1) A row-locking conflict, or 2)
+ * an insertion conflict. See the executor README for a full
+ * discussion of speculative insertion.
+ */
+vlock:
+
+ /*
+ * XXX If we know or assume that there are few duplicates, it would be
+ * better to skip this, and just optimistically proceed with the
+ * insertion below.
+ */
+ if (spec != SPEC_NONE && resultRelInfo->ri_NumIndices > 0)
+ {
+ /*
+ * No need to check if running in bootstrap mode, since ON CONFLICT
+ * with system catalogs is generally forbidden.
+ *
+ * Check if it's required to proceed with the second phase
+ * ("insertion proper") of speculative insertion in respect of the
+ * slot. If insertion ultimately does not proceed, no firing of
+ * AFTER ROW INSERT triggers occurs.
+ *
+ * We don't suppress the effects (or, perhaps, side-effects) of
+ * BEFORE ROW INSERT triggers. This isn't ideal, but then we
+ * cannot proceed with even considering uniqueness violations until
+ * these triggers fire on the one hand, but on the other hand they
+ * have the ability to execute arbitrary user-defined code which
+ * may perform operations entirely outside the system's ability to
+ * nullify.
+ */
+ if (!ExecCheckIndexConstraints(slot, estate, &conflictTid,
+ arbiterIndex))
+ {
+ TupleTableSlot *returning = NULL;
+
+ /*
+ * Lock and consider updating in the SPEC_INSERT case. For the
+ * SPEC_IGNORE case, it's still often necessary to verify that
+ * the tuple is visible to the executor's MVCC snapshot.
+ */
+ if (spec == SPEC_INSERT && !ExecLockUpdateTuple(resultRelInfo,
+ &conflictTid,
+ planSlot,
+ slot,
+ onConflict,
+ estate,
+ canSetTag,
+ &returning))
+ goto vlock;
+ else if (spec == SPEC_IGNORE)
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &conflictTid);
+
+ /*
+ * RETURNING may have been processed already -- the target
+ * ResultRelInfo might have made representation within
+ * ExecUpdate() that this is required. Inserted and updated
+ * tuples are projected indifferently for ON CONFLICT UPDATE
+ * with RETURNING.
+ *
+ * Since there was no row conflict, we're done.
+ */
+ return returning;
+ }
+
+ /*
+ * Before we start insertion proper, acquire our "promise tuple
+ * insertion lock". Others can use that (rather than an XID lock,
+ * which is appropriate only for non-promise tuples) to wait for us
+ * to decide if we're going to go ahead with the insertion.
+ */
+ SpeculativeInsertionLockAcquire(GetCurrentTransactionId());
+ }
+
+ /*
* insert the tuple
*
* Note: heap_insert returns the tid (location) of the new tuple in
* the t_self field.
*/
newId = heap_insert(resultRelationDesc, tuple,
- estate->es_output_cid, 0, NULL);
+ estate->es_output_cid,
+ spec != SPEC_NONE ? HEAP_INSERT_SPECULATIVE : 0,
+ NULL);
/*
* insert index entries for tuple
*/
if (resultRelInfo->ri_NumIndices > 0)
+ {
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, spec != SPEC_NONE,
+ arbiterIndex);
+
+ if (spec != SPEC_NONE)
+ {
+ HeapUpdateFailureData hufd;
+
+ /*
+ * Consider possible race: concurrent insertion conflicts with
+ * our speculative heap tuple. Must then "super-delete" the
+ * heap tuple and retry from the start.
+ *
+ * This is occasionally necessary so that "unprincipled
+ * deadlocks" are avoided; now that a conflict was found,
+ * other sessions should not wait on our speculative token, and
+ * they certainly shouldn't treat our speculatively-inserted
+ * heap tuple as an ordinary tuple that they must wait on the
+ * outcome of our xact to UPDATE/DELETE. This makes heap
+ * tuples behave as conceptual "value locks" of short duration,
+ * distinct from ordinary tuples that other xacts must wait on
+ * xmin-xact-end of in the event of a possible unique/exclusion
+ * violation (the violation that arbitrates taking the
+ * alternative UPDATE/IGNORE path).
+ */
+ if (recheckIndexes)
+ heap_delete(resultRelationDesc, &(tuple->t_self),
+ estate->es_output_cid, InvalidSnapshot, false,
+ &hufd, true);
+
+ Assert(hufd.cmax == estate->es_output_cid);
+
+ /*
+ * Release speculative insertion lock. If there was no insertion
+ * conflict, this effectively makes the promise tuple an ordinary
+ * tuple
+ */
+ SpeculativeInsertionLockRelease(GetCurrentTransactionId());
+ ClearSpeculativeInsertionState();
+
+ if (recheckIndexes)
+ {
+ list_free(recheckIndexes);
+ goto vlock;
+ }
+
+ /* since there was no insertion conflict, we're done */
+ }
+ }
}
if (canSetTag)
@@ -399,7 +572,8 @@ ldelete:;
estate->es_output_cid,
estate->es_crosscheck_snapshot,
true /* wait for commit */ ,
- &hufd);
+ &hufd,
+ false);
switch (result)
{
case HeapTupleSelfUpdated:
@@ -768,7 +942,7 @@ lreplace:;
*/
if (resultRelInfo->ri_NumIndices > 0 && !HeapTupleIsHeapOnly(tuple))
recheckIndexes = ExecInsertIndexTuples(slot, &(tuple->t_self),
- estate);
+ estate, false, InvalidOid);
}
if (canSetTag)
@@ -792,6 +966,236 @@ lreplace:;
return NULL;
}
+/* ----------------------------------------------------------------
+ * Try to lock tuple for update as part of speculative insertion. If
+ * a qual originating from ON CONFLICT UPDATE is satisfied, update
+ * (but still lock row, even though it may not satisfy estate's
+ * snapshot).
+ *
+ * Returns value indicating if we're done (with or without an
+ * update), or if the executor must start from scratch.
+ * ----------------------------------------------------------------
+ */
+static bool
+ExecLockUpdateTuple(ResultRelInfo *resultRelInfo,
+ ItemPointer conflictTid,
+ TupleTableSlot *planSlot,
+ TupleTableSlot *insertSlot,
+ ModifyTableState *onConflict,
+ EState *estate,
+ bool canSetTag,
+ TupleTableSlot **returning)
+{
+ Relation relation = resultRelInfo->ri_RelationDesc;
+ HeapTupleData tuple;
+ HeapTuple copyTuple = NULL;
+ HeapUpdateFailureData hufd;
+ HTSU_Result test;
+ Buffer buffer;
+ TupleTableSlot *slot;
+ ExprContext *econtext;
+
+ /*
+ * Lock tuple for update.
+ *
+ * Like EvalPlanQualFetch(), don't follow updates. There is no actual
+ * benefit to doing so, since as discussed below, a conflict invalidates
+ * our previous conclusion that the tuple is the conclusively committed
+ * conflicting tuple.
+ */
+ tuple.t_self = *conflictTid;
+ test = heap_lock_tuple(relation, &tuple, estate->es_output_cid,
+ LockTupleExclusive, LockWaitBlock, false, &buffer,
+ &hufd);
+
+ if (test == HeapTupleMayBeUpdated)
+ copyTuple = heap_copytuple(&tuple);
+
+ switch (test)
+ {
+ case HeapTupleInvisible:
+ /*
+ * This may occur when an instantaneously invisible tuple is blamed
+ * as a conflict because multiple rows are inserted with the same
+ * constrained values.
+ *
+ * We cannot proceed, because to do so would leave users open to
+ * the risk that the same row will be updated a second time in the
+ * same command; allowing a second update of a single row within
+ * the same command would leave the update order undefined. It is
+ * the user's responsibility to resolve these self-duplicates
+ * before proposing a set of tuples for insertion, so raise an
+ * error with a hint. These problems are why SQL-2003
+ * similarly specifies that for SQL MERGE, an exception must be
+ * raised in the event of an attempt to update the same row twice.
+ *
+ * XXX It might be preferable to do something similar when a row is
+ * locked twice (and not updated twice) by the same speculative
+ * insertion, as if to take each lock acquisition as an indication
+ * of a discrete, unfulfilled intent to update (perhaps in some
+ * later command of the same xact). This does not seem feasible,
+ * though.
+ */
+ if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple.t_data)))
+ ereport(ERROR,
+ (errcode(ERRCODE_CARDINALITY_VIOLATION),
+ errmsg("ON CONFLICT UPDATE command could not lock/update self-inserted tuple"),
+ errhint("Ensure that no rows proposed for insertion within the same command have duplicate constrained values.")));
+
+ /* This shouldn't happen */
+ elog(ERROR, "attempted to lock invisible tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleSelfUpdated:
+ /*
+ * XXX In practice this is dead code, since BEFORE triggers fire
+ * prior to speculative insertion. Since a dirty snapshot is used
+ * to find possible conflict tuples, speculative insertion could
+ * not have seen the old/MVCC-current row version at all (even if
+ * it was only rendered old by this same command).
+ */
+ elog(ERROR, "unexpected self-updated tuple");
+ return false; /* keep compiler quiet */
+ case HeapTupleMayBeUpdated:
+ /*
+ * Success -- we're done, as tuple is locked. Verify that the
+ * tuple is known to be visible to our snapshot under conventional
+ * MVCC rules if the current isolation level mandates that. In
+ * READ COMMITTED mode, we can lock and update a tuple still in
+ * progress according to our snapshot, but higher isolation levels
+ * cannot avail of that, and must actively defend against doing so.
+ * We might get a serialization failure within ExecUpdate() anyway
+ * if this step was skipped, but this cannot be relied on, for
+ * example because the auxiliary WHERE clause happened to not be
+ * satisfied.
+ */
+ ExecCheckHeapTupleVisible(estate, resultRelInfo, &tuple.t_data->t_ctid);
+
+ /*
+ * This loosening of snapshot isolation for the benefit of READ
+ * COMMITTED speculative insertions is used consistently:
+ * speculative quals are only tested against already locked tuples.
+ * It would be rather inconsistent to UPDATE when no tuple version
+ * is MVCC-visible (which seems inevitable since we must *do
+ * something* there, and "READ COMMITTED serialization failures"
+ * are unappealing), while also avoiding updating here entirely on
+ * the basis of a non-conclusive tuple version (the version that
+ * happens to be visible to this command's MVCC snapshot, or a
+ * subsequent non-conclusive version).
+ *
+ * In other words: Only the final, conclusively locked tuple
+ * (which must have the same value in the relevant constrained
+ * attribute(s) as the value previously "value locked") matters.
+ */
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStartNode(onConflict->ps.instrument);
+
+ /*
+ * Conceptually, the parent ModifyTable is like a relation scan
+ * node that uses a dirty snapshot, returning rows which the
+ * auxiliary plan must operate on (if only to lock all such rows).
+ * EvalPlanQual() is involved in the evaluation of their UPDATE,
+ * regardless of whether or not the tuple is visible to the
+ * command's MVCC Snapshot.
+ */
+ EvalPlanQualBegin(&onConflict->mt_epqstate, onConflict->ps.state);
+
+ /*
+ * Save EPQ expression context. Auxiliary plan's scan node (which
+ * would have been just initialized by EvalPlanQualBegin() on the
+ * first time through here per query) cannot fail to provide one.
+ */
+ econtext = onConflict->mt_epqstate.planstate->ps_ExprContext;
+
+ /*
+ * UPDATE affects the same ResultRelation as INSERT in the context
+ * of ON CONFLICT UPDATE, so parent's target rti is used
+ */
+ EvalPlanQualSetTuple(&onConflict->mt_epqstate,
+ resultRelInfo->ri_RangeTableIndex, copyTuple);
+
+ /*
+ * Make available rejected tuple for referencing within UPDATE
+ * expression (that is, make available a slot with the rejected
+ * tuple, possibly already modified by BEFORE INSERT row triggers).
+ *
+ * This is for the benefit of any ExcludedExpr that may appear
+ * within UPDATE's targetlist or WHERE clause. The EXCLUDED tuple
+ * may be referenced as an ExcludedExpr, which exist purely for our
+ * benefit. The nested ExcludedExpr's Var will necessarily have an
+ * INNER_VAR varno on the assumption that the inner slot of the EPQ
+ * scan plan state's expression context will contain the EXCLUDED
+ * heaptuple slot (that is, on the assumption that during
+ * expression evaluation, the ecxt_innertuple will be assigned the
+ * insertSlot by this codepath, in advance of expression
+ * evaluation).
+ *
+ * See handling of ExcludedExpr within rewriteHandler.c and
+ * execQual.c.
+ */
+ econtext->ecxt_innertuple = insertSlot;
+
+ slot = EvalPlanQualNext(&onConflict->mt_epqstate);
+
+ if (!TupIsNull(slot))
+ *returning = ExecUpdate(&tuple.t_data->t_ctid, NULL, slot,
+ planSlot, &onConflict->mt_epqstate,
+ onConflict->ps.state, canSetTag);
+
+ ReleaseBuffer(buffer);
+
+ /*
+ * As when executing an UPDATE's ModifyTable node in the
+ * conventional manner, reset the per-output-tuple ExprContext
+ */
+ ResetPerTupleExprContext(onConflict->ps.state);
+
+ /* must provide our own instrumentation support */
+ if (onConflict->ps.instrument)
+ InstrStopNode(onConflict->ps.instrument, *returning ? 1 : 0);
+
+ return true;
+ case HeapTupleUpdated:
+ if (IsolationUsesXactSnapshot())
+ ereport(ERROR,
+ (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+ errmsg("could not serialize access due to concurrent update")));
+
+ /*
+ * Tell caller to try again from the very start. We don't use the
+ * usual EvalPlanQual() looping pattern here, fundamentally because
+ * we don't have a useful qual to verify the next tuple with. Our
+ * "qual" is really any user-supplied qual AND the unique
+ * constraint "col OP value" implied by a speculative insertion
+ * conflict. However, because of the selective evaluation of the
+ * former "qual" (the interactions with MVCC and row locking), this
+ * is an over-simplification.
+ *
+ * We might devise a means of verifying, by way of binary equality
+ * in a similar manner to HOT codepaths, if any unique indexed
+ * columns changed, but this would only serve to ameliorate the
+ * fundamental problem. It might well not be good enough, because
+ * those columns could change too. It seems unlikely that working
+ * harder here is worthwhile.
+ *
+ * At this point, all bets are off -- it might actually turn out to
+ * be okay to proceed with insertion instead of locking now (the
+ * tuple we attempted to lock could have been deleted, for
+ * example). On the other hand, it might not be okay, but for an
+ * entirely different reason, with an entirely separate TID to
+ * blame and lock. This TID may not even be part of the same
+ * update chain.
+ */
+ ReleaseBuffer(buffer);
+ return false;
+ default:
+ elog(ERROR, "unrecognized heap_lock_tuple status: %u", test);
+ }
+
+ return false;
+}
+
/*
* Process BEFORE EACH STATEMENT triggers
@@ -803,6 +1207,9 @@ fireBSTriggers(ModifyTableState *node)
{
case CMD_INSERT:
ExecBSInsertTriggers(node->ps.state, node->resultRelInfo);
+ if (node->spec == SPEC_INSERT)
+ ExecBSUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
break;
case CMD_UPDATE:
ExecBSUpdateTriggers(node->ps.state, node->resultRelInfo);
@@ -825,6 +1232,9 @@ fireASTriggers(ModifyTableState *node)
switch (node->operation)
{
case CMD_INSERT:
+ if (node->spec == SPEC_INSERT)
+ ExecASUpdateTriggers(node->onConflict->state,
+ node->resultRelInfo);
ExecASInsertTriggers(node->ps.state, node->resultRelInfo);
break;
case CMD_UPDATE:
@@ -852,6 +1262,8 @@ ExecModifyTable(ModifyTableState *node)
{
EState *estate = node->ps.state;
CmdType operation = node->operation;
+ ModifyTableState *onConflict = (ModifyTableState *) node->onConflict;
+ SpecCmd spec = node->spec;
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
PlanState *subplanstate;
@@ -1022,7 +1434,9 @@ ExecModifyTable(ModifyTableState *node)
switch (operation)
{
case CMD_INSERT:
- slot = ExecInsert(slot, planSlot, estate, node->canSetTag);
+ slot = ExecInsert(slot, planSlot, onConflict,
+ node->arbiterIndex, spec, estate,
+ node->canSetTag);
break;
case CMD_UPDATE:
slot = ExecUpdate(tupleid, oldtuple, slot, planSlot,
@@ -1070,6 +1484,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ModifyTableState *mtstate;
CmdType operation = node->operation;
+ Plan *onConflictPlan = node->onConflictPlan;
int nplans = list_length(node->plans);
ResultRelInfo *saved_resultRelInfo;
ResultRelInfo *resultRelInfo;
@@ -1097,6 +1512,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->resultRelInfo = estate->es_result_relations + node->resultRelIndex;
mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
mtstate->mt_nplans = nplans;
+ mtstate->spec = node->spec;
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -1135,7 +1551,15 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
if (resultRelInfo->ri_RelationDesc->rd_rel->relhasindex &&
operation != CMD_DELETE &&
resultRelInfo->ri_IndexRelationDescs == NULL)
- ExecOpenIndices(resultRelInfo);
+ ExecOpenIndices(resultRelInfo, mtstate->spec != SPEC_NONE);
+
+ /*
+ * ON CONFLICT UPDATE variant must have unique index to arbitrate on
+ * taking alternative path
+ */
+ Assert(node->spec != SPEC_INSERT || node->arbiterIndex != InvalidOid);
+
+ mtstate->arbiterIndex = node->arbiterIndex;
/* Now init the plan for this result rel */
estate->es_result_relation_info = resultRelInfo;
@@ -1308,7 +1732,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
break;
case CMD_UPDATE:
case CMD_DELETE:
- junk_filter_needed = true;
+ junk_filter_needed = (node->spec == SPEC_NONE);
break;
default:
elog(ERROR, "unknown operation");
@@ -1372,6 +1796,19 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
}
}
+ /* Initialize auxiliary ModifyTable node, for ON CONFLICT UPDATE */
+ if (onConflictPlan)
+ {
+ Assert(mtstate->spec == SPEC_INSERT);
+
+ /*
+ * ExecModifyTable() is never called for auxiliary update
+ * ModifyTableState. Execution of the auxiliary plan is driven by its
+ * parent in an ad-hoc fashion.
+ */
+ mtstate->onConflict = ExecInitNode(onConflictPlan, estate, eflags);
+ }
+
/*
* Set up a tuple table slot for use for trigger output tuples. In a plan
* containing multiple ModifyTable nodes, all can share one such slot, so
@@ -1387,9 +1824,14 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* ModifyTable node too, but there's no need.) Note the use of lcons not
* lappend: we need later-initialized ModifyTable nodes to be shut down
* before earlier ones. This ensures that we don't throw away RETURNING
- * rows that need to be seen by a later CTE subplan.
+ * rows that need to be seen by a later CTE subplan. Do not append an
+ * auxiliary ON CONFLICT UPDATE node either, since it must have a parent
+ * SPEC_INSERT ModifyTable node that it is auxiliary to that directly
+ * drives execution of what is logically a single unified statement (*that*
+ * plan will be appended here, though). If it must project updated rows,
+ * that will only ever be done through the parent.
*/
- if (!mtstate->canSetTag)
+ if (!mtstate->canSetTag && mtstate->spec != SPEC_UPDATE)
estate->es_auxmodifytables = lcons(mtstate,
estate->es_auxmodifytables);
@@ -1442,6 +1884,8 @@ ExecEndModifyTable(ModifyTableState *node)
*/
for (i = 0; i < node->mt_nplans; i++)
ExecEndNode(node->mt_plans[i]);
+
+ ExecEndNode(node->onConflict);
}
void
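Reviewer aid (not part of the patch): the "vlock" retry loop that the nodeModifyTable.c changes add to ExecInsert() can be pictured with a toy, single-process model. This is a sketch under loud assumptions. ToyTable, race_pending and toy_speculative_insert are hypothetical names, the table is a plain array, and the concurrent session is simulated by a flag that makes a competing tuple with the same key appear between our pre-check and our insertion. Locking, tokens and the UPDATE path itself are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CAPACITY 16

typedef enum
{
	TOY_INSERTED,				/* insertion proper went ahead */
	TOY_TOOK_ALTERNATIVE_PATH	/* conflict found: IGNORE/UPDATE path */
} ToyOutcome;

typedef struct
{
	int			keys[TOY_CAPACITY];
	size_t		nkeys;
	bool		race_pending;	/* simulate one concurrent same-key insert */
	int			retries;		/* how often we restarted from the top */
} ToyTable;

/* Phase 1: non-conclusive pre-check for an existing conflicting key. */
static bool
toy_precheck_conflict(const ToyTable *t, int key)
{
	for (size_t i = 0; i < t->nkeys; i++)
		if (t->keys[i] == key)
			return true;
	return false;
}

ToyOutcome
toy_speculative_insert(ToyTable *t, int key)
{
	for (;;)					/* plays the role of "goto vlock" */
	{
		if (toy_precheck_conflict(t, key))
			return TOY_TOOK_ALTERNATIVE_PATH;

		/* Phase 2: insert our tuple speculatively. */
		assert(t->nkeys < TOY_CAPACITY);
		t->keys[t->nkeys++] = key;

		/* Re-check: did somebody else insert the same key meanwhile? */
		if (t->race_pending)
		{
			t->race_pending = false;
			t->nkeys--;			/* "super-delete" our own tuple */
			t->keys[t->nkeys++] = key;	/* competitor's tuple remains */
			t->retries++;
			continue;			/* start again from the pre-check */
		}

		return TOY_INSERTED;
	}
}
```

In this model a lost race costs exactly one retry: the second pass through the pre-check sees the competitor's tuple and takes the alternative path instead of inserting.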
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 6d7a877..ed86b8f 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -179,6 +179,9 @@ _copyModifyTable(const ModifyTable *from)
COPY_NODE_FIELD(resultRelations);
COPY_SCALAR_FIELD(resultRelIndex);
COPY_NODE_FIELD(plans);
+ COPY_SCALAR_FIELD(spec);
+ COPY_SCALAR_FIELD(arbiterIndex);
+ COPY_NODE_FIELD(onConflictPlan);
COPY_NODE_FIELD(withCheckOptionLists);
COPY_NODE_FIELD(returningLists);
COPY_NODE_FIELD(fdwPrivLists);
@@ -1777,6 +1780,19 @@ _copyCurrentOfExpr(const CurrentOfExpr *from)
}
/*
+ * _copyExcludedExpr
+ */
+static ExcludedExpr *
+_copyExcludedExpr(const ExcludedExpr *from)
+{
+ ExcludedExpr *newnode = makeNode(ExcludedExpr);
+
+ COPY_NODE_FIELD(arg);
+
+ return newnode;
+}
+
+/*
* _copyTargetEntry
*/
static TargetEntry *
@@ -2121,6 +2137,31 @@ _copyWithClause(const WithClause *from)
return newnode;
}
+static InferClause *
+_copyInferClause(const InferClause *from)
+{
+ InferClause *newnode = makeNode(InferClause);
+
+ COPY_NODE_FIELD(indexElems);
+ COPY_NODE_FIELD(whereClause);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
+static ConflictClause *
+_copyConflictClause(const ConflictClause *from)
+{
+ ConflictClause *newnode = makeNode(ConflictClause);
+
+ COPY_SCALAR_FIELD(specclause);
+ COPY_NODE_FIELD(infer);
+ COPY_NODE_FIELD(updatequery);
+ COPY_LOCATION_FIELD(location);
+
+ return newnode;
+}
+
static CommonTableExpr *
_copyCommonTableExpr(const CommonTableExpr *from)
{
@@ -2526,6 +2567,10 @@ _copyQuery(const Query *from)
COPY_NODE_FIELD(jointree);
COPY_NODE_FIELD(targetList);
COPY_NODE_FIELD(withCheckOptions);
+ COPY_SCALAR_FIELD(specClause);
+ COPY_NODE_FIELD(arbiterExpr);
+ COPY_NODE_FIELD(arbiterWhere);
+ COPY_NODE_FIELD(onConflict);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(groupClause);
COPY_NODE_FIELD(havingQual);
@@ -2549,6 +2594,7 @@ _copyInsertStmt(const InsertStmt *from)
COPY_NODE_FIELD(relation);
COPY_NODE_FIELD(cols);
COPY_NODE_FIELD(selectStmt);
+ COPY_NODE_FIELD(confClause);
COPY_NODE_FIELD(returningList);
COPY_NODE_FIELD(withClause);
@@ -4255,6 +4301,9 @@ copyObject(const void *from)
case T_CurrentOfExpr:
retval = _copyCurrentOfExpr(from);
break;
+ case T_ExcludedExpr:
+ retval = _copyExcludedExpr(from);
+ break;
case T_TargetEntry:
retval = _copyTargetEntry(from);
break;
@@ -4722,6 +4771,12 @@ copyObject(const void *from)
case T_WithClause:
retval = _copyWithClause(from);
break;
+ case T_InferClause:
+ retval = _copyInferClause(from);
+ break;
+ case T_ConflictClause:
+ retval = _copyConflictClause(from);
+ break;
case T_CommonTableExpr:
retval = _copyCommonTableExpr(from);
break;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 79035b2..24e58fa 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -681,6 +681,14 @@ _equalCurrentOfExpr(const CurrentOfExpr *a, const CurrentOfExpr *b)
}
static bool
+_equalExcludedExpr(const ExcludedExpr *a, const ExcludedExpr *b)
+{
+ COMPARE_NODE_FIELD(arg);
+
+ return true;
+}
+
+static bool
_equalTargetEntry(const TargetEntry *a, const TargetEntry *b)
{
COMPARE_NODE_FIELD(expr);
@@ -863,6 +871,10 @@ _equalQuery(const Query *a, const Query *b)
COMPARE_NODE_FIELD(jointree);
COMPARE_NODE_FIELD(targetList);
COMPARE_NODE_FIELD(withCheckOptions);
+ COMPARE_SCALAR_FIELD(specClause);
+ COMPARE_NODE_FIELD(arbiterExpr);
+ COMPARE_NODE_FIELD(arbiterWhere);
+ COMPARE_NODE_FIELD(onConflict);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(groupClause);
COMPARE_NODE_FIELD(havingQual);
@@ -884,6 +896,7 @@ _equalInsertStmt(const InsertStmt *a, const InsertStmt *b)
COMPARE_NODE_FIELD(relation);
COMPARE_NODE_FIELD(cols);
COMPARE_NODE_FIELD(selectStmt);
+ COMPARE_NODE_FIELD(confClause);
COMPARE_NODE_FIELD(returningList);
COMPARE_NODE_FIELD(withClause);
@@ -2426,6 +2439,27 @@ _equalWithClause(const WithClause *a, const WithClause *b)
}
static bool
+_equalInferClause(const InferClause *a, const InferClause *b)
+{
+ COMPARE_NODE_FIELD(indexElems);
+ COMPARE_NODE_FIELD(whereClause);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
+_equalConflictClause(const ConflictClause *a, const ConflictClause *b)
+{
+ COMPARE_SCALAR_FIELD(specclause);
+ COMPARE_NODE_FIELD(infer);
+ COMPARE_NODE_FIELD(updatequery);
+ COMPARE_LOCATION_FIELD(location);
+
+ return true;
+}
+
+static bool
_equalCommonTableExpr(const CommonTableExpr *a, const CommonTableExpr *b)
{
COMPARE_STRING_FIELD(ctename);
@@ -2694,6 +2728,9 @@ equal(const void *a, const void *b)
case T_CurrentOfExpr:
retval = _equalCurrentOfExpr(a, b);
break;
+ case T_ExcludedExpr:
+ retval = _equalExcludedExpr(a, b);
+ break;
case T_TargetEntry:
retval = _equalTargetEntry(a, b);
break;
@@ -3148,6 +3185,12 @@ equal(const void *a, const void *b)
case T_WithClause:
retval = _equalWithClause(a, b);
break;
+ case T_InferClause:
+ retval = _equalInferClause(a, b);
+ break;
+ case T_ConflictClause:
+ retval = _equalConflictClause(a, b);
+ break;
case T_CommonTableExpr:
retval = _equalCommonTableExpr(a, b);
break;
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 21dfda7..a9e1e13 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -235,6 +235,13 @@ exprType(const Node *expr)
case T_CurrentOfExpr:
type = BOOLOID;
break;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ type = exprType((Node *) n->arg);
+ }
+ break;
case T_PlaceHolderVar:
type = exprType((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -469,6 +476,12 @@ exprTypmod(const Node *expr)
return ((const CoerceToDomainValue *) expr)->typeMod;
case T_SetToDefault:
return ((const SetToDefault *) expr)->typeMod;
+ case T_ExcludedExpr:
+ {
+ const ExcludedExpr *n = (const ExcludedExpr *) expr;
+
+ return ((const Var *) n->arg)->vartypmod;
+ }
case T_PlaceHolderVar:
return exprTypmod((Node *) ((const PlaceHolderVar *) expr)->phexpr);
default:
@@ -894,6 +907,9 @@ exprCollation(const Node *expr)
case T_CurrentOfExpr:
coll = InvalidOid; /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ coll = exprCollation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
case T_PlaceHolderVar:
coll = exprCollation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
@@ -1089,6 +1105,12 @@ exprSetCollation(Node *expr, Oid collation)
case T_CurrentOfExpr:
Assert(!OidIsValid(collation)); /* result is always boolean */
break;
+ case T_ExcludedExpr:
+ {
+ Var *v = (Var *) ((ExcludedExpr *) expr)->arg;
+ v->varcollid = collation;
+ }
+ break;
default:
elog(ERROR, "unrecognized node type: %d", (int) nodeTag(expr));
break;
@@ -1474,6 +1496,12 @@ exprLocation(const Node *expr)
case T_WithClause:
loc = ((const WithClause *) expr)->location;
break;
+ case T_InferClause:
+ loc = ((const InferClause *) expr)->location;
+ break;
+ case T_ConflictClause:
+ loc = ((const ConflictClause *) expr)->location;
+ break;
case T_CommonTableExpr:
loc = ((const CommonTableExpr *) expr)->location;
break;
@@ -1481,6 +1509,10 @@ exprLocation(const Node *expr)
/* just use argument's location */
loc = exprLocation((Node *) ((const PlaceHolderVar *) expr)->phexpr);
break;
+ case T_ExcludedExpr:
+ /* just use nested expr's location */
+ loc = exprLocation((Node *) ((const ExcludedExpr *) expr)->arg);
+ break;
default:
/* for any other node type it's just unknown... */
loc = -1;
@@ -1910,6 +1942,8 @@ expression_tree_walker(Node *node,
break;
case T_PlaceHolderVar:
return walker(((PlaceHolderVar *) node)->phexpr, context);
+ case T_ExcludedExpr:
+ return walker(((ExcludedExpr *) node)->arg, context);
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -1958,6 +1992,12 @@ query_tree_walker(Query *query,
return true;
if (walker((Node *) query->withCheckOptions, context))
return true;
+ if (walker((Node *) query->arbiterExpr, context))
+ return true;
+ if (walker(query->arbiterWhere, context))
+ return true;
+ if (walker(query->onConflict, context))
+ return true;
if (walker((Node *) query->returningList, context))
return true;
if (walker((Node *) query->jointree, context))
@@ -2620,6 +2660,16 @@ expression_tree_mutator(Node *node,
return (Node *) newnode;
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ ExcludedExpr *newnode;
+
+ FLATCOPY(newnode, excludedexpr, ExcludedExpr);
+ MUTATE(newnode->arg, newnode->arg, Node *);
+ return (Node *) newnode;
+ }
+ break;
case T_AppendRelInfo:
{
AppendRelInfo *appinfo = (AppendRelInfo *) node;
@@ -2699,6 +2749,9 @@ query_tree_mutator(Query *query,
MUTATE(query->targetList, query->targetList, List *);
MUTATE(query->withCheckOptions, query->withCheckOptions, List *);
+ MUTATE(query->arbiterExpr, query->arbiterExpr, List *);
+ MUTATE(query->arbiterWhere, query->arbiterWhere, Node *);
+ MUTATE(query->onConflict, query->onConflict, Node *);
MUTATE(query->returningList, query->returningList, List *);
MUTATE(query->jointree, query->jointree, FromExpr *);
MUTATE(query->setOperations, query->setOperations, Node *);
@@ -2968,6 +3021,8 @@ raw_expression_tree_walker(Node *node,
return true;
if (walker(stmt->selectStmt, context))
return true;
+ if (walker(stmt->confClause, context))
+ return true;
if (walker(stmt->returningList, context))
return true;
if (walker(stmt->withClause, context))
@@ -3207,6 +3262,27 @@ raw_expression_tree_walker(Node *node,
break;
case T_WithClause:
return walker(((WithClause *) node)->ctes, context);
+
+ case T_InferClause:
+ {
+ InferClause *stmt = (InferClause *) node;
+
+ if (walker(stmt->indexElems, context))
+ return true;
+ if (walker(stmt->whereClause, context))
+ return true;
+ }
+ break;
+ case T_ConflictClause:
+ {
+ ConflictClause *stmt = (ConflictClause *) node;
+
+ if (walker(stmt->infer, context))
+ return true;
+ if (walker(stmt->updatequery, context))
+ return true;
+ }
+ break;
case T_CommonTableExpr:
return walker(((CommonTableExpr *) node)->ctequery, context);
default:
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index a02ba70..a4ddb9c 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -331,6 +331,9 @@ _outModifyTable(StringInfo str, const ModifyTable *node)
WRITE_NODE_FIELD(resultRelations);
WRITE_INT_FIELD(resultRelIndex);
WRITE_NODE_FIELD(plans);
+ WRITE_ENUM_FIELD(spec, SpecCmd);
+ WRITE_OID_FIELD(arbiterIndex);
+ WRITE_NODE_FIELD(onConflictPlan);
WRITE_NODE_FIELD(withCheckOptionLists);
WRITE_NODE_FIELD(returningLists);
WRITE_NODE_FIELD(fdwPrivLists);
@@ -1427,6 +1430,14 @@ _outCurrentOfExpr(StringInfo str, const CurrentOfExpr *node)
}
static void
+_outExcludedExpr(StringInfo str, const ExcludedExpr *node)
+{
+ WRITE_NODE_TYPE("EXCLUDED");
+
+ WRITE_NODE_FIELD(arg);
+}
+
+static void
_outTargetEntry(StringInfo str, const TargetEntry *node)
{
WRITE_NODE_TYPE("TARGETENTRY");
@@ -2302,6 +2313,10 @@ _outQuery(StringInfo str, const Query *node)
WRITE_NODE_FIELD(jointree);
WRITE_NODE_FIELD(targetList);
WRITE_NODE_FIELD(withCheckOptions);
+ WRITE_ENUM_FIELD(specClause, SpecCmd);
+ WRITE_NODE_FIELD(arbiterExpr);
+ WRITE_NODE_FIELD(arbiterWhere);
+ WRITE_NODE_FIELD(onConflict);
WRITE_NODE_FIELD(returningList);
WRITE_NODE_FIELD(groupClause);
WRITE_NODE_FIELD(havingQual);
@@ -3063,6 +3078,9 @@ _outNode(StringInfo str, const void *obj)
case T_CurrentOfExpr:
_outCurrentOfExpr(str, obj);
break;
+ case T_ExcludedExpr:
+ _outExcludedExpr(str, obj);
+ break;
case T_TargetEntry:
_outTargetEntry(str, obj);
break;
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index dbc162a..48a7206 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -214,6 +214,10 @@ _readQuery(void)
READ_NODE_FIELD(jointree);
READ_NODE_FIELD(targetList);
READ_NODE_FIELD(withCheckOptions);
+ READ_ENUM_FIELD(specClause, SpecCmd);
+ READ_NODE_FIELD(arbiterExpr);
+ READ_NODE_FIELD(arbiterWhere);
+ READ_NODE_FIELD(onConflict);
READ_NODE_FIELD(returningList);
READ_NODE_FIELD(groupClause);
READ_NODE_FIELD(havingQual);
@@ -1128,6 +1132,19 @@ _readCurrentOfExpr(void)
}
/*
+ * _readExcludedExpr
+ */
+static ExcludedExpr *
+_readExcludedExpr(void)
+{
+ READ_LOCALS(ExcludedExpr);
+
+ READ_NODE_FIELD(arg);
+
+ READ_DONE();
+}
+
+/*
* _readTargetEntry
*/
static TargetEntry *
@@ -1392,6 +1409,8 @@ parseNodeString(void)
return_value = _readSetToDefault();
else if (MATCH("CURRENTOFEXPR", 13))
return_value = _readCurrentOfExpr();
+ else if (MATCH("EXCLUDED", 8))
+ return_value = _readExcludedExpr();
else if (MATCH("TARGETENTRY", 11))
return_value = _readTargetEntry();
else if (MATCH("RANGETBLREF", 11))
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index b86a3cd..6f75759 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -4013,3 +4013,60 @@ string_to_const(const char *str, Oid datatype)
return makeConst(datatype, -1, collation, constlen,
conval, false, false);
}
+
+/*
+ * plan_speculative_use_index
+ * Use the planner to decide speculative insertion arbiter index
+ *
+ * Among indexes on target of INSERT ... ON CONFLICT UPDATE/IGNORE, decide
+ * which index to use to arbitrate taking alternative path. This should be
+ * called infrequently in practice, because it's unusual for more than one index
+ * to be available that can satisfy a user-specified unique index inference
+ * specification.
+ *
+ * Note: caller had better already hold some type of lock on the table.
+ */
+Oid
+plan_speculative_use_index(PlannerInfo *root, List *indexList)
+{
+ IndexOptInfo *indexInfo;
+ RelOptInfo *rel;
+ IndexPath *cheapest;
+ IndexPath *indexScanPath;
+ ListCell *lc;
+
+ /* Set up RTE/RelOptInfo arrays if needed */
+ if (!root->simple_rel_array)
+ setup_simple_rel_arrays(root);
+
+ /* Build RelOptInfo */
+ rel = build_simple_rel(root, root->parse->resultRelation, RELOPT_BASEREL);
+
+ /* Locate cheapest IndexOptInfo for the target index */
+ cheapest = NULL;
+
+ foreach(lc, rel->indexlist)
+ {
+ indexInfo = (IndexOptInfo *) lfirst(lc);
+
+ if (!list_member_oid(indexList, indexInfo->indexoid))
+ continue;
+
+ /* Estimate the cost of index scan */
+ indexScanPath = create_index_path(root, indexInfo,
+ NIL, NIL, NIL, NIL, NIL,
+ ForwardScanDirection, false,
+ NULL, 1.0);
+
+ if (!cheapest || compare_fractional_path_costs(&cheapest->path,
+ &indexScanPath->path,
+ DEFAULT_RANGE_INEQ_SEL) > 0)
+ cheapest = indexScanPath;
+
+ }
+
+ if (cheapest)
+ return cheapest->indexinfo->indexoid;
+
+ return InvalidOid;
+}
diff --git a/src/backend/optimizer/path/tidpath.c b/src/backend/optimizer/path/tidpath.c
index 1258961..263ff5f 100644
--- a/src/backend/optimizer/path/tidpath.c
+++ b/src/backend/optimizer/path/tidpath.c
@@ -255,13 +255,17 @@ create_tidscan_paths(PlannerInfo *root, RelOptInfo *rel)
/*
* We don't support pushing join clauses into the quals of a tidscan, but
* it could still have required parameterization due to LATERAL refs in
- * its tlist.
+ * its tlist. To be tidy, we disallow TID scans as the unexecuted scan
+ * node of an ON CONFLICT UPDATE auxiliary query, even though there is no
+ * reason to think that would be harmful; the optimizer should always
+ * prefer a SeqScan or Result node (actually, we assert that it's one of
+ * those two in several places, so accepting TID scans would break those).
*/
required_outer = rel->lateral_relids;
tidquals = TidQualFromRestrictinfo(rel->baserestrictinfo, rel->relid);
- if (tidquals)
+ if (tidquals && root->parse->specClause != SPEC_UPDATE)
add_path(rel, (Path *) create_tidscan_path(root, rel, tidquals,
required_outer));
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 76ba1bf..449e54f 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -4812,7 +4812,8 @@ make_modifytable(PlannerInfo *root,
Index nominalRelation,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam)
+ List *rowMarks, Plan *onConflictPlan, SpecCmd spec,
+ int epqParam)
{
ModifyTable *node = makeNode(ModifyTable);
Plan *plan = &node->plan;
@@ -4862,6 +4863,9 @@ make_modifytable(PlannerInfo *root,
node->resultRelations = resultRelations;
node->resultRelIndex = -1; /* will be set correctly in setrefs.c */
node->plans = subplans;
+ node->spec = spec;
+ node->arbiterIndex = InvalidOid;
+ node->onConflictPlan = onConflictPlan;
node->withCheckOptionLists = withCheckOptionLists;
node->returningLists = returningLists;
node->rowMarks = rowMarks;
@@ -4914,6 +4918,16 @@ make_modifytable(PlannerInfo *root,
}
node->fdwPrivLists = fdw_private_list;
+ /*
+ * If a set of unique index inference expressions was provided (for
+ * INSERT...ON CONFLICT UPDATE/IGNORE), then infer appropriate
+ * unique index (or throw an error if none is available). It's
+ * possible that there will be a costing step in the event of
+ * having to choose between multiple alternatives.
+ */
+ if (root->parse->arbiterExpr)
+ node->arbiterIndex = infer_unique_index(root);
+
return node;
}
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5c4884f..358071b 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -613,7 +613,58 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
+
+ if (parse->onConflict)
+ {
+ Query *conflictQry = (Query *) parse->onConflict;
+ ModifyTable *parent = (ModifyTable *) plan;
+
+ /*
+ * An ON CONFLICT UPDATE query is a subquery of its parent
+ * INSERT ModifyTable, but isn't formally a subplan -- it's an
+ * "auxiliary" plan.
+ *
+ * During execution, the auxiliary plan state is used to
+ * execute the UPDATE query in an ad-hoc manner, driven by the
+ * parent. The executor will only ever execute the auxiliary
+ * plan through its parent. onConflictPlan is "auxiliary" to
+ * its parent in the sense that it's strictly encapsulated from
+ * other code (for example, the executor does not separately
+ * track it within estate as a plan that needs to have
+ * execution finished when it appears within a data-modifying
+ * CTE -- only the parent is specifically tracked for that
+ * purpose).
+ *
+ * There is a fundamental nexus between parent and auxiliary
+ * plans that makes a fully unified representation seem
+ * compelling (a "CMD_UPSERT" ModifyTable plan and Query).
+ * That would obviate the need to specially track auxiliary
+ * state across all stages of execution just for this case;
+ * the optimizer would then not have to generate a
+ * fully-formed, independent UPDATE subquery plan (with a
+ * scanstate only useful for EvalPlanQual() re-evaluation).
+ * However, it's convenient to plan each ModifyTable
+ * separately, as doing so maximizes code reuse. The
+ * alternative must be to introduce abstractions that (for
+ * example) allow a single "CMD_UPSERT" ModifyTable to have two
+ * distinct types of targetlist (that will need to be processed
+ * differently during parsing and rewriting anyway). The
+ * auxiliary UPDATE plan is a good trade-off between a
+ * fully-fledged "CMD_UPSERT" representation, and the opposite
+ * extreme of tracking two separate ModifyTable nodes, joined
+ * by a contrived join type, with (for example) odd properties
+ * around tuple visibility not well encapsulated. A contrived
+ * join based design would also necessitate teaching
+ * ModifyTable nodes to support rescan just for the benefit of
+ * ON CONFLICT UPDATE.
+ */
+ parent->onConflictPlan = subquery_planner(glob, conflictQry,
+ root, hasRecursion,
+ 0, NULL);
+ }
}
}
@@ -1073,6 +1124,8 @@ inheritance_planner(PlannerInfo *root)
withCheckOptionLists,
returningLists,
rowMarks,
+ NULL,
+ parse->specClause,
SS_assign_special_param(root));
}
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 860855e..fc3b76a 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -781,9 +781,35 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
* global list.
*/
splan->resultRelIndex = list_length(root->glob->resultRelations);
- root->glob->resultRelations =
- list_concat(root->glob->resultRelations,
- list_copy(splan->resultRelations));
+
+ if (!splan->onConflictPlan)
+ {
+ /*
+ * Only actually append result relation for non-auxiliary
+ * ModifyTable plans
+ */
+ root->glob->resultRelations =
+ list_concat(root->glob->resultRelations,
+ list_copy(splan->resultRelations));
+ }
+ else
+ {
+ /*
+ * Adjust rtoffset passed to child, to compensate for dummy
+ * RTE left by EXCLUDED.* alias in auxiliary plan. Plan
+ * will have same resultRelation from flattened range table
+ * as its parent.
+ */
+ splan->onConflictPlan =
+ set_plan_refs(root, splan->onConflictPlan,
+ rtoffset - PRS2_OLD_VARNO);
+
+ /*
+ * Set up the visible plan targetlist as being the same as
+ * the parent. Again, this is for the use of EXPLAIN only.
+ */
+ splan->onConflictPlan->targetlist = splan->plan.targetlist;
+ }
}
break;
case T_Append:
diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 78fb6b1..f7a0523 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -2345,6 +2345,12 @@ finalize_plan(PlannerInfo *root, Plan *plan, Bitmapset *valid_params,
valid_params,
scan_params));
}
+
+ /*
+ * No need to directly handle onConflictPlan here, since it
+ * cannot have params (due to parse analysis enforced
+ * restrictions prohibiting subqueries).
+ */
}
break;
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index fb7db6d..3086ca3 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -31,6 +31,7 @@
#include "nodes/makefuncs.h"
#include "optimizer/clauses.h"
#include "optimizer/cost.h"
+#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/predtest.h"
#include "optimizer/prep.h"
@@ -125,10 +126,12 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
/*
* Make list of indexes. Ignore indexes on system catalogs if told to.
- * Don't bother with indexes for an inheritance parent, either.
+ * Don't bother with indexes for an inheritance parent or speculative
+ * insertion UPDATE auxiliary queries, either.
*/
if (inhparent ||
- (IgnoreSystemIndexes && IsSystemRelation(relation)))
+ (IgnoreSystemIndexes && IsSystemRelation(relation)) ||
+ root->parse->specClause == SPEC_UPDATE)
hasindex = false;
else
hasindex = relation->rd_rel->relhasindex;
@@ -394,6 +397,221 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
}
/*
+ * infer_unique_index -
+ * Retrieves unique index to arbitrate speculative insertion.
+ *
+ * Uses user-supplied inference clause expressions and predicate to match a
+ * unique index from those defined and ready on the heap relation (target). An
+ * exact match is required on columns/expressions (although they can appear in
+ * any order). However, the predicate given by the user need only restrict
+ * insertion to a subset of some part of the table covered by some particular
+ * unique index (in particular, a partial unique index) in order to be
+ * inferred.
+ *
+ * The implementation does not consider which B-Tree operator class any
+ * particular available unique index uses. In particular, there is no system
+ * dependency on the default operator class for the purposes of inference.
+ * This should be okay, since by convention non-default opclasses only
+ * introduce alternative sort orders, not alternative notions of equality
+ * (there are only trivial known exceptions to this convention, where "equals"
+ * operator of a type's opclasses do not match across opclasses, exceptions
+ * that exist precisely to discourage user code from using the divergent
+ * opclass). Even if we assume that a type could usefully have multiple
+ * alternative concepts of equality, surely the definition actually implied by
+ * the operator class of actually indexed attributes is pertinent. However,
+ * this is a bit of a wart, because strictly speaking there is leeway for a
+ * query to be interpreted in deference to available unique indexes, and
+ * indexes are traditionally only an implementation detail. It hardly seems
+ * worth it to waste cycles on this corner case, though.
+ *
+ * This logic somewhat mirrors get_relation_info(). This process is not
+ * deferred to a get_relation_info() call while planning because there may not
+ * be any such call. In the ON CONFLICT UPDATE case get_relation_info() will
+ * be called, for auxiliary query planning, but even then indexes won't be
+ * examined since they're not generally interesting to that case (building
+ * index paths is explicitly avoided for auxiliary query planning, in fact).
+ */
+Oid
+infer_unique_index(PlannerInfo *root)
+{
+ Query *parse = root->parse;
+ Relation relation;
+ Oid relationObjectId;
+ Bitmapset *plainAttrs = NULL;
+ List *candidates = NIL;
+ ListCell *l;
+ List *indexList;
+
+ Assert(parse->specClause == SPEC_INSERT ||
+ parse->specClause == SPEC_IGNORE);
+
+ /*
+ * We need not lock the relation since it was already locked, either by
+ * the rewriter or when expand_inherited_rtentry() added it to the query's
+ * rangetable.
+ */
+ relationObjectId = rt_fetch(parse->resultRelation, parse->rtable)->relid;
+
+ relation = heap_open(relationObjectId, NoLock);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(l, parse->arbiterExpr)
+ {
+ Expr *elem;
+ Var *var;
+ int attno;
+
+ elem = (Expr *) lfirst(l);
+
+ /*
+ * Parse analysis of inference elements performs full parse analysis of
+ * Vars, even for non-expression indexes (in contrast with utility
+ * command related use of IndexElem). However, indexes are cataloged
+ * with simple attribute numbers for non-expression indexes.
+ * Therefore, we must build a compatible bms representation here.
+ */
+ if (!IsA(elem, Var))
+ continue;
+
+ var = (Var *) elem;
+ attno = var->varattno;
+
+ if (attno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("system columns may not appear in unique index inference specification")));
+ else if (attno == 0)
+ elog(ERROR, "whole row unique index inference specifications are not valid");
+
+ plainAttrs = bms_add_member(plainAttrs, attno);
+ }
+
+ indexList = RelationGetIndexList(relation);
+
+ /*
+ * Using that representation, iterate through the list of indexes on the
+ * target relation to try and find a match
+ */
+ foreach(l, indexList)
+ {
+ Oid indexoid = lfirst_oid(l);
+ Relation idxRel;
+ Form_pg_index idxForm;
+ Bitmapset *indexedPlainAttrs = NULL;
+ List *idxExprs;
+ List *predExprs;
+ List *whereExplicit;
+ AttrNumber natt;
+ ListCell *e;
+
+ /*
+ * Extract info from the relation descriptor for the index. We know
+ * that this is a target, so get lock type it is known will ultimately
+ * be required by the executor.
+ *
+ * Let executor complain about !indimmediate case directly.
+ */
+ idxRel = index_open(indexoid, RowExclusiveLock);
+ idxForm = idxRel->rd_index;
+
+ if (!idxForm->indisunique ||
+ !IndexIsValid(idxForm))
+ goto next;
+
+ /*
+ * If the index is valid, but cannot yet be used, ignore it. See
+ * src/backend/access/heap/README.HOT for discussion.
+ */
+ if (idxForm->indcheckxmin &&
+ !TransactionIdPrecedes(HeapTupleHeaderGetXmin(idxRel->rd_indextuple->t_data),
+ TransactionXmin))
+ goto next;
+
+ /* Check in detail if the clause attributes/expressions match */
+ for (natt = 0; natt < idxForm->indnatts; natt++)
+ {
+ int attno = idxRel->rd_index->indkey.values[natt];
+
+ if (attno < 0)
+ elog(ERROR, "system column in index");
+
+ if (attno != 0)
+ indexedPlainAttrs = bms_add_member(indexedPlainAttrs, attno);
+ }
+
+ /*
+ * Since expressions were made unique during parse analysis, it's
+ * evident that we cannot proceed with this index if the number of
+ * attributes (plain or expression) does not match exactly. This
+ * precludes support for unique indexes created with redundantly
+ * referenced columns (which are not forbidden by CREATE INDEX), but
+ * this seems inconsequential.
+ */
+ if (list_length(parse->arbiterExpr) != idxForm->indnatts)
+ goto next;
+
+ idxExprs = RelationGetIndexExpressions(idxRel);
+
+ /*
+ * Match expressions appearing in clause (if any) with index
+ * definition
+ */
+ foreach(e, parse->arbiterExpr)
+ {
+ Expr *elem = (Expr *) lfirst(e);
+
+ /* Plain Vars were already separately accounted for */
+ if (IsA(elem, Var))
+ continue;
+
+ if (!list_member(idxExprs, elem))
+ goto next;
+ }
+
+ /* Non-expression attributes (if any) must match */
+ if (!bms_equal(indexedPlainAttrs, plainAttrs))
+ goto next;
+
+ /*
+ * Any user-supplied ON CONFLICT unique index inference WHERE clause
+ * need only imply the cataloged index definition's predicate
+ */
+ predExprs = RelationGetIndexPredicate(idxRel);
+ whereExplicit = make_ands_implicit((Expr *) parse->arbiterWhere);
+
+ if (!predicate_implied_by(predExprs, whereExplicit))
+ goto next;
+
+ candidates = lappend_oid(candidates, idxForm->indexrelid);
+next:
+ index_close(idxRel, NoLock);
+ }
+
+ list_free(indexList);
+ heap_close(relation, NoLock);
+
+ /*
+ * In the common case where there is only a single candidate unique index,
+ * there is clearly no point in building index paths to determine which is
+ * cheapest.
+ */
+ if (list_length(candidates) == 1)
+ return linitial_oid(candidates);
+ else if (candidates == NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("could not infer which unique index to use from expressions/columns and predicate provided for ON CONFLICT")));
+ else
+ /* Otherwise, deduce the least expensive unique index */
+ return plan_speculative_use_index(root, candidates);
+
+ return InvalidOid; /* keep compiler quiet */
+}
+
+/*
* estimate_rel_size - estimate # pages and # tuples in a table or index
*
* We also estimate the fraction of the pages that are marked all-visible in
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index df89065..6c194f9 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -387,6 +387,8 @@ transformDeleteStmt(ParseState *pstate, DeleteStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -408,6 +410,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
{
Query *qry = makeNode(Query);
SelectStmt *selectStmt = (SelectStmt *) stmt->selectStmt;
+ SpecCmd spec = stmt->confClause ? stmt->confClause->specclause : SPEC_NONE;
List *exprList = NIL;
bool isGeneralSelect;
List *sub_rtable;
@@ -425,6 +428,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
qry->commandType = CMD_INSERT;
pstate->p_is_insert = true;
+ pstate->p_is_speculative = spec != SPEC_NONE;
/* process the WITH clause independently of all else */
if (stmt->withClause)
@@ -472,11 +476,16 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
sub_namespace = NIL;
}
+ /* INSERT with an ON CONFLICT clause forces the "target" alias */
+ if (pstate->p_is_speculative)
+ stmt->relation->alias = makeAlias("target", NIL);
+
/*
* Must get write lock on INSERT target table before scanning SELECT, else
* we will grab the wrong kind of initial lock if the target table is also
* mentioned in the SELECT part. Note that the target table is not added
- * to the joinlist or namespace.
+ * to the joinlist or namespace. Note also that additional requiredPerms
+ * may be added to the target RTE iff there is an auxiliary UPDATE.
*/
qry->resultRelation = setTargetTable(pstate, stmt->relation,
false, false, ACL_INSERT);
@@ -741,12 +750,13 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
}
/*
- * If we have a RETURNING clause, we need to add the target relation to
- * the query namespace before processing it, so that Var references in
- * RETURNING will work. Also, remove any namespace entries added in a
- * sub-SELECT or VALUES list.
+ * If we have a RETURNING clause, or there are attributes used as the
+ * condition on which to take an alternative ON CONFLICT path, we need to
+ * add the target relation to the query namespace before processing it, so
+ * that Var references in RETURNING/the alternative path key will work.
+ * Also, remove any namespace entries added in a sub-SELECT or VALUES list.
*/
- if (stmt->returningList)
+ if (stmt->returningList || stmt->confClause)
{
pstate->p_namespace = NIL;
addRTEtoQuery(pstate, pstate->p_target_rangetblentry,
@@ -758,8 +768,49 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
/* done building the range table and jointree */
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, NULL);
-
+ qry->specClause = spec;
qry->hasSubLinks = pstate->p_hasSubLinks;
+ qry->onConflict = NULL;
+
+ if (stmt->confClause)
+ {
+ /*
+ * ON CONFLICT UPDATE requires special parse analysis of auxiliary
+ * update Query
+ */
+ if (stmt->confClause->updatequery)
+ {
+ ParseState *sub_pstate = make_parsestate(pstate);
+ Query *uqry;
+
+ /*
+ * The optimizer is not prepared to accept a subquery RTE for a
+ * non-CMD_SELECT Query. The CMD_UPDATE Query is tracked as
+ * special auxiliary state, while there is more or less analogous
+ * auxiliary state tracked in later stages of query execution.
+ *
+ * Only the parent's canSetTag is ever actually consulted, so there is
+ * no need to set it here.
+ */
+ uqry = transformStmt(sub_pstate, stmt->confClause->updatequery);
+ Assert(uqry->commandType == CMD_UPDATE &&
+ uqry->specClause == SPEC_UPDATE);
+
+ /* Save auxiliary query */
+ qry->onConflict = (Node *) uqry;
+
+ free_parsestate(sub_pstate);
+ }
+
+ /*
+ * Transform the unique index inference specification. The planner later
+ * uses the result to infer the unique index that arbitrates whether to
+ * take the alternative ON CONFLICT path (i.e. whether to INSERT or
+ * UPDATE/IGNORE in respect of each slot proposed for insertion).
+ */
+ transformConflictClause(pstate, stmt->confClause, &qry->arbiterExpr,
+ &qry->arbiterWhere);
+ }
assign_query_collations(pstate, qry);
@@ -1006,6 +1057,8 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->specClause = SPEC_NONE;
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
qry->hasWindowFuncs = pstate->p_hasWindowFuncs;
@@ -1903,10 +1956,14 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
Node *qual;
ListCell *origTargetList;
ListCell *tl;
+ bool InhOption;
qry->commandType = CMD_UPDATE;
pstate->p_is_update = true;
+ /* for auxiliary UPDATEs, visit parent INSERT to set target table */
+ pstate->p_is_speculative = (stmt->relation == NULL);
+
/* process the WITH clause independently of all else */
if (stmt->withClause)
{
@@ -1915,8 +1972,20 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->hasModifyingCTE = pstate->p_hasModifyingCTE;
}
+ if (!pstate->p_is_speculative)
+ {
+ InhOption = interpretInhOption(stmt->relation->inhOpt);
+ qry->specClause = SPEC_NONE;
+ }
+ else
+ {
+ /* auxiliary UPDATE does not accept ONLY */
+ InhOption = false;
+ qry->specClause = SPEC_UPDATE;
+ }
+
qry->resultRelation = setTargetTable(pstate, stmt->relation,
- interpretInhOption(stmt->relation->inhOpt),
+ InhOption,
true,
ACL_UPDATE);
@@ -1947,6 +2016,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
qry->rtable = pstate->p_rtable;
qry->jointree = makeFromExpr(pstate->p_joinlist, qual);
+ qry->onConflict = NULL;
qry->hasSubLinks = pstate->p_hasSubLinks;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 36dac29..f987432 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -215,6 +215,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
RangeVar *range;
IntoClause *into;
WithClause *with;
+ InferClause *infer;
+ ConflictClause *conf;
A_Indices *aind;
ResTarget *target;
struct PrivTarget *privtarget;
@@ -415,6 +417,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <defelt> SeqOptElem
%type <istmt> insert_rest
+%type <infer> opt_conf_expr
+%type <conf> opt_on_conflict
%type <vsetstmt> generic_set set_rest set_rest_more generic_reset reset_rest
SetResetClause FunctionSetResetClause
@@ -513,6 +517,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%type <list> cte_list
%type <list> within_group_clause
+%type <node> UpdateInsertStmt
%type <node> filter_clause
%type <list> window_clause window_definition_list opt_partition_clause
%type <windef> window_definition over_clause window_specification
@@ -551,8 +556,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
CACHE CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
CLUSTER COALESCE COLLATE COLLATION COLUMN COMMENT COMMENTS COMMIT
- COMMITTED CONCURRENTLY CONFIGURATION CONNECTION CONSTRAINT CONSTRAINTS
- CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
+ COMMITTED CONCURRENTLY CONFIGURATION CONFLICT CONNECTION CONSTRAINT
+ CONSTRAINTS CONTENT_P CONTINUE_P CONVERSION_P COPY COST CREATE
CROSS CSV CURRENT_P
CURRENT_CATALOG CURRENT_DATE CURRENT_ROLE CURRENT_SCHEMA
CURRENT_TIME CURRENT_TIMESTAMP CURRENT_USER CURSOR CYCLE
@@ -572,7 +577,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
HANDLER HAVING HEADER_P HOLD HOUR_P
- IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
+ IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P
INCLUDING INCREMENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -652,6 +657,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
+%nonassoc DISTINCT
+%nonassoc ON
%left POSTFIXOP /* dummy for postfix Op rules */
/*
* To support target_el without AS, we must give IDENT an explicit priority
@@ -9399,10 +9406,12 @@ DeallocateStmt: DEALLOCATE name
*****************************************************************************/
InsertStmt:
- opt_with_clause INSERT INTO qualified_name insert_rest returning_clause
+ opt_with_clause INSERT INTO qualified_name insert_rest
+ opt_on_conflict returning_clause
{
$5->relation = $4;
- $5->returningList = $6;
+ $5->confClause = $6;
+ $5->returningList = $7;
$5->withClause = $1;
$$ = (Node *) $5;
}
@@ -9447,6 +9456,44 @@ insert_column_item:
}
;
+opt_on_conflict:
+ ON CONFLICT opt_conf_expr UpdateInsertStmt
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_INSERT;
+ $$->infer = $3;
+ $$->updatequery = $4;
+ $$->location = @1;
+ }
+ |
+ ON CONFLICT opt_conf_expr IGNORE_P
+ {
+ $$ = makeNode(ConflictClause);
+ $$->specclause = SPEC_IGNORE;
+ $$->infer = $3;
+ $$->updatequery = NULL;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
+opt_conf_expr:
+ '(' index_params where_clause ')'
+ {
+ $$ = makeNode(InferClause);
+ $$->indexElems = $2;
+ $$->whereClause = $3;
+ $$->location = @1;
+ }
+ | /*EMPTY*/
+ {
+ $$ = NULL;
+ }
+ ;
+
returning_clause:
RETURNING target_list { $$ = $2; }
| /* EMPTY */ { $$ = NIL; }
@@ -9546,6 +9593,22 @@ UpdateStmt: opt_with_clause UPDATE relation_expr_opt_alias
}
;
+UpdateInsertStmt: UPDATE
+ SET set_clause_list
+ where_clause
+ {
+ UpdateStmt *n = makeNode(UpdateStmt);
+ /* NULL relation conveys auxiliary */
+ n->relation = NULL;
+ n->targetList = $3;
+ n->fromClause = NULL;
+ n->whereClause = $4;
+ n->returningList = NULL;
+ n->withClause = NULL;
+ $$ = (Node *)n;
+ }
+ ;
+
set_clause_list:
set_clause { $$ = $1; }
| set_clause_list ',' set_clause { $$ = list_concat($1,$3); }
@@ -13188,6 +13251,7 @@ unreserved_keyword:
| COMMIT
| COMMITTED
| CONFIGURATION
+ | CONFLICT
| CONNECTION
| CONSTRAINTS
| CONTENT_P
@@ -13247,6 +13311,7 @@ unreserved_keyword:
| HOUR_P
| IDENTITY_P
| IF_P
+ | IGNORE_P
| IMMEDIATE
| IMMUTABLE
| IMPLICIT_P
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 654dce6..03725c2 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -16,6 +16,7 @@
#include "postgres.h"
#include "access/heapam.h"
+#include "catalog/catalog.h"
#include "catalog/heap.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
@@ -75,6 +76,8 @@ static TargetEntry *findTargetlistEntrySQL99(ParseState *pstate, Node *node,
List **tlist, ParseExprKind exprKind);
static int get_matching_location(int sortgroupref,
List *sortgrouprefs, List *exprs);
+static List *resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel);
static List *addTargetToGroupList(ParseState *pstate, TargetEntry *tle,
List *grouplist, List *targetlist, int location,
bool resolveUnknown);
@@ -145,7 +148,9 @@ transformFromClause(ParseState *pstate, List *frmList)
* We also open the target relation and acquire a write lock on it.
* This must be done before processing the FROM list, in case the target
* is also mentioned as a source relation --- we want to be sure to grab
- * the write lock before any read lock.
+ * the write lock before any read lock. Note that when called during
+ * the parse analysis of an auxiliary UPDATE query, relation may be
+ * NULL, and the details are acquired from the parent.
*
* If alsoSource is true, add the target to the query's joinlist and
* namespace. For INSERT, we don't want the target to be joined to;
@@ -172,19 +177,79 @@ setTargetTable(ParseState *pstate, RangeVar *relation,
/*
* Open target rel and grab suitable lock (which we will hold till end of
- * transaction).
+ * transaction), iff this is not an auxiliary ON CONFLICT UPDATE.
*
- * free_parsestate() will eventually do the corresponding heap_close(),
- * but *not* release the lock.
+ * free_parsestate() will eventually do the corresponding heap_close(), but
+ * *not* release the lock (again, iff this is not an auxiliary ON CONFLICT
+ * UPDATE).
*/
- pstate->p_target_relation = parserOpenTable(pstate, relation,
- RowExclusiveLock);
+ if (!pstate->p_is_speculative || pstate->p_is_insert)
+ {
+ pstate->p_target_relation = parserOpenTable(pstate, relation,
+ RowExclusiveLock);
+
+ /*
+ * Now build an RTE.
+ */
+ rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
+ relation->alias, inh, false);
+
+ /*
+ * Override addRangeTableEntry's default ACL_SELECT permissions
+ * check, and instead mark target table as requiring exactly the
+ * specified permissions.
+ *
+ * If we find an explicit reference to the rel later during parse
+ * analysis, we will add the ACL_SELECT bit back again; see
+ * markVarForSelectPriv and its callers.
+ */
+ rte->requiredPerms = requiredPerms;
+ }
+ else
+ {
+ RangeTblEntry *exclRte;
+
+ /* auxiliary UPDATE (of ON CONFLICT UPDATE) */
+ Assert(pstate->p_is_update);
+ /* target shared with parent */
+ pstate->p_target_relation =
+ pstate->parentParseState->p_target_relation;
+ rte = pstate->parentParseState->p_target_rangetblentry;
+
+ /*
+ * When called for auxiliary UPDATE, same target RTE is processed here
+ * for a second time. Just append requiredPerms. There is no need to
+ * override addRangeTableEntry's default ACL_SELECT permissions check
+ * now.
+ */
+ rte->requiredPerms |= requiredPerms;
+
+ /*
+ * Build EXCLUDED alias for target relation. This can be used to
+ * reference the tuple originally proposed for insertion from within
+ * the ON CONFLICT UPDATE auxiliary query. This is not visible in the
+ * parent INSERT.
+ *
+ * NOTE: 'EXCLUDED' will always have a varno equal to 1 (at least until
+ * rewriting, where the RTE is effectively discarded -- its Vars are
+ * replaced with a special-purpose primnode, ExcludedExpr).
+ */
+ exclRte = addRangeTableEntryForRelation(pstate,
+ pstate->p_target_relation,
+ makeAlias("excluded", NIL),
+ false, false);
+
+ /*
+ * Add EXCLUDED RTE to namespace. It does not matter that the RTE is
+ * not added to the Query joinlist, since its Vars are merely
+ * placeholders for ExcludedExpr.
+ */
+ addRTEtoQuery(pstate, exclRte, false, true, true);
+
+ /* Append parent/our target to Query rtable (should be last) */
+ pstate->p_rtable = lappend(pstate->p_rtable, rte);
+ }
- /*
- * Now build an RTE.
- */
- rte = addRangeTableEntryForRelation(pstate, pstate->p_target_relation,
- relation->alias, inh, false);
pstate->p_target_rangetblentry = rte;
/* assume new rte is at end */
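Reviewer's aside on the requiredPerms handling in setTargetTable() above: the parent INSERT overwrites the default permissions on the target RTE, while the auxiliary UPDATE only ORs its own bit into the same RTE. The following is a minimal stand-alone sketch of that accumulation. The names (SketchRte, sketch_set_target_perms, the SKETCH_ACL_* bits) are hypothetical stand-ins chosen for illustration, not the patch's types.

```c
#include <stdint.h>

/* Hypothetical permission bits, for illustration only */
#define SKETCH_ACL_INSERT (1u << 0)
#define SKETCH_ACL_SELECT (1u << 1)
#define SKETCH_ACL_UPDATE (1u << 2)

/* Hypothetical stand-in for the relevant part of RangeTblEntry */
typedef struct
{
    uint32_t requiredPerms;
} SketchRte;

/*
 * First visit (parent INSERT): require exactly the specified permissions,
 * overriding whatever default was there.  Second visit (auxiliary UPDATE):
 * the RTE is shared with the parent, so only append the new bits.
 */
static void
sketch_set_target_perms(SketchRte *rte, uint32_t perms, int is_auxiliary_update)
{
    if (!is_auxiliary_update)
        rte->requiredPerms = perms;
    else
        rte->requiredPerms |= perms;
}
```

With this shape, an ON CONFLICT UPDATE statement ends up requiring both the INSERT and the UPDATE bit on the one shared target RTE.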
@@ -192,17 +257,6 @@ setTargetTable(ParseState *pstate, RangeVar *relation,
Assert(rte == rt_fetch(rtindex, pstate->p_rtable));
/*
- * Override addRangeTableEntry's default ACL_SELECT permissions check, and
- * instead mark target table as requiring exactly the specified
- * permissions.
- *
- * If we find an explicit reference to the rel later during parse
- * analysis, we will add the ACL_SELECT bit back again; see
- * markVarForSelectPriv and its callers.
- */
- rte->requiredPerms = requiredPerms;
-
- /*
* If UPDATE/DELETE, add table to joinlist and namespace.
*
* Note: some callers know that they can find the new ParseNamespaceItem
@@ -2166,6 +2220,170 @@ get_matching_location(int sortgroupref, List *sortgrouprefs, List *exprs)
}
/*
+ * resolve_unique_index_expr
+ * Infer a unique index from a list of indexElems, for ON
+ * CONFLICT UPDATE/IGNORE
+ *
+ * Perform parse analysis of expressions and columns appearing within ON
+ * CONFLICT clause. During planning, the returned list of expressions is used
+ * to infer which unique index to use.
+ */
+static List *
+resolve_unique_index_expr(ParseState *pstate, InferClause *infer,
+ Relation heapRel)
+{
+ List *clauseexprs = NIL;
+ ListCell *l;
+
+ if (heapRel->rd_rel->relkind != RELKIND_RELATION)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" is not an ordinary table",
+ RelationGetRelationName(heapRel)),
+ errhint("Only ordinary tables are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ if (heapRel->rd_rel->relhassubclass)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("relation \"%s\" has inheritance children",
+ RelationGetRelationName(heapRel)),
+ errhint("Only heap relations without inheritance children are accepted as targets when a unique index is inferred for ON CONFLICT.")));
+
+ foreach(l, infer->indexElems)
+ {
+ IndexElem *ielem = (IndexElem *) lfirst(l);
+ Node *trans;
+
+ /*
+ * Raw grammar re-uses CREATE INDEX infrastructure for unique index
+ * inference clause, and so will accept opclasses by name and so on.
+ * Reject these here explicitly.
+ */
+ if (ielem->ordering != SORTBY_DEFAULT ||
+ ielem->nulls_ordering != SORTBY_NULLS_DEFAULT)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT does not accept ordering or NULLS FIRST/LAST specifications"),
+ errhint("These factors do not affect uniqueness of indexed datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->collation != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("ON CONFLICT collation specification is unnecessary"),
+ errhint("Collations do not affect uniqueness of collatable datums."),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (ielem->opclass != NIL)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("ON CONFLICT cannot accept non-default operator class specifications"),
+ parser_errposition(pstate,
+ exprLocation((Node *) infer))));
+
+ if (!ielem->expr)
+ {
+ /* Simple index attribute */
+ ColumnRef *n;
+
+ /*
+ * Grammar won't have built raw expression for us in event of plain
+ * column reference. Create one directly, and perform expression
+ * transformation, which seems better principled than simply
+ * propagating catalog-style simple attribute numbers. For
+ * example, it means the Var is marked for SELECT privileges, which
+ * speculative insertion requires. Planner expects this, and
+ * performs its own normalization for the purposes of matching
+ * against pg_index.
+ */
+ n = makeNode(ColumnRef);
+ n->fields = list_make1(makeString(ielem->name));
+ /* Location is approximately that of inference specification */
+ n->location = infer->location;
+ trans = (Node *) n;
+ }
+ else
+ {
+ /* Do parse transformation of the raw expression */
+ trans = (Node *) ielem->expr;
+ }
+
+ /*
+ * transformExpr() should have already rejected subqueries,
+ * aggregates, and window functions, based on the EXPR_KIND_ for an
+ * index expression. Expressions returning sets won't have been
+ * rejected, but don't bother doing so here; there should be no
+ * available expression unique index to match any such expression
+ * against anyway.
+ */
+ trans = transformExpr(pstate, trans, EXPR_KIND_INDEX_EXPRESSION);
+ /* Save in list of transformed expressions */
+ clauseexprs = list_append_unique(clauseexprs, trans);
+ }
+
+ return clauseexprs;
+}
+
+/*
+ * transformConflictClause -
+ * transform expressions of ON CONFLICT UPDATE/IGNORE.
+ *
+ * Transformed expressions are used to infer one unique index relation to
+ * serve as an ON CONFLICT arbiter. Partial unique indexes may be inferred
+ * using the WHERE clause from the inference specification clause.
+ */
+void
+transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere)
+{
+ InferClause *infer = confClause->infer;
+
+ if (confClause->specclause == SPEC_INSERT && !infer)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("ON CONFLICT with UPDATE must contain columns or expressions to infer a unique index from"),
+ parser_errposition(pstate,
+ exprLocation((Node *) confClause))));
+
+ Assert(confClause->specclause != SPEC_INSERT ||
+ confClause->updatequery != NULL);
+
+ /* This obviates the need for historic snapshot support */
+ if (IsCatalogRelation(pstate->p_target_relation))
+ elog(ERROR, "ON CONFLICT not supported with catalog relations");
+
+ /*
+ * If there is no inference clause, the target might be an updatable view;
+ * these are supported by ON CONFLICT IGNORE (without columns/expressions
+ * specified to infer a unique index from -- specifying them is mandatory
+ * for the UPDATE variant). It might also be a relation with inheritance
+ * children, which would also make inference fail.
+ */
+ if (infer)
+ {
+ *arbiterExpr = resolve_unique_index_expr(pstate, infer,
+ pstate->p_target_relation);
+
+ /* Handling inference WHERE clause (for partial unique index inference) */
+ if (infer->whereClause)
+ *arbiterWhere = transformExpr(pstate, infer->whereClause,
+ EXPR_KIND_INDEX_PREDICATE);
+ }
+
+ /*
+ * It's convenient to form a list of expressions based on the
+ * representation used by CREATE INDEX, since the same restrictions are
+ * appropriate (on subqueries and so on). However, from here on, the
+ * handling of those expressions is identical to ordinary optimizable
+ * statements. In particular, assign_query_collations() can be trusted to
+ * do the right thing with the post parse analysis query tree inference
+ * clause representation.
+ */
+}
+
+/*
* addTargetToSortList
* If the given targetlist entry isn't already in the SortGroupClause
* list, add it to the end of the list, using the given sort ordering
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index f0f0488..d1583c7 100644
--- a/src/backend/parser/parse_expr.c
+++ b/src/backend/parser/parse_expr.c
@@ -1497,7 +1497,8 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
/*
* Check to see if the sublink is in an invalid place within the query. We
* allow sublinks everywhere in SELECT/INSERT/UPDATE/DELETE, but generally
- * not in utility statements.
+ * not in utility statements. They're also disallowed within auxiliary ON
+ * CONFLICT UPDATE commands, which we check for here.
*/
err = NULL;
switch (pstate->p_expr_kind)
@@ -1564,6 +1565,9 @@ transformSubLink(ParseState *pstate, SubLink *sublink)
* which is sane anyway.
*/
}
+
+ if (pstate->p_is_speculative && pstate->p_is_update)
+ err = _("cannot use subquery in ON CONFLICT UPDATE");
if (err)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index 4130cbf..9a94fa4 100644
--- a/src/backend/parser/parse_node.c
+++ b/src/backend/parser/parse_node.c
@@ -84,7 +84,13 @@ free_parsestate(ParseState *pstate)
errmsg("target lists can have at most %d entries",
MaxTupleAttributeNumber)));
- if (pstate->p_target_relation != NULL)
+ /*
+ * Don't close target relation for auxiliary ON CONFLICT UPDATE, since it
+ * is managed by parent INSERT directly
+ */
+ if (pstate->p_target_relation != NULL &&
+ (!pstate->p_is_speculative ||
+ pstate->p_is_insert))
heap_close(pstate->p_target_relation, NoLock);
pfree(pstate);
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 9894146..12b0b06 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -43,6 +43,12 @@ typedef struct acquireLocksOnSubLinks_context
bool for_execute; /* AcquireRewriteLocks' forExecute param */
} acquireLocksOnSubLinks_context;
+typedef struct excluded_replace_context
+{
+ int varno; /* varno of EXCLUDED.* Vars */
+ int rvarno; /* replace varno */
+} excluded_replace_context;
+
static bool acquireLocksOnSubLinks(Node *node,
acquireLocksOnSubLinks_context *context);
static Query *rewriteRuleAction(Query *parsetree,
@@ -66,11 +72,15 @@ static void markQueryForLocking(Query *qry, Node *jtnode,
LockClauseStrength strength, LockWaitPolicy waitPolicy,
bool pushedDown);
static List *matchLocks(CmdType event, RuleLock *rulelocks,
- int varno, Query *parsetree);
+ int varno, Query *parsetree, bool *hasUpdate);
static Query *fireRIRrules(Query *parsetree, List *activeRIRs,
bool forUpdatePushedDown);
static bool view_has_instead_trigger(Relation view, CmdType event);
static Bitmapset *adjust_view_column_set(Bitmapset *cols, List *targetlist);
+static Node *excluded_replace_vars(Node *expr,
+ excluded_replace_context *context);
+static Node *excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context);
/*
@@ -1288,7 +1298,8 @@ static List *
matchLocks(CmdType event,
RuleLock *rulelocks,
int varno,
- Query *parsetree)
+ Query *parsetree,
+ bool *hasUpdate)
{
List *matching_locks = NIL;
int nlocks;
@@ -1309,6 +1320,9 @@ matchLocks(CmdType event,
{
RewriteRule *oneLock = rulelocks->rules[i];
+ if (oneLock->event == CMD_UPDATE)
+ *hasUpdate = true;
+
/*
* Suppress ON INSERT/UPDATE/DELETE rules that are disabled or
* configured to not fire during the current sessions replication
@@ -2961,6 +2975,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
CmdType event = parsetree->commandType;
bool instead = false;
bool returning = false;
+ bool updatableview = false;
Query *qual_product = NULL;
List *rewritten = NIL;
ListCell *lc1;
@@ -3043,6 +3058,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
Relation rt_entry_relation;
List *locks;
List *product_queries;
+ bool hasUpdate = false;
result_relation = parsetree->resultRelation;
Assert(result_relation != 0);
@@ -3094,6 +3110,49 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
/* Process just the main targetlist */
rewriteTargetListIU(parsetree, rt_entry_relation, NULL);
}
+
+ if (parsetree->specClause == SPEC_INSERT)
+ {
+ Query *qry;
+ excluded_replace_context context;
+
+ /*
+ * While user-defined rules will never be applied in the
+ * auxiliary update query, normalization of tlist is still
+ * required
+ */
+ qry = (Query *) parsetree->onConflict;
+ rewriteTargetListIU(qry, rt_entry_relation, NULL);
+
+ /*
+ * Replace OLD Vars (associated with the EXCLUDED.* alias) with
+ * first (and only) "real" relation RTE in rtable. This allows
+ * the implementation to treat EXCLUDED.* as an alias for the
+ * target relation, which is useful during parse analysis,
+ * while ultimately having those references rewritten as
+ * special ExcludedExpr references to the corresponding Var in
+ * the target RTE.
+ *
+ * This is necessary because while we want a join-like syntax
+ * for aesthetic reasons, the resemblance is superficial. In
+ * fact, execution of the ModifyTable node (and its direct
+ * child auxiliary query) manages tupleslot state directly, and
+ * is directly tasked with making available the appropriate
+ * tupleslot to the expression context.
+ *
+ * This is a kludge, but appears necessary, since the slot made
+ * available for referencing via ExcludedExpr is in fact the
+ * slot just excluded from insertion by speculative insertion
+ * (with the effects of BEFORE ROW INSERT triggers carried).
+ * An ad-hoc method for making the excluded tuple available
+ * within the auxiliary expression context is appropriate.
+ */
+ context.varno = PRS2_OLD_VARNO;
+ context.rvarno = PRS2_OLD_VARNO + 1;
+
+ parsetree->onConflict =
+ excluded_replace_vars(parsetree->onConflict, &context);
+ }
}
else if (event == CMD_UPDATE)
{
@@ -3111,7 +3170,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
* Collect and apply the appropriate rules.
*/
locks = matchLocks(event, rt_entry_relation->rd_rules,
- result_relation, parsetree);
+ result_relation, parsetree, &hasUpdate);
product_queries = fireRules(parsetree,
result_relation,
@@ -3160,6 +3219,7 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
*/
instead = true;
returning = true;
+ updatableview = true;
}
/*
@@ -3240,6 +3300,18 @@ RewriteQuery(Query *parsetree, List *rewrite_events)
}
}
+ /*
+ * Updatable views are supported on a limited basis by ON CONFLICT
+ * IGNORE (if there is no unique index inference required, speculative
+ * insertion proceeds).
+ */
+ if (parsetree->specClause != SPEC_NONE &&
+ (product_queries != NIL || hasUpdate) &&
+ !updatableview)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("INSERT with ON CONFLICT clause may not target relation with INSERT or UPDATE rules")));
+
heap_close(rt_entry_relation, NoLock);
}
@@ -3402,3 +3474,52 @@ QueryRewrite(Query *parsetree)
return results;
}
+
+/*
+ * Replace EXCLUDED.* Vars throughout an expression tree
+ *
+ * Returns modified tree, with each Var of context->varno rewritten to
+ * reference context->rvarno and wrapped in an ExcludedExpr.
+ */
+static Node *
+excluded_replace_vars(Node *expr, excluded_replace_context *context)
+{
+ /*
+ * Don't recurse into subqueries; they're forbidden in auxiliary ON
+ * CONFLICT query
+ */
+ return replace_rte_variables(expr,
+ context->varno, 0,
+ excluded_replace_vars_callback,
+ (void *) context,
+ NULL);
+}
+
+static Node *
+excluded_replace_vars_callback(Var *var,
+ replace_rte_variables_context *context)
+{
+ excluded_replace_context *rcon = (excluded_replace_context *) context->callback_arg;
+ ExcludedExpr *n = makeNode(ExcludedExpr);
+
+ /* Replace with an enclosing ExcludedExpr */
+ var->varno = rcon->rvarno;
+ n->arg = (Node *) var;
+
+ /*
+ * Would have to adjust varlevelsup if referenced item is from higher query
+ * (should not happen)
+ */
+ Assert(var->varlevelsup == 0);
+
+ if (var->varattno < 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference system column using EXCLUDED.* alias")));
+
+ if (var->varattno == 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_COLUMN_REFERENCE),
+ errmsg("cannot reference whole-row using EXCLUDED.* alias")));
+
+ return (Node*) n;
+}
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index a1ebc72..a1c5bcb 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -421,6 +421,13 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
latestXid))
ShmemVariableCache->latestCompletedXid = latestXid;
+ /* Also clear any speculative insertion information */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+
LWLockRelease(ProcArrayLock);
}
else
@@ -438,6 +445,11 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
pgxact->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
pgxact->delayChkpt = false; /* be sure this is cleared in abort */
proc->recoveryConflictPending = false;
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
Assert(pgxact->nxids == 0);
Assert(pgxact->overflowed == false);
@@ -476,6 +488,13 @@ ProcArrayClearTransaction(PGPROC *proc)
/* Clear the subtransaction-XID cache too */
pgxact->nxids = 0;
pgxact->overflowed = false;
+
+ /* these should be clear, but just in case.. */
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
}
/*
@@ -1110,6 +1129,96 @@ TransactionIdIsActive(TransactionId xid)
/*
+ * SetSpeculativeInsertionToken -- Set speculative token
+ *
+ * The backend local counter value is set, to allow waiters to differentiate
+ * individual speculative insertions.
+ */
+void
+SetSpeculativeInsertionToken(uint32 token)
+{
+ MyProc->specInsertToken = token;
+}
+
+/*
+ * SetSpeculativeInsertionTid -- Set relfilenode and TID of speculative insertion
+ */
+void
+SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel = relnode;
+ ItemPointerCopy(tid, &MyProc->specInsertTid);
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * ClearSpeculativeInsertionState -- Clear token and TID for ourselves
+ */
+void
+ClearSpeculativeInsertionState(void)
+{
+ LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+ MyProc->specInsertRel.spcNode = InvalidOid;
+ MyProc->specInsertRel.dbNode = InvalidOid;
+ MyProc->specInsertRel.relNode = InvalidOid;
+ ItemPointerSetInvalid(&MyProc->specInsertTid);
+ MyProc->specInsertToken = 0;
+ LWLockRelease(ProcArrayLock);
+}
+
+/*
+ * Returns the speculative insertion token to wait on for the insertion to
+ * finish, or 0 if the given transaction is not speculatively inserting the
+ * given tuple
+ */
+uint32
+SpeculativeInsertionIsInProgress(TransactionId xid, RelFileNode rel,
+ ItemPointer tid)
+{
+ ProcArrayStruct *arrayP = procArray;
+ int index;
+ uint32 result = 0;
+
+ if (TransactionIdPrecedes(xid, RecentXmin))
+ return result;
+
+ /*
+ * Get the top transaction id.
+ *
+ * XXX We could search the proc array first, like
+ * TransactionIdIsInProgress() does, but this isn't performance-critical.
+ */
+ xid = SubTransGetTopmostTransaction(xid);
+
+ LWLockAcquire(ProcArrayLock, LW_SHARED);
+
+ for (index = 0; index < arrayP->numProcs; index++)
+ {
+ int pgprocno = arrayP->pgprocnos[index];
+ PGPROC *proc = &allProcs[pgprocno];
+ volatile PGXACT *pgxact = &allPgXact[pgprocno];
+
+ if (pgxact->xid == xid)
+ {
+ /*
+ * Found the backend. Is it doing a speculative insertion of the
+ * given tuple?
+ */
+ if (RelFileNodeEquals(proc->specInsertRel, rel) &&
+ ItemPointerEquals(tid, &proc->specInsertTid))
+ result = proc->specInsertToken;
+
+ break;
+ }
+ }
+
+ LWLockRelease(ProcArrayLock);
+
+ return result;
+}
+
+
+/*
* GetOldestXmin -- returns oldest transaction that was running
* when any current transaction was started.
*
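Reviewer's aside on SpeculativeInsertionIsInProgress() above: stripped of locking, the RecentXmin shortcut, and subtransaction handling, the lookup is a linear scan for the backend running the given xid, followed by a comparison of the advertised relfilenode and TID. The following is a minimal stand-alone sketch of that logic only. All names (SketchProc, sketch_spec_insertion_token, and so on) are hypothetical simplifications, not the patch's PGPROC/PGXACT structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for RelFileNode, ItemPointerData, PGPROC */
typedef struct
{
    uint32_t spcNode;
    uint32_t dbNode;
    uint32_t relNode;
} SketchRelFileNode;

typedef struct
{
    uint32_t block;
    uint16_t offset;
} SketchTid;

typedef struct
{
    uint32_t          xid;
    SketchRelFileNode specInsertRel;
    SketchTid         specInsertTid;
    uint32_t          specInsertToken;
} SketchProc;

static int
sketch_rel_equals(const SketchRelFileNode *a, const SketchRelFileNode *b)
{
    return a->spcNode == b->spcNode &&
           a->dbNode == b->dbNode &&
           a->relNode == b->relNode;
}

static int
sketch_tid_equals(const SketchTid *a, const SketchTid *b)
{
    return a->block == b->block && a->offset == b->offset;
}

/*
 * Return the token advertised by the backend running xid if it is
 * speculatively inserting exactly this tuple; otherwise return 0.
 */
static uint32_t
sketch_spec_insertion_token(const SketchProc *procs, size_t nprocs,
                            uint32_t xid, const SketchRelFileNode *rel,
                            const SketchTid *tid)
{
    size_t i;

    for (i = 0; i < nprocs; i++)
    {
        if (procs[i].xid != xid)
            continue;

        /* Found the backend; at most one entry can match the xid */
        if (sketch_rel_equals(&procs[i].specInsertRel, rel) &&
            sketch_tid_equals(&procs[i].specInsertTid, tid))
            return procs[i].specInsertToken;
        return 0;
    }
    return 0;
}
```

A zero result means "nothing to wait for", which is why the token handed to a waiter must be nonzero.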
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index d13a167..6044128 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -576,6 +576,81 @@ ConditionalXactLockTableWait(TransactionId xid)
}
/*
+ * Per-backend final disambiguator of an attempt to insert speculatively.
+ *
+ * This may wrap around, but since it is only a final disambiguator (speculative
+ * waiters also check TID and relfilenode), this is deemed to be acceptable.
+ * There is only a theoretical, vanishingly small chance of a backend
+ * spuriously considering that it must wait on another backend's
+ * end-of-speculative insertion (call to SpeculativeInsertionLockRelease())
+ * when that isn't strictly necessary, and even this is likely to be
+ * inconsequential. At worst, unprincipled deadlocks are not entirely
+ * eliminated in extreme corner cases.
+ */
+static uint32 speculativeInsertionToken = 0;
+
+/*
+ * SpeculativeInsertionLockAcquire
+ *
+ * Insert a lock showing that the given transaction ID is inserting a tuple,
+ * but hasn't yet decided whether it's going to keep it. The lock can then be
+ * used to wait for the decision to go ahead with the insertion, or to abort
+ * it.
+ *
+ * The token is used to distinguish multiple insertions by the same
+ * transaction. A counter will do, for example.
+ */
+void
+SpeculativeInsertionLockAcquire(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ speculativeInsertionToken++;
+ SetSpeculativeInsertionToken(speculativeInsertionToken);
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ (void) LockAcquire(&tag, ExclusiveLock, false, false);
+}
+
+/*
+ * SpeculativeInsertionLockRelease
+ *
+ * Delete the lock showing that the given transaction is speculatively
+ * inserting a tuple.
+ */
+void
+SpeculativeInsertionLockRelease(TransactionId xid)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, speculativeInsertionToken);
+
+ LockRelease(&tag, ExclusiveLock, false);
+}
+
+/*
+ * SpeculativeInsertionWait
+ *
+ * Wait for the specified transaction to finish or abort the insertion of a
+ * tuple.
+ */
+void
+SpeculativeInsertionWait(TransactionId xid, uint32 token)
+{
+ LOCKTAG tag;
+
+ SET_LOCKTAG_SPECULATIVE_INSERTION(tag, xid, token);
+
+ Assert(TransactionIdIsValid(xid));
+ Assert(token != 0);
+
+ (void) LockAcquire(&tag, ShareLock, false, false);
+ LockRelease(&tag, ShareLock, false);
+}
+
+
+/*
* XactLockTableWaitErrorContextCb
* Error context callback for transaction lock waits.
*/
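Reviewer's aside on the token counter above: the patch treats a token of 0 as "no speculative insertion" (it is the cleared value in the proc array, and SpeculativeInsertionWait() asserts token != 0), while the comment acknowledges that the uint32 counter may wrap around. The following sketch shows one possible way to advance such a counter while reserving 0. It is not part of the patch; sketch_next_token is a hypothetical helper, and whether reserving 0 is actually wanted here is an assumption.

```c
#include <stdint.h>

/*
 * Advance a per-backend token counter, skipping the reserved value 0.
 * Unsigned overflow is well defined in C, so UINT32_MAX + 1 wraps to 0,
 * which is then bumped to 1.
 */
static uint32_t
sketch_next_token(uint32_t current)
{
    uint32_t next = current + 1;

    if (next == 0)
        next = 1;
    return next;
}
```

The cost is a single extra comparison per speculative insertion.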
@@ -873,6 +948,11 @@ DescribeLockTag(StringInfo buf, const LOCKTAG *tag)
tag->locktag_field1,
tag->locktag_field2);
break;
+ case LOCKTAG_PROMISE_TUPLE_INSERTION:
+ appendStringInfo(buf,
+ _("tuple insertion by transaction %u"),
+ tag->locktag_field1);
+ break;
case LOCKTAG_OBJECT:
appendStringInfo(buf,
_("object %u of class %u of database %u"),
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 9c14e8a..41c4191 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -189,7 +189,8 @@ ProcessQuery(PlannedStmt *plan,
*/
if (completionTag)
{
- Oid lastOid;
+ Oid lastOid;
+ ModifyTableState *pstate;
switch (queryDesc->operation)
{
@@ -198,12 +199,16 @@ ProcessQuery(PlannedStmt *plan,
"SELECT %u", queryDesc->estate->es_processed);
break;
case CMD_INSERT:
+ pstate = (ModifyTableState *) queryDesc->planstate;
+ Assert(IsA(pstate, ModifyTableState));
+
if (queryDesc->estate->es_processed == 1)
lastOid = queryDesc->estate->es_lastoid;
else
lastOid = InvalidOid;
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
- "INSERT %u %u", lastOid, queryDesc->estate->es_processed);
+ "%s %u %u", pstate->spec == SPEC_INSERT ? "UPSERT" : "INSERT",
+ lastOid, queryDesc->estate->es_processed);
break;
case CMD_UPDATE:
snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
@@ -1356,7 +1361,10 @@ PortalRunMulti(Portal portal, bool isTopLevel,
* 0" here because technically there is no query of the matching tag type,
* and printing a non-zero count for a different query type seems wrong,
* e.g. an INSERT that does an UPDATE instead should not print "0 1" if
- * one row was updated. See QueryRewrite(), step 3, for details.
+ * one row was updated (unless the ON CONFLICT UPDATE, or "UPSERT" variant
+ * of INSERT was used to update the row, where it's logically a direct
+ * effect of the top level command). See QueryRewrite(), step 3, for
+ * details.
*/
if (completionTag && completionTag[0] == '\0')
{
@@ -1366,6 +1374,8 @@ PortalRunMulti(Portal portal, bool isTopLevel,
sprintf(completionTag, "SELECT 0 0");
else if (strcmp(completionTag, "INSERT") == 0)
strcpy(completionTag, "INSERT 0 0");
+ else if (strcmp(completionTag, "UPSERT") == 0)
+ strcpy(completionTag, "UPSERT 0 0");
else if (strcmp(completionTag, "UPDATE") == 0)
strcpy(completionTag, "UPDATE 0");
else if (strcmp(completionTag, "DELETE") == 0)
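
As a reading aid for the pquery.c hunks above, here is a minimal standalone sketch of how the completion tag is chosen. It is not part of the patch: format_insert_tag() and TOY_TAG_BUFSIZE are hypothetical names introduced only for illustration, and the enum simply repeats the SpecCmd values this patch adds to nodes.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_TAG_BUFSIZE 64		/* stand-in for COMPLETION_TAG_BUFSIZE */

/* Repeats the SpecCmd values that this patch adds to nodes.h. */
typedef enum
{
	SPEC_NONE,					/* Not involved in speculative insertion */
	SPEC_IGNORE,				/* INSERT of "ON CONFLICT IGNORE" */
	SPEC_INSERT,				/* INSERT of "ON CONFLICT UPDATE" */
	SPEC_UPDATE					/* UPDATE of "ON CONFLICT UPDATE" */
} SpecCmd;

/*
 * Hypothetical helper: build the tag the way the CMD_INSERT branch of
 * ProcessQuery() does in this patch.  Only SPEC_INSERT reports "UPSERT";
 * plain INSERT and ON CONFLICT IGNORE keep the "INSERT" tag.
 */
static void
format_insert_tag(char *buf, size_t buflen, SpecCmd spec,
				  unsigned int lastoid, unsigned int processed)
{
	snprintf(buf, buflen, "%s %u %u",
			 spec == SPEC_INSERT ? "UPSERT" : "INSERT",
			 lastoid, processed);
}
```

Under these assumptions, calling format_insert_tag() with SPEC_INSERT, a zero OID and three processed rows yields the tag text "UPSERT 0 3", which is the form the psql change later in this patch matches by its six-character prefix.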
diff --git a/src/backend/utils/adt/lockfuncs.c b/src/backend/utils/adt/lockfuncs.c
index a1967b69..95d62cb 100644
--- a/src/backend/utils/adt/lockfuncs.c
+++ b/src/backend/utils/adt/lockfuncs.c
@@ -28,6 +28,7 @@ static const char *const LockTagTypeNames[] = {
"tuple",
"transactionid",
"virtualxid",
+ "inserter transactionid",
"object",
"userlock",
"advisory"
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index c1d860c..04235e2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -5645,6 +5645,24 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
return NULL;
}
+ else if (var->varno == INNER_VAR)
+ {
+ /* Assume an EXCLUDED variable */
+ rte = rt_fetch(PRS2_OLD_VARNO, dpns->rtable);
+
+ /*
+ * Sanity check: EXCLUDED.* Vars should only appear in auxiliary ON
+ * CONFLICT UPDATE queries. Assert that rte and planstate are
+ * consistent with that.
+ */
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(IsA(dpns->planstate, SeqScanState) ||
+ IsA(dpns->planstate, ResultState));
+
+ refname = "excluded";
+ colinfo = deparse_columns_fetch(PRS2_OLD_VARNO, dpns);
+ attnum = var->varattno;
+ }
else
{
elog(ERROR, "bogus varno: %d", var->varno);
@@ -6385,6 +6403,7 @@ isSimpleNode(Node *node, Node *parentNode, int prettyFlags)
case T_CoerceToDomainValue:
case T_SetToDefault:
case T_CurrentOfExpr:
+ case T_ExcludedExpr:
/* single words: always simple */
return true;
@@ -7610,6 +7629,26 @@ get_rule_expr(Node *node, deparse_context *context,
}
break;
+ case T_ExcludedExpr:
+ {
+ ExcludedExpr *excludedexpr = (ExcludedExpr *) node;
+ Var *variable = (Var *) excludedexpr->arg;
+ bool save_varprefix;
+
+ /*
+ * Force parentheses because our caller probably assumed our
+ * Var is a simple expression.
+ */
+ appendStringInfoChar(buf, '(');
+ save_varprefix = context->varprefix;
+ /* Ensure EXCLUDED.* prefix is always visible */
+ context->varprefix = true;
+ get_rule_expr((Node *) variable, context, true);
+ context->varprefix = save_varprefix;
+ appendStringInfoChar(buf, ')');
+ }
+ break;
+
case T_List:
{
char *sep;
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index 777f55c..1b67438 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -262,6 +262,9 @@ HeapTupleSatisfiesSelf(HeapTuple htup, Snapshot snapshot, Buffer buffer)
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/* by here, the inserting transaction has committed */
if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or aborted */
@@ -360,6 +363,7 @@ HeapTupleSatisfiesToast(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ Assert(!HeapTupleHeaderSuperDeleted(tuple));
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -446,6 +450,7 @@ HeapTupleSatisfiesUpdate(HeapTuple htup, CommandId curcid,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ Assert(!HeapTupleHeaderSuperDeleted(tuple));
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -726,6 +731,7 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
Assert(htup->t_tableOid != InvalidOid);
snapshot->xmin = snapshot->xmax = InvalidTransactionId;
+ snapshot->speculativeToken = 0;
if (!HeapTupleHeaderXminCommitted(tuple))
{
@@ -807,6 +813,26 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
else if (TransactionIdIsInProgress(HeapTupleHeaderGetRawXmin(tuple)))
{
+ RelFileNode rnode;
+ ForkNumber forkno;
+ BlockNumber blockno;
+
+ BufferGetTag(buffer, &rnode, &forkno, &blockno);
+
+ /* tuples can only be in the main fork */
+ Assert(forkno == MAIN_FORKNUM);
+ Assert(blockno == ItemPointerGetBlockNumber(&htup->t_self));
+
+ /*
+ * Set speculative token. Caller can worry about xmax, since it
+ * requires a conclusively locked row version, and a concurrent
+ * update to this tuple is a conflict for its purposes.
+ */
+ snapshot->speculativeToken =
+ SpeculativeInsertionIsInProgress(HeapTupleHeaderGetRawXmin(tuple),
+ rnode,
+ &htup->t_self);
+
snapshot->xmin = HeapTupleHeaderGetRawXmin(tuple);
/* XXX shouldn't we fall through to look at xmax? */
return true; /* in insertion by other */
@@ -823,6 +849,9 @@ HeapTupleSatisfiesDirty(HeapTuple htup, Snapshot snapshot,
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/* by here, the inserting transaction has committed */
if (tuple->t_infomask & HEAP_XMAX_INVALID) /* xid invalid or aborted */
@@ -1022,6 +1051,9 @@ HeapTupleSatisfiesMVCC(HeapTuple htup, Snapshot snapshot,
}
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return false;
+
/*
* By here, the inserting transaction has committed - have to check
* when...
@@ -1218,6 +1250,9 @@ HeapTupleSatisfiesVacuum(HeapTuple htup, TransactionId OldestXmin,
*/
}
+ if (HeapTupleHeaderSuperDeleted(tuple))
+ return HEAPTUPLE_DEAD;
+
/*
* Okay, the inserter committed, so it was good at some point. Now what
* about the deleting transaction?
@@ -1406,7 +1441,10 @@ HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
if (!(tuple->t_infomask & HEAP_XMAX_COMMITTED))
return false;
- /* Deleter committed, so tuple is dead if the XID is old enough. */
+ /*
+ * Deleter committed, so tuple is dead if the XID is old enough. This
+ * handles super deleted tuples correctly.
+ */
return TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin);
}
@@ -1539,6 +1577,8 @@ HeapTupleHeaderIsOnlyLocked(HeapTupleHeader tuple)
{
TransactionId xmax;
+ Assert(!HeapTupleHeaderSuperDeleted(tuple));
+
/* if there's no valid Xmax, then there's obviously no update either */
if (tuple->t_infomask & HEAP_XMAX_INVALID)
return true;
@@ -1596,6 +1636,9 @@ TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
* We don't need to support HEAP_MOVED_(IN|OFF) for now because we only support
* reading catalog pages which couldn't have been created in an older version.
*
+ * We don't support speculative insertion into catalogs, and so there are no
+ * checks for super deleted tuples.
+ *
* We don't set any hint bits in here as it seems unlikely to be beneficial as
* those should already be set by normal access and it seems to be too
* dangerous to do so as the semantics of doing so during timetravel are more
@@ -1611,6 +1654,7 @@ HeapTupleSatisfiesHistoricMVCC(HeapTuple htup, Snapshot snapshot,
Assert(ItemPointerIsValid(&htup->t_self));
Assert(htup->t_tableOid != InvalidOid);
+ Assert(!HeapTupleHeaderSuperDeleted(tuple));
/* inserting transaction aborted */
if (HeapTupleHeaderXminInvalid(tuple))
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 275bdcc..9302e41 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -894,9 +894,12 @@ PrintQueryResults(PGresult *results)
success = StoreQueryTuple(results);
else
success = PrintQueryTuples(results);
- /* if it's INSERT/UPDATE/DELETE RETURNING, also print status */
+ /*
+ * if it's INSERT/UPSERT/UPDATE/DELETE RETURNING, also print status
+ */
cmdstatus = PQcmdStatus(results);
if (strncmp(cmdstatus, "INSERT", 6) == 0 ||
+ strncmp(cmdstatus, "UPSERT", 6) == 0 ||
strncmp(cmdstatus, "UPDATE", 6) == 0 ||
strncmp(cmdstatus, "DELETE", 6) == 0)
PrintQueryStatus(results);
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 939d93d..62e760a 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -28,6 +28,7 @@
#define HEAP_INSERT_SKIP_WAL 0x0001
#define HEAP_INSERT_SKIP_FSM 0x0002
#define HEAP_INSERT_FROZEN 0x0004
+#define HEAP_INSERT_SPECULATIVE 0x0008
typedef struct BulkInsertStateData *BulkInsertState;
@@ -141,7 +142,7 @@ extern void heap_multi_insert(Relation relation, HeapTuple *tuples, int ntuples,
CommandId cid, int options, BulkInsertState bistate);
extern HTSU_Result heap_delete(Relation relation, ItemPointer tid,
CommandId cid, Snapshot crosscheck, bool wait,
- HeapUpdateFailureData *hufd);
+ HeapUpdateFailureData *hufd, bool killspeculative);
extern HTSU_Result heap_update(Relation relation, ItemPointer otid,
HeapTuple newtup,
CommandId cid, Snapshot crosscheck, bool wait,
diff --git a/src/include/access/heapam_xlog.h b/src/include/access/heapam_xlog.h
index a2ed2a0..870985d 100644
--- a/src/include/access/heapam_xlog.h
+++ b/src/include/access/heapam_xlog.h
@@ -73,6 +73,8 @@
#define XLOG_HEAP_SUFFIX_FROM_OLD (1<<6)
/* last xl_heap_multi_insert record for one heap_multi_insert() call */
#define XLOG_HEAP_LAST_MULTI_INSERT (1<<7)
+/* reuse xl_heap_multi_insert-only bit for xl_heap_delete */
+#define XLOG_HEAP_KILLED_SPECULATIVE_TUPLE XLOG_HEAP_LAST_MULTI_INSERT
/* convenience macro for checking whether any form of old tuple was logged */
#define XLOG_HEAP_CONTAINS_OLD \
diff --git a/src/include/access/htup_details.h b/src/include/access/htup_details.h
index d2ad910..ae8fa80 100644
--- a/src/include/access/htup_details.h
+++ b/src/include/access/htup_details.h
@@ -305,6 +305,18 @@ struct HeapTupleHeaderData
)
/*
+ * Was tuple "super deleted" following unsuccessful speculative insertion (i.e.
+ * conflict was detected at insertion time)? It is not sufficient to set
+ * HEAP_XMIN_INVALID to super delete because it is only a hint, and because it
+ * interacts with transaction commit status. Speculative insertion decouples
+ * visibility from transaction duration for one special purpose.
+ */
+#define HeapTupleHeaderSuperDeleted(tup) \
+( \
+ (!TransactionIdIsValid(HeapTupleHeaderGetRawXmin(tup))) \
+)
+
+/*
* HeapTupleHeaderGetRawXmax gets you the raw Xmax field. To find out the Xid
* that updated a tuple, you might need to resolve the MultiXactId if certain
* bits are set. HeapTupleHeaderGetUpdateXid checks those bits and takes care
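
To make the HeapTupleHeaderSuperDeleted() test above concrete, here is a toy model of it outside the backend. This is only a sketch: ToyTupleHeader, toy_super_delete() and toy_visible() are made-up names, and it assumes (as the macro implies) that super deletion is recorded by overwriting the raw xmin with InvalidTransactionId.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId		((TransactionId) 0)
#define TransactionIdIsValid(xid)	((xid) != InvalidTransactionId)

/* Toy stand-in for the tuple header; only the fields needed here. */
typedef struct ToyTupleHeader
{
	TransactionId raw_xmin;
	TransactionId raw_xmax;
} ToyTupleHeader;

/* Same test as the macro: an invalid raw xmin means "super deleted". */
static bool
toy_super_deleted(const ToyTupleHeader *tup)
{
	return !TransactionIdIsValid(tup->raw_xmin);
}

/* Assumption: super deletion overwrites xmin rather than setting a hint bit. */
static void
toy_super_delete(ToyTupleHeader *tup)
{
	tup->raw_xmin = InvalidTransactionId;
}

/*
 * Visibility in the style of the tqual.c hunks: a super-deleted tuple is
 * invisible whatever the commit status of the inserting transaction.
 */
static bool
toy_visible(const ToyTupleHeader *tup, bool inserter_committed)
{
	if (toy_super_deleted(tup))
		return false;
	return inserter_committed;
}
```

The point of the sketch is the ordering in toy_visible(): the super-deleted check does not consult commit status at all, which is what the comment means by decoupling visibility from transaction duration.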
diff --git a/src/include/catalog/index.h b/src/include/catalog/index.h
index e7cc7a0..42c10d4 100644
--- a/src/include/catalog/index.h
+++ b/src/include/catalog/index.h
@@ -80,6 +80,8 @@ extern void index_drop(Oid indexId, bool concurrent);
extern IndexInfo *BuildIndexInfo(Relation index);
+extern void AddUniqueSpeculative(Relation index, IndexInfo *ii);
+
extern void FormIndexDatum(IndexInfo *indexInfo,
TupleTableSlot *slot,
EState *estate,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 40fde83..accdc83 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -352,16 +352,21 @@ extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
extern void ExecCloseScanRelation(Relation scanrel);
-extern void ExecOpenIndices(ResultRelInfo *resultRelInfo);
+extern void ExecOpenIndices(ResultRelInfo *resultRelInfo, bool speculative);
extern void ExecCloseIndices(ResultRelInfo *resultRelInfo);
+extern List *ExecLockIndexValues(TupleTableSlot *slot, EState *estate,
+ SpecCmd specReason);
extern List *ExecInsertIndexTuples(TupleTableSlot *slot, ItemPointer tupleid,
- EState *estate);
-extern bool check_exclusion_constraint(Relation heap, Relation index,
- IndexInfo *indexInfo,
- ItemPointer tupleid,
- Datum *values, bool *isnull,
- EState *estate,
- bool newIndex, bool errorOK);
+ EState *estate, bool noDupErr, Oid arbiterIdx);
+extern bool ExecCheckIndexConstraints(TupleTableSlot *slot, EState *estate,
+ ItemPointer conflictTid, Oid arbiterIdx);
+extern bool check_exclusion_or_unique_constraint(Relation heap, Relation index,
+ IndexInfo *indexInfo,
+ ItemPointer tupleid,
+ Datum *values, bool *isnull,
+ EState *estate,
+ bool newIndex, bool errorOK,
+ bool wait, ItemPointer conflictTid);
extern void RegisterExprContextCallback(ExprContext *econtext,
ExprContextCallbackFunction function,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41288ed..2e4e168 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -41,6 +41,9 @@
* ExclusionOps Per-column exclusion operators, or NULL if none
* ExclusionProcs Underlying function OIDs for ExclusionOps
* ExclusionStrats Opclass strategy numbers for ExclusionOps
+ * UniqueOps These are like Exclusion*, but for unique indexes
+ * UniqueProcs
+ * UniqueStrats
* Unique is it a unique index?
* ReadyForInserts is it valid for inserts?
* Concurrent are we doing a concurrent index build?
@@ -62,6 +65,9 @@ typedef struct IndexInfo
Oid *ii_ExclusionOps; /* array with one entry per column */
Oid *ii_ExclusionProcs; /* array with one entry per column */
uint16 *ii_ExclusionStrats; /* array with one entry per column */
+ Oid *ii_UniqueOps; /* array with one entry per column */
+ Oid *ii_UniqueProcs; /* array with one entry per column */
+ uint16 *ii_UniqueStrats; /* array with one entry per column */
bool ii_Unique;
bool ii_ReadyForInserts;
bool ii_Concurrent;
@@ -967,6 +973,16 @@ typedef struct DomainConstraintState
ExprState *check_expr; /* for CHECK, a boolean expression */
} DomainConstraintState;
+/* ----------------
+ * ExcludedExprState node
+ * ----------------
+ */
+typedef struct ExcludedExprState
+{
+ ExprState xprstate;
+ ExprState *arg; /* the argument */
+} ExcludedExprState;
+
/* ----------------------------------------------------------------
* Executor State Trees
@@ -1088,6 +1104,9 @@ typedef struct ModifyTableState
int mt_whichplan; /* which one is being executed (0..n-1) */
ResultRelInfo *resultRelInfo; /* per-subplan target relations */
List **mt_arowmarks; /* per-subplan ExecAuxRowMark lists */
+ SpecCmd spec; /* reason for speculative insertion */
+ Oid arbiterIndex; /* unique index to arbitrate taking alt path */
+ PlanState *onConflict; /* associated OnConflict state */
EPQState mt_epqstate; /* for evaluating EvalPlanQual rechecks */
bool fireBSTriggers; /* do we need to fire stmt triggers? */
} ModifyTableState;
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index 97ef0fc..8d6fba4 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -168,6 +168,7 @@ typedef enum NodeTag
T_CoerceToDomainValue,
T_SetToDefault,
T_CurrentOfExpr,
+ T_ExcludedExpr,
T_TargetEntry,
T_RangeTblRef,
T_JoinExpr,
@@ -207,6 +208,7 @@ typedef enum NodeTag
T_NullTestState,
T_CoerceToDomainState,
T_DomainConstraintState,
+ T_ExcludedExprState,
/*
* TAGS FOR PLANNER NODES (relation.h)
@@ -412,6 +414,8 @@ typedef enum NodeTag
T_RowMarkClause,
T_XmlSerialize,
T_WithClause,
+ T_InferClause,
+ T_ConflictClause,
T_CommonTableExpr,
/*
@@ -624,4 +628,18 @@ typedef enum JoinType
(1 << JOIN_RIGHT) | \
(1 << JOIN_ANTI))) != 0)
+/*
+ * SpecCmd -
+ * "Speculative insertion" clause
+ *
+ * This is needed in both parsenodes.h and plannodes.h, so put it here...
+ */
+typedef enum
+{
+ SPEC_NONE, /* Not involved in speculative insertion */
+ SPEC_IGNORE, /* INSERT of "ON CONFLICT IGNORE" */
+ SPEC_INSERT, /* INSERT of "ON CONFLICT UPDATE" */
+ SPEC_UPDATE /* UPDATE of "ON CONFLICT UPDATE" */
+} SpecCmd;
+
#endif /* NODES_H */
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 86d1c07..c03c9ca 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -132,6 +132,11 @@ typedef struct Query
List *withCheckOptions; /* a list of WithCheckOption's */
+ SpecCmd specClause; /* speculative insertion clause */
+ List *arbiterExpr; /* Unique index arbiter exprs */
+ Node *arbiterWhere; /* Unique index arbiter WHERE clause */
+ Node *onConflict; /* ON CONFLICT Query */
+
List *returningList; /* return-values list (of TargetEntry) */
List *groupClause; /* a list of SortGroupClause's */
@@ -564,7 +569,7 @@ typedef enum TableLikeOption
} TableLikeOption;
/*
- * IndexElem - index parameters (used in CREATE INDEX)
+ * IndexElem - index parameters (used in CREATE INDEX, and in ON CONFLICT)
*
* For a plain index attribute, 'name' is the name of the table column to
* index, and 'expr' is NULL. For an index expression, 'name' is NULL and
@@ -999,6 +1004,36 @@ typedef struct WithClause
} WithClause;
/*
+ * InferClause -
+ * ON CONFLICT unique index inference clause
+ *
+ * Note: InferClause does not propagate into the Query representation.
+ */
+typedef struct InferClause
+{
+ NodeTag type;
+ List *indexElems; /* IndexElems to infer unique index */
+ Node *whereClause; /* qualification (partial-index predicate) */
+ int location; /* token location, or -1 if unknown */
+} InferClause;
+
+/*
+ * ConflictClause -
+ * representation of ON CONFLICT clause
+ *
+ * Note: ConflictClause does not propagate into the Query representation.
+ * However, Query may contain onConflict child Query.
+ */
+typedef struct ConflictClause
+{
+ NodeTag type;
+ SpecCmd specclause; /* Variant specified */
+ InferClause *infer; /* Optional index inference clause */
+ Node *updatequery; /* Update parse stmt */
+ int location; /* token location, or -1 if unknown */
+} ConflictClause;
+
+/*
* CommonTableExpr -
* representation of WITH list element
*
@@ -1048,6 +1083,7 @@ typedef struct InsertStmt
RangeVar *relation; /* relation to insert into */
List *cols; /* optional: names of the target columns */
Node *selectStmt; /* the source SELECT/VALUES, or NULL */
+ ConflictClause *confClause; /* ON CONFLICT clause */
List *returningList; /* list of expressions to return */
WithClause *withClause; /* WITH clause */
} InsertStmt;
@@ -1073,7 +1109,7 @@ typedef struct DeleteStmt
typedef struct UpdateStmt
{
NodeTag type;
- RangeVar *relation; /* relation to update */
+ RangeVar *relation; /* relation to update (NULL for speculative) */
List *targetList; /* the target list (of ResTarget) */
Node *whereClause; /* qualifications */
List *fromClause; /* optional from clause for more tables */
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index f6683f0..7366e2c 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -178,6 +178,9 @@ typedef struct ModifyTable
List *resultRelations; /* integer list of RT indexes */
int resultRelIndex; /* index of first resultRel in plan's list */
List *plans; /* plan(s) producing source data */
+ SpecCmd spec; /* speculative insertion specification */
+ Oid arbiterIndex; /* Oid of ON CONFLICT arbiter index */
+ Plan *onConflictPlan; /* Plan for ON CONFLICT UPDATE auxiliary query */
List *withCheckOptionLists; /* per-target-table WCO lists */
List *returningLists; /* per-target-table RETURNING tlists */
List *fdwPrivLists; /* per-target-table FDW private data lists */
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1d06f42..21c39dc 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -1147,6 +1147,53 @@ typedef struct CurrentOfExpr
int cursor_param; /* refcursor parameter number, or 0 */
} CurrentOfExpr;
+/*
+ * ExcludedExpr - an EXCLUDED.* expression
+ *
+ * During parse analysis of ON CONFLICT UPDATE auxiliary queries, a dummy
+ * EXCLUDED range table entry is generated, which is actually just an alias for
+ * the target relation. This is useful during parse analysis, allowing the
+ * parser to produce simple error messages, for example. There is the
+ * appearance of a join within the auxiliary ON CONFLICT UPDATE, superficially
+ * similar to a join in an UPDATE ... FROM; this is a limited, ad-hoc join
+ * though, as the executor needs to tightly control the referenced tuple/slot
+ * through which update evaluation references excluded values originally
+ * proposed for insertion. Note that EXCLUDED.* values carry forward the
+ * effects of BEFORE ROW INSERT triggers.
+ *
+ * To implement a limited "join" for ON CONFLICT UPDATE auxiliary queries,
+ * during the rewrite stage, Vars referencing the alias EXCLUDED.* RTE are
+ * swapped with ExcludedExprs, which also contain Vars; their Vars are
+ * equivalent, but reference the target instead. The ExcludedExpr Var actually
+ * evaluates against varno INNER_VAR during expression evaluation (and not a
+ * varno INDEX_VAR associated with an entry in the flattened range table
+ * representing the target, which is necessarily being scanned whenever an
+ * ExcludedExpr is evaluated) while still being logically associated with the
+ * target. The Var is only rigged to reference the inner slot during
+ * ExcludedExpr initialization. The executor closely controls expression
+ * evaluation, installing the slot holding the tuple actually excluded from
+ * insertion as the inner slot of the child/auxiliary evaluation context in an
+ * ad-hoc fashion. After ExcludedExpr initialization, evaluation expects the
+ * parent insert to make each excluded tuple available in the inner slot in
+ * turn. ExcludedExprs are only ever evaluated during special speculative
+ * insertion related EPQ expression evaluation, purely for the benefit of
+ * auxiliary UPDATE expressions.
+ *
+ * Aside from representing a logical choke point for this special expression
+ * evaluation, having a dedicated primnode also prevents the optimizer from
+ * considering various optimizations that might otherwise be attempted.
+ * Obviously there is no useful join optimization possible within the auxiliary
+ * query, and an ExcludedExpr based post-rewrite query tree representation is a
+ * convenient way of preventing that, as well as related inapplicable
+ * optimizations concerning the equivalence of Vars.
+ */
+typedef struct ExcludedExpr
+{
+ Expr xpr;
+ Node *arg; /* argument (Var) */
+} ExcludedExpr;
+
/*--------------------
* TargetEntry -
* a target entry (used in query target lists)
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 6cad92e..801effe 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -64,6 +64,7 @@ extern Expr *adjust_rowcompare_for_index(RowCompareExpr *clause,
int indexcol,
List **indexcolnos,
bool *var_on_left_p);
+extern Oid plan_speculative_use_index(PlannerInfo *root, List *indexList);
/*
* tidpath.h
diff --git a/src/include/optimizer/plancat.h b/src/include/optimizer/plancat.h
index 8eb2e57..878adfe 100644
--- a/src/include/optimizer/plancat.h
+++ b/src/include/optimizer/plancat.h
@@ -28,6 +28,8 @@ extern PGDLLIMPORT get_relation_info_hook_type get_relation_info_hook;
extern void get_relation_info(PlannerInfo *root, Oid relationObjectId,
bool inhparent, RelOptInfo *rel);
+extern Oid infer_unique_index(PlannerInfo *root);
+
extern void estimate_rel_size(Relation rel, int32 *attr_widths,
BlockNumber *pages, double *tuples, double *allvisfrac);
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fa72918..81a9058 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -85,7 +85,8 @@ extern ModifyTable *make_modifytable(PlannerInfo *root,
Index nominalRelation,
List *resultRelations, List *subplans,
List *withCheckOptionLists, List *returningLists,
- List *rowMarks, int epqParam);
+ List *rowMarks, Plan *onConflictPlan, SpecCmd spec,
+ int epqParam);
extern bool is_projection_capable_plan(Plan *plan);
/*
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index 7c243ec..cf501e6 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -87,6 +87,7 @@ PG_KEYWORD("commit", COMMIT, UNRESERVED_KEYWORD)
PG_KEYWORD("committed", COMMITTED, UNRESERVED_KEYWORD)
PG_KEYWORD("concurrently", CONCURRENTLY, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("configuration", CONFIGURATION, UNRESERVED_KEYWORD)
+PG_KEYWORD("conflict", CONFLICT, UNRESERVED_KEYWORD)
PG_KEYWORD("connection", CONNECTION, UNRESERVED_KEYWORD)
PG_KEYWORD("constraint", CONSTRAINT, RESERVED_KEYWORD)
PG_KEYWORD("constraints", CONSTRAINTS, UNRESERVED_KEYWORD)
@@ -180,6 +181,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD)
PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD)
PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD)
PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD)
PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD)
PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD)
PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD)
diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 6a4438f..d1d0d12 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -41,6 +41,8 @@ extern List *transformDistinctClause(ParseState *pstate,
List **targetlist, List *sortClause, bool is_agg);
extern List *transformDistinctOnClause(ParseState *pstate, List *distinctlist,
List **targetlist, List *sortClause);
+extern void transformConflictClause(ParseState *pstate, ConflictClause *confClause,
+ List **arbiterExpr, Node **arbiterWhere);
extern List *addTargetToSortList(ParseState *pstate, TargetEntry *tle,
List *sortlist, List *targetlist, SortBy *sortby,
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 3103b71..2b5804e 100644
--- a/src/include/parser/parse_node.h
+++ b/src/include/parser/parse_node.h
@@ -153,6 +153,7 @@ struct ParseState
bool p_hasModifyingCTE;
bool p_is_insert;
bool p_is_update;
+ bool p_is_speculative;
bool p_locked_from_parent;
Relation p_target_relation;
RangeTblEntry *p_target_rangetblentry;
diff --git a/src/include/storage/lmgr.h b/src/include/storage/lmgr.h
index f5d70e5..6bb95fc 100644
--- a/src/include/storage/lmgr.h
+++ b/src/include/storage/lmgr.h
@@ -76,6 +76,11 @@ extern bool ConditionalXactLockTableWait(TransactionId xid);
extern void WaitForLockers(LOCKTAG heaplocktag, LOCKMODE lockmode);
extern void WaitForLockersMultiple(List *locktags, LOCKMODE lockmode);
+/* Lock an XID for tuple insertion (used to wait for an insertion to finish) */
+extern void SpeculativeInsertionLockAcquire(TransactionId xid);
+extern void SpeculativeInsertionLockRelease(TransactionId xid);
+extern void SpeculativeInsertionWait(TransactionId xid, uint32 token);
+
/* Lock a general object (other than a relation) of the current database */
extern void LockDatabaseObject(Oid classid, Oid objid, uint16 objsubid,
LOCKMODE lockmode);
diff --git a/src/include/storage/lock.h b/src/include/storage/lock.h
index 1100923..9c21810 100644
--- a/src/include/storage/lock.h
+++ b/src/include/storage/lock.h
@@ -176,6 +176,8 @@ typedef enum LockTagType
/* ID info for a transaction is its TransactionId */
LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
/* ID info for a virtual transaction is its VirtualTransactionId */
+ LOCKTAG_PROMISE_TUPLE_INSERTION, /* tuple insertion, keyed by Xid */
+ /* ID info for a speculative insertion is TransactionId + token */
LOCKTAG_OBJECT, /* non-relation database object */
/* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
@@ -261,6 +263,14 @@ typedef struct LOCKTAG
(locktag).locktag_type = LOCKTAG_VIRTUALTRANSACTION, \
(locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+#define SET_LOCKTAG_SPECULATIVE_INSERTION(locktag,xid,token) \
+ ((locktag).locktag_field1 = (xid), \
+ (locktag).locktag_field2 = (token), \
+ (locktag).locktag_field3 = 0, \
+ (locktag).locktag_field4 = 0, \
+ (locktag).locktag_type = LOCKTAG_PROMISE_TUPLE_INSERTION, \
+ (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
+
#define SET_LOCKTAG_OBJECT(locktag,dboid,classoid,objoid,objsubid) \
((locktag).locktag_field1 = (dboid), \
(locktag).locktag_field2 = (classoid), \
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index e807a2e..c72f55b 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -16,9 +16,11 @@
#include "access/xlogdefs.h"
#include "lib/ilist.h"
+#include "storage/itemptr.h"
#include "storage/latch.h"
#include "storage/lock.h"
#include "storage/pg_sema.h"
+#include "storage/relfilenode.h"
/*
* Each backend advertises up to PGPROC_MAX_CACHED_SUBXIDS TransactionIds
@@ -132,6 +134,17 @@ struct PGPROC
*/
SHM_QUEUE myProcLocks[NUM_LOCK_PARTITIONS];
+ /*
+ * Info to allow us to perform speculative insertion without "unprincipled
+ * deadlocks". This state allows others to wait on the outcome of an
+ * optimistically inserted speculative tuple for only the duration of the
+ * insertion (not to the end of our xact) iff the insertion does not work
+ * out (due to our detecting a conflict).
+ */
+ RelFileNode specInsertRel; /* Relfilenode speculatively inserted into */
+ ItemPointerData specInsertTid; /* TID within specInsertRel */
+ uint32 specInsertToken; /* Final disambiguator of insertions */
+
struct XidCache subxids; /* cache for subtransaction XIDs */
/* Per-backend LWLock. Protects fields below. */
diff --git a/src/include/storage/procarray.h b/src/include/storage/procarray.h
index 97c6e93..ea2bba9 100644
--- a/src/include/storage/procarray.h
+++ b/src/include/storage/procarray.h
@@ -55,6 +55,13 @@ extern TransactionId GetOldestXmin(Relation rel, bool ignoreVacuum);
extern TransactionId GetOldestActiveTransactionId(void);
extern TransactionId GetOldestSafeDecodingTransactionId(void);
+extern void SetSpeculativeInsertionToken(uint32 token);
+extern void SetSpeculativeInsertionTid(RelFileNode relnode, ItemPointer tid);
+extern void ClearSpeculativeInsertionState(void);
+extern uint32 SpeculativeInsertionIsInProgress(TransactionId xid,
+ RelFileNode rel,
+ ItemPointer tid);
+
extern VirtualTransactionId *GetVirtualXIDsDelayingChkpt(int *nvxids);
extern bool HaveVirtualXIDsDelayingChkpt(VirtualTransactionId *vxids, int nvxids);
diff --git a/src/include/utils/snapshot.h b/src/include/utils/snapshot.h
index 26fb257..cd5ad76 100644
--- a/src/include/utils/snapshot.h
+++ b/src/include/utils/snapshot.h
@@ -87,6 +87,17 @@ typedef struct SnapshotData
bool copied; /* false if it's a static snapshot */
/*
+ * Snapshot's speculative token is a value set by HeapTupleSatisfiesDirty,
+ * indicating that the tuple is being inserted speculatively, and may yet
+ * be "super-deleted" before EOX. The caller may use the value with
+ * SpeculativeInsertionWait to wait for the inserter to decide. It is only
+ * set when a valid 'xmin' is set, too. By convention, when
+ * speculativeToken is zero, the caller must assume that it should wait on
+ * a non-speculative tuple (i.e. wait for xmin/xmax to commit).
+ */
+ uint32 speculativeToken;
+
+ /*
* note: all ids in subxip[] are >= xmin, but we don't bother filtering
* out any that are >= xmax
*/
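
The convention described in the speculativeToken comment can be sketched as a small decision function. This is an illustration under stated assumptions, not code from the patch: ToyDirtySnapshot, ToyWaitKind and toy_choose_wait() are hypothetical, and the real caller would go on to call SpeculativeInsertionWait() or wait on the transaction itself.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define InvalidTransactionId		((TransactionId) 0)
#define TransactionIdIsValid(xid)	((xid) != InvalidTransactionId)

/* Toy subset of the fields a dirty snapshot hands back to its caller. */
typedef struct ToyDirtySnapshot
{
	TransactionId xmin;			/* in-progress inserter, if any */
	TransactionId xmax;			/* in-progress deleter/updater, if any */
	uint32_t	speculativeToken;	/* nonzero only for speculative inserts */
} ToyDirtySnapshot;

typedef enum
{
	TOY_WAIT_NONE,				/* nothing in progress, no wait needed */
	TOY_WAIT_XACT,				/* wait for the transaction to end */
	TOY_WAIT_SPECULATIVE		/* wait only for the insertion to be decided */
} ToyWaitKind;

/*
 * Hypothetical helper: pick the kind of wait.  A nonzero token is only
 * meaningful together with a valid xmin; a zero token means the caller
 * falls back to waiting on xmin/xmax in the usual way.
 */
static ToyWaitKind
toy_choose_wait(const ToyDirtySnapshot *snap)
{
	if (TransactionIdIsValid(snap->xmin) && snap->speculativeToken != 0)
		return TOY_WAIT_SPECULATIVE;
	if (TransactionIdIsValid(snap->xmin) || TransactionIdIsValid(snap->xmax))
		return TOY_WAIT_XACT;
	return TOY_WAIT_NONE;
}
```

The design point this tries to show is that the token lets a waiter block only until the speculative insertion is confirmed or super-deleted, instead of until the end of the inserter's transaction.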
--
1.9.1
Attachment: 0001-Make-UPDATE-privileges-distinct-from-INSERT-privileg.patch (text/x-patch; charset=US-ASCII)
From 5f17d4d33b20e8ecebb844680658930557f5c886 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <pg@heroku.com>
Date: Tue, 26 Aug 2014 21:28:40 -0700
Subject: [PATCH 1/6] Make UPDATE privileges distinct from INSERT privileges in
RTEs
Previously, relation range table entries used a single Bitmapset field
representing which columns required either UPDATE or INSERT privileges,
despite the fact that INSERT and UPDATE privileges are separately
cataloged, and may be independently held. This worked because
ExecCheckRTEPerms() was called with an ACL_INSERT or ACL_UPDATE
requiredPerms, and based on that it was evident which type of
optimizable statement was under consideration. Since historically no
type of optimizable statement could directly INSERT and UPDATE at the
same time, there was no ambiguity as to which privileges were required.
This largely mechanical commit is required infrastructure for the
INSERT...ON CONFLICT UPDATE feature, which introduces an optimizable
statement that may be subject to both INSERT and UPDATE permissions
enforcement. Tests follow in a later commit.
sepgsql is also affected by this commit. Note that this commit
necessitates an initdb, since stored ACLs are broken.
---
contrib/sepgsql/dml.c | 31 ++++++---
src/backend/commands/copy.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/trigger.c | 22 +++---
src/backend/executor/execMain.c | 110 +++++++++++++++++++-----------
src/backend/nodes/copyfuncs.c | 3 +-
src/backend/nodes/equalfuncs.c | 3 +-
src/backend/nodes/outfuncs.c | 3 +-
src/backend/nodes/readfuncs.c | 3 +-
src/backend/optimizer/plan/setrefs.c | 6 +-
src/backend/optimizer/prep/prepsecurity.c | 6 +-
src/backend/optimizer/prep/prepunion.c | 8 ++-
src/backend/parser/analyze.c | 4 +-
src/backend/parser/parse_relation.c | 21 ++++--
src/backend/rewrite/rewriteHandler.c | 52 ++++++++------
src/include/nodes/parsenodes.h | 14 ++--
16 files changed, 176 insertions(+), 114 deletions(-)
diff --git a/contrib/sepgsql/dml.c b/contrib/sepgsql/dml.c
index 36c6a37..4a71753 100644
--- a/contrib/sepgsql/dml.c
+++ b/contrib/sepgsql/dml.c
@@ -145,7 +145,8 @@ fixup_inherited_columns(Oid parentId, Oid childId, Bitmapset *columns)
static bool
check_relation_privileges(Oid relOid,
Bitmapset *selected,
- Bitmapset *modified,
+ Bitmapset *inserted,
+ Bitmapset *updated,
uint32 required,
bool abort_on_violation)
{
@@ -231,8 +232,9 @@ check_relation_privileges(Oid relOid,
* Check permissions on the columns
*/
selected = fixup_whole_row_references(relOid, selected);
- modified = fixup_whole_row_references(relOid, modified);
- columns = bms_union(selected, modified);
+ inserted = fixup_whole_row_references(relOid, inserted);
+ updated = fixup_whole_row_references(relOid, updated);
+ columns = bms_union(selected, bms_union(inserted, updated));
while ((index = bms_first_member(columns)) >= 0)
{
@@ -241,13 +243,16 @@ check_relation_privileges(Oid relOid,
if (bms_is_member(index, selected))
column_perms |= SEPG_DB_COLUMN__SELECT;
- if (bms_is_member(index, modified))
+ if (bms_is_member(index, inserted))
{
- if (required & SEPG_DB_TABLE__UPDATE)
- column_perms |= SEPG_DB_COLUMN__UPDATE;
if (required & SEPG_DB_TABLE__INSERT)
column_perms |= SEPG_DB_COLUMN__INSERT;
}
+ if (bms_is_member(index, updated))
+ {
+ if (required & SEPG_DB_TABLE__UPDATE)
+ column_perms |= SEPG_DB_COLUMN__UPDATE;
+ }
if (column_perms == 0)
continue;
@@ -304,7 +309,7 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
required |= SEPG_DB_TABLE__INSERT;
if (rte->requiredPerms & ACL_UPDATE)
{
- if (!bms_is_empty(rte->modifiedCols))
+ if (!bms_is_empty(rte->updatedCols))
required |= SEPG_DB_TABLE__UPDATE;
else
required |= SEPG_DB_TABLE__LOCK;
@@ -333,7 +338,8 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
{
Oid tableOid = lfirst_oid(li);
Bitmapset *selectedCols;
- Bitmapset *modifiedCols;
+ Bitmapset *insertedCols;
+ Bitmapset *updatedCols;
/*
* child table has different attribute numbers, so we need to fix
@@ -341,15 +347,18 @@ sepgsql_dml_privileges(List *rangeTabls, bool abort_on_violation)
*/
selectedCols = fixup_inherited_columns(rte->relid, tableOid,
rte->selectedCols);
- modifiedCols = fixup_inherited_columns(rte->relid, tableOid,
- rte->modifiedCols);
+ insertedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->insertedCols);
+ updatedCols = fixup_inherited_columns(rte->relid, tableOid,
+ rte->updatedCols);
/*
* check permissions on individual tables
*/
if (!check_relation_privileges(tableOid,
selectedCols,
- modifiedCols,
+ insertedCols,
+ updatedCols,
required, abort_on_violation))
return false;
}
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 92ff632..d2996fb 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -847,7 +847,7 @@ DoCopy(const CopyStmt *stmt, const char *queryString, uint64 *processed)
FirstLowInvalidHeapAttributeNumber;
if (is_from)
- rte->modifiedCols = bms_add_member(rte->modifiedCols, attno);
+ rte->insertedCols = bms_add_member(rte->insertedCols, attno);
else
rte->selectedCols = bms_add_member(rte->selectedCols, attno);
}
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index c961429..bf2235d 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -433,7 +433,7 @@ intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
rte->requiredPerms = ACL_INSERT;
for (attnum = 1; attnum <= intoRelationDesc->rd_att->natts; attnum++)
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attnum - FirstLowInvalidHeapAttributeNumber);
ExecCheckRTPerms(list_make1(rte), true);
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 5c1c1be..7defe80 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -71,8 +71,8 @@ static int MyTriggerDepth = 0;
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
/* Local function prototypes */
static void ConvertTriggerToFK(CreateTrigStmt *stmt, Oid funcoid);
@@ -2343,7 +2343,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TriggerDesc *trigdesc;
int i;
TriggerData LocTriggerData;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
trigdesc = relinfo->ri_TrigDesc;
@@ -2352,7 +2352,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (!trigdesc->trig_update_before_statement)
return;
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
LocTriggerData.type = T_TriggerData;
LocTriggerData.tg_event = TRIGGER_EVENT_UPDATE |
@@ -2373,7 +2373,7 @@ ExecBSUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, NULL, NULL))
+ updatedCols, NULL, NULL))
continue;
LocTriggerData.tg_trigger = trigger;
@@ -2398,7 +2398,7 @@ ExecASUpdateTriggers(EState *estate, ResultRelInfo *relinfo)
if (trigdesc && trigdesc->trig_update_after_statement)
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
false, NULL, NULL, NIL,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
}
TupleTableSlot *
@@ -2416,7 +2416,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
HeapTuple oldtuple;
TupleTableSlot *newSlot;
int i;
- Bitmapset *modifiedCols;
+ Bitmapset *updatedCols;
Bitmapset *keyCols;
LockTupleMode lockmode;
@@ -2425,10 +2425,10 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
* been modified, then we can use a weaker lock, allowing for better
* concurrency.
*/
- modifiedCols = GetModifiedColumns(relinfo, estate);
+ updatedCols = GetUpdatedColumns(relinfo, estate);
keyCols = RelationGetIndexAttrBitmap(relinfo->ri_RelationDesc,
INDEX_ATTR_BITMAP_KEY);
- if (bms_overlap(keyCols, modifiedCols))
+ if (bms_overlap(keyCols, updatedCols))
lockmode = LockTupleExclusive;
else
lockmode = LockTupleNoKeyExclusive;
@@ -2482,7 +2482,7 @@ ExecBRUpdateTriggers(EState *estate, EPQState *epqstate,
TRIGGER_TYPE_UPDATE))
continue;
if (!TriggerEnabled(estate, relinfo, trigger, LocTriggerData.tg_event,
- modifiedCols, trigtuple, newtuple))
+ updatedCols, trigtuple, newtuple))
continue;
LocTriggerData.tg_trigtuple = trigtuple;
@@ -2552,7 +2552,7 @@ ExecARUpdateTriggers(EState *estate, ResultRelInfo *relinfo,
AfterTriggerSaveEvent(estate, relinfo, TRIGGER_EVENT_UPDATE,
true, trigtuple, newtuple, recheckIndexes,
- GetModifiedColumns(relinfo, estate));
+ GetUpdatedColumns(relinfo, estate));
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
}
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 33b172b..dbcebb7 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -82,6 +82,9 @@ static void ExecutePlan(EState *estate, PlanState *planstate,
ScanDirection direction,
DestReceiver *dest);
static bool ExecCheckRTEPerms(RangeTblEntry *rte);
+static bool ExecCheckRTEPermsModified(Oid relOid, Oid userid,
+ Bitmapset *modifiedCols,
+ AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static char *ExecBuildSlotValueDescription(Oid reloid,
TupleTableSlot *slot,
@@ -97,8 +100,10 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
* it uses, so we let them be duplicated. Be sure to update both if one needs
* to be changed, however.
*/
-#define GetModifiedColumns(relinfo, estate) \
- (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->modifiedCols)
+#define GetUpdatedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->updatedCols)
+#define GetInsertedColumns(relinfo, estate) \
+ (rt_fetch((relinfo)->ri_RangeTableIndex, (estate)->es_range_table)->insertedCols)
/* end of local decls */
@@ -559,7 +564,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
AclMode remainingPerms;
Oid relOid;
Oid userid;
- int col;
/*
* Only plain-relation RTEs need to be checked here. Function RTEs are
@@ -597,6 +601,8 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
remainingPerms = requiredPerms & ~relPerms;
if (remainingPerms != 0)
{
+ int col = -1;
+
/*
* If we lack any permissions that exist only as relation permissions,
* we can fail straight away.
@@ -625,7 +631,6 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
return false;
}
- col = -1;
while ((col = bms_next_member(rte->selectedCols, col)) >= 0)
{
/* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
@@ -648,43 +653,63 @@ ExecCheckRTEPerms(RangeTblEntry *rte)
}
/*
- * Basically the same for the mod columns, with either INSERT or
- * UPDATE privilege as specified by remainingPerms.
+ * Basically the same for the mod columns, for both INSERT and UPDATE
+ * privilege as specified by remainingPerms.
*/
- remainingPerms &= ~ACL_SELECT;
- if (remainingPerms != 0)
- {
- /*
- * When the query doesn't explicitly change any columns, allow the
- * query if we have permission on any column of the rel. This is
- * to handle SELECT FOR UPDATE as well as possible corner cases in
- * INSERT and UPDATE.
- */
- if (bms_is_empty(rte->modifiedCols))
- {
- if (pg_attribute_aclcheck_all(relOid, userid, remainingPerms,
- ACLMASK_ANY) != ACLCHECK_OK)
- return false;
- }
+ if (remainingPerms & ACL_INSERT && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->insertedCols,
+ ACL_INSERT))
+ return false;
- col = -1;
- while ((col = bms_next_member(rte->modifiedCols, col)) >= 0)
- {
- /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
- AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+ if (remainingPerms & ACL_UPDATE && !ExecCheckRTEPermsModified(relOid,
+ userid,
+ rte->updatedCols,
+ ACL_UPDATE))
+ return false;
+ }
+ return true;
+}
- if (attno == InvalidAttrNumber)
- {
- /* whole-row reference can't happen here */
- elog(ERROR, "whole-row update is not implemented");
- }
- else
- {
- if (pg_attribute_aclcheck(relOid, attno, userid,
- remainingPerms) != ACLCHECK_OK)
- return false;
- }
- }
+/*
+ * ExecCheckRTEPermsModified
+ * Check INSERT or UPDATE access permissions for a single RTE (these
+ * are processed uniformly).
+ */
+static bool
+ExecCheckRTEPermsModified(Oid relOid, Oid userid, Bitmapset *modifiedCols,
+ AclMode requiredPerms)
+{
+ int col = -1;
+
+ /*
+ * When the query doesn't explicitly update any columns, allow the
+ * query if we have permission on any column of the rel. This is
+ * to handle SELECT FOR UPDATE as well as possible corner cases in
+ * UPDATE.
+ */
+ if (bms_is_empty(modifiedCols))
+ {
+ if (pg_attribute_aclcheck_all(relOid, userid, requiredPerms,
+ ACLMASK_ANY) != ACLCHECK_OK)
+ return false;
+ }
+
+ while ((col = bms_next_member(modifiedCols, col)) >= 0)
+ {
+ /* bit #s are offset by FirstLowInvalidHeapAttributeNumber */
+ AttrNumber attno = col + FirstLowInvalidHeapAttributeNumber;
+
+ if (attno == InvalidAttrNumber)
+ {
+ /* whole-row reference can't happen here */
+ elog(ERROR, "whole-row update is not implemented");
+ }
+ else
+ {
+ if (pg_attribute_aclcheck(relOid, attno, userid,
+ requiredPerms) != ACLCHECK_OK)
+ return false;
}
}
return true;
@@ -1623,7 +1648,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1649,7 +1675,8 @@ ExecConstraints(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
@@ -1708,7 +1735,8 @@ ExecWithCheckOptions(ResultRelInfo *resultRelInfo,
char *val_desc;
Bitmapset *modifiedCols;
- modifiedCols = GetModifiedColumns(resultRelInfo, estate);
+ modifiedCols = GetUpdatedColumns(resultRelInfo, estate);
+ modifiedCols = bms_union(modifiedCols, GetInsertedColumns(resultRelInfo, estate));
val_desc = ExecBuildSlotValueDescription(RelationGetRelid(rel),
slot,
tupdesc,
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index e5b0dce..6d7a877 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -2029,7 +2029,8 @@ _copyRangeTblEntry(const RangeTblEntry *from)
COPY_SCALAR_FIELD(requiredPerms);
COPY_SCALAR_FIELD(checkAsUser);
COPY_BITMAPSET_FIELD(selectedCols);
- COPY_BITMAPSET_FIELD(modifiedCols);
+ COPY_BITMAPSET_FIELD(insertedCols);
+ COPY_BITMAPSET_FIELD(updatedCols);
COPY_NODE_FIELD(securityQuals);
return newnode;
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 6e8b308..79035b2 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -2345,7 +2345,8 @@ _equalRangeTblEntry(const RangeTblEntry *a, const RangeTblEntry *b)
COMPARE_SCALAR_FIELD(requiredPerms);
COMPARE_SCALAR_FIELD(checkAsUser);
COMPARE_BITMAPSET_FIELD(selectedCols);
- COMPARE_BITMAPSET_FIELD(modifiedCols);
+ COMPARE_BITMAPSET_FIELD(insertedCols);
+ COMPARE_BITMAPSET_FIELD(updatedCols);
COMPARE_NODE_FIELD(securityQuals);
return true;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 8486448..a02ba70 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2457,7 +2457,8 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
WRITE_UINT_FIELD(requiredPerms);
WRITE_OID_FIELD(checkAsUser);
WRITE_BITMAPSET_FIELD(selectedCols);
- WRITE_BITMAPSET_FIELD(modifiedCols);
+ WRITE_BITMAPSET_FIELD(insertedCols);
+ WRITE_BITMAPSET_FIELD(updatedCols);
WRITE_NODE_FIELD(securityQuals);
}
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index ae24d05..dbc162a 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1253,7 +1253,8 @@ _readRangeTblEntry(void)
READ_UINT_FIELD(requiredPerms);
READ_OID_FIELD(checkAsUser);
READ_BITMAPSET_FIELD(selectedCols);
- READ_BITMAPSET_FIELD(modifiedCols);
+ READ_BITMAPSET_FIELD(insertedCols);
+ READ_BITMAPSET_FIELD(updatedCols);
READ_NODE_FIELD(securityQuals);
READ_DONE();
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 57195e5..860855e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -368,9 +368,9 @@ flatten_rtes_walker(Node *node, PlannerGlobal *glob)
*
* In the flat rangetable, we zero out substructure pointers that are not
* needed by the executor; this reduces the storage space and copying cost
- * for cached plans. We keep only the alias and eref Alias fields, which
- * are needed by EXPLAIN, and the selectedCols and modifiedCols bitmaps,
- * which are needed for executor-startup permissions checking and for
+ * for cached plans. We keep only the alias and eref Alias fields, which are
+ * needed by EXPLAIN, and the selectedCols, insertedCols and updatedCols
+ * bitmaps, which are needed for executor-startup permissions checking and for
* trigger event checking.
*/
static void
diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c
index af3ee61..f86e792 100644
--- a/src/backend/optimizer/prep/prepsecurity.c
+++ b/src/backend/optimizer/prep/prepsecurity.c
@@ -115,7 +115,8 @@ expand_security_quals(PlannerInfo *root, List *tlist)
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the original relation
@@ -213,7 +214,8 @@ expand_security_qual(PlannerInfo *root, List *tlist, int rt_index,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Now deal with any PlanRowMark on this RTE by requesting a lock
diff --git a/src/backend/optimizer/prep/prepunion.c b/src/backend/optimizer/prep/prepunion.c
index 05f601e..1e28363 100644
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -1367,14 +1367,16 @@ expand_inherited_rtentry(PlannerInfo *root, RangeTblEntry *rte, Index rti)
* if this is the parent table, leave copyObject's result alone.
*
* Note: we need to do this even though the executor won't run any
- * permissions checks on the child RTE. The modifiedCols bitmap may
- * be examined for trigger-firing purposes.
+ * permissions checks on the child RTE. The insertedCols/updatedCols
+ * bitmaps may be examined for trigger-firing purposes.
*/
if (childOID != parentOID)
{
childrte->selectedCols = translate_col_privs(rte->selectedCols,
appinfo->translated_vars);
- childrte->modifiedCols = translate_col_privs(rte->modifiedCols,
+ childrte->insertedCols = translate_col_privs(rte->insertedCols,
+ appinfo->translated_vars);
+ childrte->updatedCols = translate_col_privs(rte->updatedCols,
appinfo->translated_vars);
}
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index a68f2e8..df89065 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -733,7 +733,7 @@ transformInsertStmt(ParseState *pstate, InsertStmt *stmt)
false);
qry->targetList = lappend(qry->targetList, tle);
- rte->modifiedCols = bms_add_member(rte->modifiedCols,
+ rte->insertedCols = bms_add_member(rte->insertedCols,
attr_num - FirstLowInvalidHeapAttributeNumber);
icols = lnext(icols);
@@ -2002,7 +2002,7 @@ transformUpdateStmt(ParseState *pstate, UpdateStmt *stmt)
origTarget->location);
/* Mark the target column as requiring update permissions */
- target_rte->modifiedCols = bms_add_member(target_rte->modifiedCols,
+ target_rte->updatedCols = bms_add_member(target_rte->updatedCols,
attrno - FirstLowInvalidHeapAttributeNumber);
origTargetList = lnext(origTargetList);
diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c
index 8d4f79f..d2820d8 100644
--- a/src/backend/parser/parse_relation.c
+++ b/src/backend/parser/parse_relation.c
@@ -1052,7 +1052,8 @@ addRangeTableEntry(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1105,7 +1106,8 @@ addRangeTableEntryForRelation(ParseState *pstate,
rte->requiredPerms = ACL_SELECT;
rte->checkAsUser = InvalidOid; /* not set-uid by default, either */
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1183,7 +1185,8 @@ addRangeTableEntryForSubquery(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1437,7 +1440,8 @@ addRangeTableEntryForFunction(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1509,7 +1513,8 @@ addRangeTableEntryForValues(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1577,7 +1582,8 @@ addRangeTableEntryForJoin(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
@@ -1677,7 +1683,8 @@ addRangeTableEntryForCTE(ParseState *pstate,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* Add completed RTE to pstate's range table list, but not to join list
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 9d2c280..9894146 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -1403,7 +1403,8 @@ ApplyRetrieveRule(Query *parsetree,
rte->requiredPerms = 0;
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* For the most part, Vars referencing the view should remain as
@@ -1466,12 +1467,14 @@ ApplyRetrieveRule(Query *parsetree,
subrte->requiredPerms = rte->requiredPerms;
subrte->checkAsUser = rte->checkAsUser;
subrte->selectedCols = rte->selectedCols;
- subrte->modifiedCols = rte->modifiedCols;
+ subrte->insertedCols = rte->insertedCols;
+ subrte->updatedCols = rte->updatedCols;
rte->requiredPerms = 0; /* no permission check on subquery itself */
rte->checkAsUser = InvalidOid;
rte->selectedCols = NULL;
- rte->modifiedCols = NULL;
+ rte->insertedCols = NULL;
+ rte->updatedCols = NULL;
/*
* If FOR [KEY] UPDATE/SHARE of view, mark all the contained tables as
@@ -2584,9 +2587,9 @@ rewriteTargetView(Query *parsetree, Relation view)
/*
* For INSERT/UPDATE the modified columns must all be updatable. Note that
* we get the modified columns from the query's targetlist, not from the
- * result RTE's modifiedCols set, since rewriteTargetListIU may have added
- * additional targetlist entries for view defaults, and these must also be
- * updatable.
+ * result RTE's insertedCols and/or updatedCols set, since
+ * rewriteTargetListIU may have added additional targetlist entries for
+ * view defaults, and these must also be updatable.
*/
if (parsetree->commandType != CMD_DELETE)
{
@@ -2723,26 +2726,31 @@ rewriteTargetView(Query *parsetree, Relation view)
*
* Initially, new_rte contains selectedCols permission check bits for all
* base-rel columns referenced by the view, but since the view is a SELECT
- * query its modifiedCols is empty. We set modifiedCols to include all
- * the columns the outer query is trying to modify, adjusting the column
- * numbers as needed. But we leave selectedCols as-is, so the view owner
- * must have read permission for all columns used in the view definition,
- * even if some of them are not read by the outer query. We could try to
- * limit selectedCols to only columns used in the transformed query, but
- * that does not correspond to what happens in ordinary SELECT usage of a
- * view: all referenced columns must have read permission, even if
- * optimization finds that some of them can be discarded during query
- * transformation. The flattening we're doing here is an optional
- * optimization, too. (If you are unpersuaded and want to change this,
- * note that applying adjust_view_column_set to view_rte->selectedCols is
- * clearly *not* the right answer, since that neglects base-rel columns
- * used in the view's WHERE quals.)
+ * query its insertedCols/updatedCols is empty. We set insertedCols and
+ * updatedCols to include all the columns the outer query is trying to
+ * modify, adjusting the column numbers as needed. But we leave
+ * selectedCols as-is, so the view owner must have read permission for all
+ * columns used in the view definition, even if some of them are not read
+ * by the outer query. We could try to limit selectedCols to only columns
+ * used in the transformed query, but that does not correspond to what
+ * happens in ordinary SELECT usage of a view: all referenced columns must
+ * have read permission, even if optimization finds that some of them can
+ * be discarded during query transformation. The flattening we're doing
+ * here is an optional optimization, too. (If you are unpersuaded and want
+ * to change this, note that applying adjust_view_column_set to
+ * view_rte->selectedCols is clearly *not* the right answer, since that
+ * neglects base-rel columns used in the view's WHERE quals.)
*
* This step needs the modified view targetlist, so we have to do things
* in this order.
*/
- Assert(bms_is_empty(new_rte->modifiedCols));
- new_rte->modifiedCols = adjust_view_column_set(view_rte->modifiedCols,
+ Assert(bms_is_empty(new_rte->insertedCols) &&
+ bms_is_empty(new_rte->updatedCols));
+
+ new_rte->insertedCols = adjust_view_column_set(view_rte->insertedCols,
+ view_targetlist);
+
+ new_rte->updatedCols = adjust_view_column_set(view_rte->updatedCols,
view_targetlist);
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b1dfa85..86d1c07 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -717,11 +717,12 @@ typedef struct XmlSerialize
* For SELECT/INSERT/UPDATE permissions, if the user doesn't have
* table-wide permissions then it is sufficient to have the permissions
* on all columns identified in selectedCols (for SELECT) and/or
- * modifiedCols (for INSERT/UPDATE; we can tell which from the query type).
- * selectedCols and modifiedCols are bitmapsets, which cannot have negative
- * integer members, so we subtract FirstLowInvalidHeapAttributeNumber from
- * column numbers before storing them in these fields. A whole-row Var
- * reference is represented by setting the bit for InvalidAttrNumber.
+ * insertedCols and/or updatedCols (INSERT with ON CONFLICT UPDATE may
+ * have all 3). selectedCols, insertedCols and updatedCols are
+ * bitmapsets, which cannot have negative integer members, so we subtract
+ * FirstLowInvalidHeapAttributeNumber from column numbers before storing
+ * them in these fields. A whole-row Var reference is represented by
+ * setting the bit for InvalidAttrNumber.
*--------------------
*/
typedef enum RTEKind
@@ -816,7 +817,8 @@ typedef struct RangeTblEntry
AclMode requiredPerms; /* bitmask of required access permissions */
Oid checkAsUser; /* if valid, check access as this role */
Bitmapset *selectedCols; /* columns needing SELECT permission */
- Bitmapset *modifiedCols; /* columns needing INSERT/UPDATE permission */
+ Bitmapset *insertedCols; /* columns needing INSERT permission */
+ Bitmapset *updatedCols; /* columns needing UPDATE permission */
List *securityQuals; /* any security barrier quals to apply */
} RangeTblEntry;
--
1.9.1
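The permission split made by the patch above can be illustrated with a minimal, self-contained sketch. It is not PostgreSQL code: ToyRTE, ColSet, ATTR_OFFSET and the PERM_* flags are hypothetical stand-ins for RangeTblEntry, Bitmapset, the FirstLowInvalidHeapAttributeNumber offset and the ACL_* bits, and the "empty column set" special case handled by ExecCheckRTEPermsModified() is deliberately left out. It only shows why keeping the inserted and updated column sets apart lets a single statement be checked against both privileges independently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
#define ATTR_OFFSET 8       /* shifts attribute numbers so stored bits are never negative */
#define PERM_INSERT 0x1u
#define PERM_UPDATE 0x2u

typedef uint64_t ColSet;    /* toy replacement for a Bitmapset of columns */

/* Add an attribute number to a column set, applying the offset first. */
static ColSet
colset_add(ColSet set, int attno)
{
    int bit = attno + ATTR_OFFSET;

    assert(bit >= 0 && bit < 64);
    return set | (UINT64_C(1) << bit);
}

typedef struct ToyRTE
{
    unsigned required_perms;    /* PERM_INSERT and/or PERM_UPDATE */
    ColSet   inserted_cols;     /* columns needing INSERT permission */
    ColSet   updated_cols;      /* columns needing UPDATE permission */
} ToyRTE;

/* True when every column in 'cols' is also present in 'granted'. */
static bool
check_cols(ColSet cols, ColSet granted)
{
    return (cols & ~granted) == 0;
}

/*
 * Check INSERT and UPDATE column privileges separately, each against its
 * own column set, mirroring the two calls the patch adds to
 * ExecCheckRTEPerms().
 */
static bool
toy_check_rte(const ToyRTE *rte, ColSet insert_granted, ColSet update_granted)
{
    if ((rte->required_perms & PERM_INSERT) &&
        !check_cols(rte->inserted_cols, insert_granted))
        return false;

    if ((rte->required_perms & PERM_UPDATE) &&
        !check_cols(rte->updated_cols, update_granted))
        return false;

    return true;
}
```

With a single combined column set, a statement that inserts into columns 1 and 2 but only updates column 2 would have to be checked for UPDATE on column 1 as well; with two sets, a role holding UPDATE on column 2 alone passes.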
On Tue, Feb 10, 2015 at 12:09 PM, Peter Geoghegan <pg@heroku.com> wrote:
Then the problem suddenly becomes that previous choices of
indexes/statements aren't possible anymore. It seems much better to
introduce the syntax now and not have too much of a usecase for
it.
The only way the lack of a way of specifying which opclass to use
could bite us is if there were two *actually* defined unique indexes
on the same column, each with different "equality" operators. That
seems like kind of a funny scenario, even if that were quite possible
(even if non-default opclasses existed that had non-identical
"equality" operators, which is basically not the case today).
I grant that it is a bit odd that we're talking about unique index
definitions affecting semantics, but that is to a certain extent the
nature of the beast. As a compromise, I suggest having the inference
specification optionally accept a named opclass per attribute, in the
style of CREATE INDEX (I'm already reusing a bit of the raw parser
support for CREATE INDEX, you see) - that'll make inference insist on
that opclass, rather than make it a strict matter of costing available
alternatives (not that any alternative is expected with idiomatic
usage). That implies no additional parser overhead, really. If that's
considered ugly, then at least it's an ugly thing that literally no
one will ever use in the foreseeable future...and an ugly thing that
is no more necessary in CREATE INDEX than here (and yet CREATE INDEX
lives with the ugliness).
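The compromise described above can be sketched roughly as follows. This is a toy model, not the patch's inference code: ToyUniqueIndex and toy_infer_index() are hypothetical names, and "take the first match" merely stands in for the costing of available alternatives mentioned in the text. The only point it shows is that naming an opclass turns it into a hard requirement, while omitting it leaves the choice open.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one unique index on a single column. */
typedef struct ToyUniqueIndex
{
    const char *name;
    const char *column;
    const char *opclass;
} ToyUniqueIndex;

/*
 * Pick a unique index for 'column'.  When 'opclass' is NULL any index on
 * the column is acceptable (first match stands in for costing); when it
 * is given, inference insists on that opclass and may find nothing.
 */
static const ToyUniqueIndex *
toy_infer_index(const ToyUniqueIndex *indexes, size_t nindexes,
                const char *column, const char *opclass)
{
    assert(column != NULL);

    for (size_t i = 0; i < nindexes; i++)
    {
        if (strcmp(indexes[i].column, column) != 0)
            continue;           /* index is on a different column */
        if (opclass != NULL && strcmp(indexes[i].opclass, opclass) != 0)
            continue;           /* caller insisted on another opclass */
        return &indexes[i];
    }
    return NULL;                /* no arbiter index can be inferred */
}
```

With two unique indexes on the same column that differ only in opclass, a call without an opclass may return either one, whereas a call that names an opclass returns only the matching index or nothing at all.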
Any thoughts on this, anyone?
AFAICT, only this and the behavior of logical decoding are open items
at this point. I'd like to close out both of those sooner rather than
later.
Thanks
--
Peter Geoghegan
On 02/18/2015 11:43 PM, Peter Geoghegan wrote:
Heikki seemed to think that the deadlock problems were not really
worth fixing independently of ON CONFLICT UPDATE support, but rather
represented a useful way of committing code incrementally. Do I have
that right?
Yes.
The way I chose to break the livelock (what I call "livelock
insurance") does indeed involve comparing XIDs, which Heikki thought
most promising. But it also involves having the oldest XID wait on
another session's speculative token in the second phase, which
ordinarily does not occur in the second
phase/check_exclusion_or_unique_constraint() call. The idea is that
one session is guaranteed to be the waiter that has a second iteration
within its second-phase check_exclusion_or_unique_constraint() call,
where (following the super deletion of conflict TIDs by the other
conflicting session or sessions) it reliably finds that it can proceed
(those other sessions are denied the opportunity to reach their second
phase by our second phase waiter's still-not-super-deleted tuple).
However, it's difficult to see how to map this on to general exclusion
constraint insertion + enforcement. In Heikki's recent sketch of this
[1], there is no pre-check, since that is considered an "UPSERT thing"
deferred until a later patch, and therefore my scheme here cannot work
(recall that for plain inserts with exclusion constraints, we insert
first and check last). I have a hard time justifying adding the
pre-check for plain exclusion constraint inserters given the total
lack of complaints from the field regarding unprincipled deadlocks,
and given the fact that it would probably make the code more
complicated than it needs to be. It is critical that there be a
pre-check to prevent livelock, though, because otherwise the
restarting sessions can go straight to their "second" phase
(check_exclusion_or_unique_constraint() call), without ever realizing
that they should not do so.
Hmm. I haven't looked at your latest patch, but I don't think you need
to pre-check for this to work. To recap, the situation is that two
backends have already inserted the heap tuple, and then see that the
other backend's tuple conflicts. To avoid a livelock, it's enough that
one backend super-deletes its own tuple first, before waiting for the
other to complete, while the other backend waits without
super-deleting. No?
It seems like the livelock insurance is pretty close to or actually
free, and doesn't imply that a "second phase wait for token" needs to
happen too often. With heavy contention on a small number of possible
tuples (100), and 8 clients deleting and inserting, I can see it
happening only once every couple of hundred milliseconds on my laptop.
It's not hard to imagine why the code didn't obviously appear to be
broken before now, as the window for an actual livelock must have been
small. Also, deadlocks bring about more deadlocks (since the deadlock
detector kicks in), whereas livelocks do not bring about more
livelocks.
It might be easier to provoke the livelocks with a GiST opclass that's
unusually slow. I wrote the attached opclass for the purpose of testing
this a while ago, but I haven't actually gotten around to doing much with
it. It's called "useless_gist", because it's a GiST opclass for
integers, like btree_gist, but the penalty and picksplit functions are
totally dumb. The result is that the tuples are put to the index in
pretty much random order, and every scan has to scan the whole index.
I'm posting it here, in the hope that it happens to be useful, but I
don't really know if it is.
- Heikki
Attachments:
On 02/16/2015 11:31 AM, Andres Freund wrote:
On 2015-02-16 10:00:24 +0200, Heikki Linnakangas wrote:
I'm starting to think that we should bite the bullet and consume an infomask
bit for this. The infomask bits are a scarce resource, but we should use
them when it makes sense. It would be good for forensic purposes too, to
leave a trace that a super-deletion happened.
Well, we IIRC don't have any left right now. We could reuse
MOVED_IN|MOVED_OUT, as that never was set in the past. But it'd
essentially use two infomask bits forever...
t_infomask is all used, but t_infomask2 has two bits left:
/*
* information stored in t_infomask2:
*/
#define HEAP_NATTS_MASK 0x07FF /* 11 bits for number of attributes */
/* bits 0x1800 are available */
#define HEAP_KEYS_UPDATED 0x2000 /* tuple was updated and key cols
* modified, or tuple deleted */
#define HEAP_HOT_UPDATED 0x4000 /* tuple was HOT-updated */
#define HEAP_ONLY_TUPLE 0x8000 /* this is heap-only tuple */
#define HEAP2_XACT_MASK 0xE000 /* visibility-related bits */
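The "two bits left" claim can be checked mechanically. What follows is only a sketch, not part of any patch in this thread: it copies the mask values from the excerpt above and derives the unclaimed bits. The name HYPOTHETICAL_SUPER_DELETED and the choice of bit 0x0800 are assumptions made purely for illustration; nothing here settles which bit (if any) would be used.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the t_infomask2 excerpt above. */
#define HEAP_NATTS_MASK   0x07FF  /* 11 bits for number of attributes */
#define HEAP_KEYS_UPDATED 0x2000
#define HEAP_HOT_UPDATED  0x4000
#define HEAP_ONLY_TUPLE   0x8000

/* Hypothetical flag: one of the free bits, chosen only for illustration. */
#define HYPOTHETICAL_SUPER_DELETED 0x0800

/* Bits of a 16-bit t_infomask2 not claimed by any definition above. */
static uint16_t
infomask2_free_bits(void)
{
    uint16_t used = HEAP_NATTS_MASK | HEAP_KEYS_UPDATED |
                    HEAP_HOT_UPDATED | HEAP_ONLY_TUPLE;

    return (uint16_t) ~used;
}

/* Set the hypothetical flag without disturbing the attribute count. */
static uint16_t
mark_super_deleted(uint16_t infomask2)
{
    /* The flag must live entirely inside the free bits. */
    assert((HYPOTHETICAL_SUPER_DELETED & infomask2_free_bits()) != 0);
    return infomask2 | HYPOTHETICAL_SUPER_DELETED;
}
```

Because the flag sits outside HEAP_NATTS_MASK, setting it leaves the stored attribute count unchanged, which is the property that would matter for forensic use.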
- Heikki
On Thu, Feb 19, 2015 at 5:21 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Hmm. I haven't looked at your latest patch, but I don't think you need to
pre-check for this to work. To recap, the situation is that two backends
have already inserted the heap tuple, and then see that the other backend's
tuple conflicts. To avoid a livelock, it's enough that one backend
super-deletes its own tuple first, before waiting for the other to complete,
while the other backend waits without super-deleting. No?
I fully agree with your summary here. However, why should we suppose
that while we wait, the other backends don't both delete and then
re-insert their tuple? They need the pre-check to know not to
re-insert their tuple (seeing our tuple, immediately after we wake as
the preferred backend with the older XID) in order to break the race.
But today, exclusion constraints are optimistic in that the insert
happens first, and only then the check. The pre-check turns that the
other way around, in a limited though necessary sense.
Granted, it's unlikely that we'd livelock due to one session always
deleting and then nullifying that by re-inserting in time, but the
theoretical risk seems real. Therefore, I'm not inclined to bother
committing something without a pre-check, particularly since we're not
really interested in fixing unprincipled deadlocks for ordinary
exclusion constraint inserters (useful to know that you also think
that doesn't matter, BTW). Does that make sense?
This is explained within "livelock insurance" new-to-V2.3 comments in
check_exclusion_or_unique_constraint(). (Not that I think that
explanation is easier to follow than this one).
It might be easier to provoke the livelocks with a GiST opclass that's
unusually slow. I wrote the attached opclass for the purpose of testing this
a while ago, but I haven't actually gotten around to doing much with it. It's
called "useless_gist", because it's a GiST opclass for integers, like
btree_gist, but the penalty and picksplit functions are totally dumb. The
result is that the tuples are put to the index in pretty much random order,
and every scan has to scan the whole index. I'm posting it here, in the hope
that it happens to be useful, but I don't really know if it is.
Thanks. I'll try and use this for testing. Haven't been able to break
exclusion constraints with the jjanes_upsert test suite in a long
time, now.
--
Peter Geoghegan
On 02/19/2015 08:16 PM, Peter Geoghegan wrote:
On Thu, Feb 19, 2015 at 5:21 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Hmm. I haven't looked at your latest patch, but I don't think you need to
pre-check for this to work. To recap, the situation is that two backends
have already inserted the heap tuple, and then see that the other backend's
tuple conflicts. To avoid a livelock, it's enough that one backend
super-deletes its own tuple first, before waiting for the other to complete,
while the other backend waits without super-deleting. No?
I fully agree with your summary here. However, why should we suppose
that while we wait, the other backends don't both delete and then
re-insert their tuple? They need the pre-check to know not to
re-insert their tuple (seeing our tuple, immediately after we wake as
the preferred backend with the older XID) in order to break the race.
But today, exclusion constraints are optimistic in that the insert
happens first, and only then the check. The pre-check turns that the
other way around, in a limited though necessary sense.
I'm not sure I understand exactly what you're saying, but AFAICS the
pre-check doesn't completely solve that either. It's entirely possible
that the other backend deletes its tuple, our backend then performs the
pre-check, and the other backend re-inserts its tuple again. Sure, the
pre-check reduces the chances, but we're talking about a rare condition
to begin with, so I don't think it makes sense to add much code just to
reduce the chances further.
- Heikki
On Thu, Feb 19, 2015 at 11:10 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I fully agree with your summary here. However, why should we suppose
that while we wait, the other backends don't both delete and then
re-insert their tuple? They need the pre-check to know not to
re-insert their tuple (seeing our tuple, immediately after we wake as
the preferred backend with the older XID) in order to break the race.
But today, exclusion constraints are optimistic in that the insert
happens first, and only then the check. The pre-check turns that the
other way around, in a limited though necessary sense.
I'm not sure I understand exactly what you're saying, but AFAICS the
pre-check doesn't completely solve that either. It's entirely possible that
the other backend deletes its tuple, our backend then performs the
pre-check, and the other backend re-inserts its tuple again. Sure, the
pre-check reduces the chances, but we're talking about a rare condition to
begin with, so I don't think it makes sense to add much code just to reduce
the chances further.
But super deletion occurs *before* releasing the token lock, which is
the last thing we do before looping around and starting again. So iff
we're the oldest XID, the one that gets to "win" by unexpectedly
waiting on another's token in our second phase (second call to
check_exclusion_or_unique_constraint()), we will not, in fact, see
anyone else's tuple, because they'll all be forced to go through the
first phase and find our pre-existing, never-deleted tuple, so we
can't see any new tuple from them. And, because they super delete
before releasing their token, they'll definitely have super deleted
when we're woken up, so we can't see any old/existing tuple either. We
have our tuple inserted this whole time - ergo, we do, in fact, "win"
reliably.
The fly in the ointment is regular inserters, perhaps, but we've
agreed that they're not too important here, and even when that happens
we're in "deadlock land", not "livelock land", which is obviously a
nicer place to live.
Does that make sense?
--
Peter Geoghegan
On 02/19/2015 10:09 PM, Peter Geoghegan wrote:
On Thu, Feb 19, 2015 at 11:10 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
I fully agree with your summary here. However, why should we suppose
that while we wait, the other backends don't both delete and then
re-insert their tuple? They need the pre-check to know not to
re-insert their tuple (seeing our tuple, immediately after we wake as
the preferred backend with the older XID) in order to break the race.
But today, exclusion constraints are optimistic in that the insert
happens first, and only then the check. The pre-check turns that the
other way around, in a limited though necessary sense.
I'm not sure I understand exactly what you're saying, but AFAICS the
pre-check doesn't completely solve that either. It's entirely possible that
the other backend deletes its tuple, our backend then performs the
pre-check, and the other backend re-inserts its tuple again. Sure, the
pre-check reduces the chances, but we're talking about a rare condition to
begin with, so I don't think it makes sense to add much code just to reduce
the chances further.
But super deletion occurs *before* releasing the token lock, which is
the last thing we do before looping around and starting again. So iff
we're the oldest XID, the one that gets to "win" by unexpectedly
waiting on another's token in our second phase (second call to
check_exclusion_or_unique_constraint()), we will not, in fact, see
anyone else's tuple, because they'll all be forced to go through the
first phase and find our pre-existing, never-deleted tuple, so we
can't see any new tuple from them. And, because they super delete
before releasing their token, they'll definitely have super deleted
when we're woken up, so we can't see any old/existing tuple either. We
have our tuple inserted this whole time - ergo, we do, in fact, "win"
reliably.
So, um, are you agreeing that there is no problem? Or did I
misunderstand? If you see a potential issue here, can you explain it as
a simple list of steps, please.
- Heikki
On Fri, Feb 20, 2015 at 11:34 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
So, um, are you agreeing that there is no problem? Or did I misunderstand?
If you see a potential issue here, can you explain it as a simple list of
steps, please.
Yes. I'm saying that AFAICT, there is no livelock hazard provided
other sessions must do the pre-check (as they must for ON CONFLICT
IGNORE). So I continue to believe that they must pre-check, which you
questioned.
The only real downside here (with not doing the pre-check for regular
inserters with exclusion constraints) is that we can't fix
unprincipled deadlocks for general exclusion constraint inserters
(since we don't want to make them pre-check), but we don't actually
care about that directly. But it also means that it's hard to see how
we can incrementally commit such a fix to break down the patch into
more manageable chunks, which is a little unfortunate.
Hard to break down the problem into steps, since it isn't a problem
that I was able to recreate (as a noticeable livelock). But the point
of what I was saying is that the first phase pre-check allows the
other conflicters to notice the one session that got stuck waiting in
the second phase (they notice its tuple, and so don't insert a new
tuple this iteration).
Conventional insertion with exclusion constraints inserts first and
only then looks for conflicts (since there is no pre-check). More
concretely, if you're the transaction that does not "break" here,
within check_exclusion_or_unique_constraint(), and so unexpectedly
waits in the second phase:
+ /*
+ * At this point we have either a conflict or a potential conflict. If
+ * we're not supposed to raise error, just return the fact of the
+ * potential conflict without waiting to see if it's real.
+ */
+ if (violationOK && !wait)
+ {
+ /*
+ * For unique indexes, detecting conflict is coupled with physical
+ * index tuple insertion, so we won't be called for recheck
+ */
+ Assert(!indexInfo->ii_Unique);
+
+ conflict = true;
+ if (conflictTid)
+ *conflictTid = tup->t_self;
+
+ /*
+ * Livelock insurance.
+ *
+ * When doing a speculative insertion pre-check, we cannot have an
+ * "unprincipled deadlock" with another session, fundamentally
+ * because there is no possible mutual dependency, since we only
+ * hold a lock on our token, without attempting to lock anything
+ * else (maybe this is not the first iteration, but no matter;
+ * we'll have super deleted and released insertion token lock if
+ * so, and all locks needed are already held. Also, our XID lock
+ * is irrelevant.)
+ *
+ * In the second phase, where there is a re-check for conflicts, we
+ * can't deadlock either (we never lock another thing, since we
+ * don't wait in that phase). However, a theoretical livelock
+ * hazard exists: Two sessions could each see each other's
+ * conflicting tuple, and each could go and delete, retrying
+ * forever.
+ *
+ * To break the mutual dependency, we may wait on the other xact
+ * here over our caller's request to not do so (in the second
+ * phase). This does not imply the risk of unprincipled deadlocks
+ * either, because if we end up unexpectedly waiting, the other
+ * session will super delete its own tuple *before* releasing its
+ * token lock and freeing us, and without attempting to wait on us
+ * to release our token lock. We'll take another iteration here,
+ * after waiting on the other session's token, not find a conflict
+ * this time, and then proceed (assuming we're the oldest XID).
+ *
+ * N.B.: Unprincipled deadlocks are still theoretically possible
+ * with non-speculative insertion with exclusion constraints, but
+ * this seems inconsequential, since an error was inevitable for
+ * one of the sessions anyway. We only worry about speculative
+ * insertion's problems, since they're likely with idiomatic usage.
+ */
+ if (TransactionIdPrecedes(xwait, GetCurrentTransactionId()))
+ break; /* go and super delete/restart speculative insertion */
+ }
+
Then you must successfully insert when you finally "goto retry" and do
another iteration within check_exclusion_or_unique_constraint(). The
other conflicters can't have failed to notice your pre-existing tuple,
and can't have failed to super delete their own tuples before you are
woken (provided you really are the single oldest XID).
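To make the tie-break in the excerpt above concrete, here is a minimal standalone sketch of the comparison it relies on. It is modeled on how PostgreSQL compares normal XIDs (modulo-2^32 arithmetic), but it is not the server's code: xid_precedes and should_super_delete are illustrative names, and FIRST_NORMAL_XID is assumed to be 3, as in transam.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* XIDs below this are special; assumed to be 3, as in transam.h. */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Wraparound-aware "id1 is older than id2", in the style of TransactionIdPrecedes(). */
static bool
xid_precedes(TransactionId id1, TransactionId id2)
{
    /* Special XIDs are compared as plain unsigned integers. */
    if (id1 < FIRST_NORMAL_XID || id2 < FIRST_NORMAL_XID)
        return id1 < id2;

    /* Normal XIDs: the sign of the 32-bit difference decides. */
    return (int32_t) (id1 - id2) < 0;
}

/*
 * The decision taken at the "livelock insurance" point: if the session we
 * conflict with (xwait) is older than us, we back out (super delete and
 * restart); otherwise we are the older one and keep our tuple, waiting.
 */
static bool
should_super_delete(TransactionId xwait, TransactionId my_xid)
{
    return xid_precedes(xwait, my_xid);
}
```

For any two distinct normal XIDs less than 2^31 apart, exactly one of the two conflicting sessions sees should_super_delete() return true, which is what makes the oldest XID the single guaranteed waiter.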
Is that any clearer?
--
Peter Geoghegan
On 02/20/2015 10:39 PM, Peter Geoghegan wrote:
On Fri, Feb 20, 2015 at 11:34 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
So, um, are you agreeing that there is no problem? Or did I misunderstand?
If you see a potential issue here, can you explain it as a simple list of
steps, please.
Yes. I'm saying that AFAICT, there is no livelock hazard provided
other sessions must do the pre-check (as they must for ON CONFLICT
IGNORE). So I continue to believe that they must pre-check, which you
questioned.
...
Hard to break down the problem into steps, since it isn't a problem
that I was able to recreate (as a noticeable livelock).
Then I refuse to believe that the livelock hazard exists, without the
pre-check. If you have a livelock scenario in mind, it really shouldn't
be that difficult to write down the list of steps.
- Heikki
On Fri, Feb 20, 2015 at 1:07 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Then I refuse to believe that the livelock hazard exists, without the
pre-check. If you have a livelock scenario in mind, it really shouldn't be
that difficult to write down the list of steps.
I just meant practical, recreatable steps - a test case. I should
emphasize that what I'm saying is not that important. Even if I am
wrong, I'm not suggesting that we do anything that we don't both agree
is needed anyway. If I'm right, that is merely an impediment to
incrementally committing the work by "fixing" exclusion constraints,
AFAICT. Ultimately, that isn't all that important. Anyway, here is how
I think livelocks could happen, in theory, with regular insertion into
a table with exclusion constraints, with your patch [1] applied (which
has no pre-check):
* Session 1 physically inserts, and then checks for a conflict.
* Session 2 physically inserts, and then checks for a conflict.
* Session 1 sees session 2's conflicting TID, then super deletes and
releases token.
* Session 2 sees session 1's conflicting TID, then super deletes and
releases token.
* Session 1 waits or tries to wait on session 2's token. It isn't held
anymore, or is only held for an instant.
* Session 2 waits or tries to wait on session 1's token. It isn't held
anymore, or is only held for an instant.
* Session 1 restarts from scratch, having made no useful progress in
respect of the slot being inserted.
* Session 2 restarts from scratch, having made no useful progress in
respect of the slot being inserted.
(Livelock)
If there is a tie-break on XID (you omitted this from your patch [1]
but acknowledge it as an omission), then that doesn't really change
anything (without adding a pre-check, too). That's because: What do we
actually do or not do when we're the oldest XID, that gets to "win"?
Do we not wait on anything, and just declare that we're done? Then I
think that breaks exclusion constraint enforcement, because we need to
rescan the index to do that (i.e., "goto retry"). Do we wait on their
token, as my most recent revision does, but *without* a pre-check, for
regular inserters? Then I think that our old tuple could keep multiple
other sessions spinning indefinitely. Only by checking for conflicts
*first*, without a non-super-deleted physical index tuple can these
other sessions notice that there is a conflict *without creating more
conflicts*, which is what I believe is really needed. At the very
least it's something I'm much more comfortable with, and that seems
like reason enough to do it that way, given that we don't actually
care about unprincipled deadlocks with regular inserters with
exclusion constraints. Why take the chance with livelocking such
inserters, though?
I hope that we don't get bogged down on this, because, as I said, it
doesn't matter too much. I'm tempted to concede the point for that
reason, since the livelock is probably virtually impossible. I'm just
giving you my opinion on how to make the handling of exclusion
constraints as reliable as possible.
Thanks
[1]: /messages/by-id/54DFC6F8.5050108@vmware.com
--
Peter Geoghegan
On 02/21/2015 12:15 AM, Peter Geoghegan wrote:
On Fri, Feb 20, 2015 at 1:07 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Then I refuse to believe that the livelock hazard exists, without the
pre-check. If you have a livelock scenario in mind, it really shouldn't be
that difficult to write down the list of steps.
I just meant practical, recreatable steps - a test case. I should
emphasize that what I'm saying is not that important. Even if I am
wrong, I'm not suggesting that we do anything that we don't both agree
is needed anyway. If I'm right, that is merely an impediment to
incrementally committing the work by "fixing" exclusion constraints,
AFAICT. Ultimately, that isn't all that important. Anyway, here is how
I think livelocks could happen, in theory, with regular insertion into
a table with exclusion constraints, with your patch [1] applied (which
has no pre-check):
* Session 1 physically inserts, and then checks for a conflict.
* Session 2 physically inserts, and then checks for a conflict.
* Session 1 sees session 2's conflicting TID, then super deletes and
releases token.
* Session 2 sees session 1's conflicting TID, then super deletes and
releases token.
* Session 1 waits or tries to wait on session 2's token. It isn't held
anymore, or is only held for an instant.
* Session 2 waits or tries to wait on session 1's token. It isn't held
anymore, or is only held for an instant.
* Session 1 restarts from scratch, having made no useful progress in
respect of the slot being inserted.
* Session 2 restarts from scratch, having made no useful progress in
respect of the slot being inserted.
(Livelock)
If there is a tie-break on XID (you omitted this from your patch [1]
but acknowledge it as an omission), then that doesn't really change
anything (without adding a pre-check, too). That's because: What do we
actually do or not do when we're the oldest XID, that gets to "win"?
Ah, ok, I can see the confusion now.
Do we not wait on anything, and just declare that we're done? Then I
think that breaks exclusion constraint enforcement, because we need to
rescan the index to do that (i.e., "goto retry"). Do we wait on their
token, as my most recent revision does, but *without* a pre-check, for
regular inserters? Then I think that our old tuple could keep multiple
other sessions spinning indefinitely.
What I had in mind is that the "winning" inserter waits on the other
inserter's token, without super-deleting. Like all inserts do today. So
the above scenario becomes:
* Session 1 physically inserts, and then checks for a conflict.
* Session 2 physically inserts, and then checks for a conflict.
* Session 1 sees session 2's conflicting TID. Session 1's XID is older,
so it "wins". It waits for session 2's token, without super-deleting.
* Session 2 sees session 1's conflicting TID. It super deletes,
releases token, and sleeps on session 1's token.
* Session 1 wakes up. It looks at session 2's tuple again and sees that
it was super-deleted. There are no further conflicts, so the insertion
is complete, and it releases the token.
* Session 2 wakes up. It looks at session 1's tuple again and sees that
it's still there. It goes back to sleep, this time on session 1's XID.
* Session 1 commits. Session 2 wakes up, sees that the tuple is still
there, and throws a "constraint violation" error.
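The difference between the two schedules discussed in this thread can be sketched with a toy model. It is only an illustration under strong assumptions: two sessions run in strict lockstep (the adversarial interleaving), XIDs are compared with a plain "<" that ignores wraparound, and Policy, Session and simulate are names invented here, not anything from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { BOTH_BACK_OUT, OLDEST_XID_WAITS } Policy;

typedef struct
{
    uint32_t xid;
    bool     tuple_present;   /* inserted and not super-deleted */
} Session;

/*
 * Run two conflicting inserters in lockstep. Returns the round in which
 * one of them completes its insertion, or -1 if neither makes progress
 * within max_rounds.
 */
static int
simulate(Policy policy, uint32_t xid_a, uint32_t xid_b, int max_rounds)
{
    Session s[2] = {{xid_a, false}, {xid_b, false}};

    for (int round = 1; round <= max_rounds; round++)
    {
        bool conflict[2];

        /* Both sessions have a tuple in place before either one checks. */
        s[0].tuple_present = s[1].tuple_present = true;

        /* Each session checks for the other's conflicting tuple. */
        for (int i = 0; i < 2; i++)
            conflict[i] = s[1 - i].tuple_present;

        /* Back out on conflict, unless the tie-break lets us keep waiting. */
        for (int i = 0; i < 2; i++)
        {
            bool i_am_older = s[i].xid < s[1 - i].xid;

            if (conflict[i] && !(policy == OLDEST_XID_WAITS && i_am_older))
                s[i].tuple_present = false;   /* super delete, retry later */
        }

        /* A session that kept its tuple re-checks and finds no conflict. */
        for (int i = 0; i < 2; i++)
            if (s[i].tuple_present && !s[1 - i].tuple_present)
                return round;
    }
    return -1;
}
```

Under BOTH_BACK_OUT the lockstep schedule never terminates, while OLDEST_XID_WAITS completes in the first round regardless of which session holds the older XID. Real backends are unlikely to stay in lockstep for long, which fits the observation that the livelock is a theoretical hazard rather than something seen in testing.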
- Heikki
On Sat, Feb 21, 2015 at 11:15 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
Ah, ok, I can see the confusion now.
Cool.
Do we not wait on anything, and just declare that we're done? Then I
think that breaks exclusion constraint enforcement, because we need to
rescan the index to do that (i.e., "goto retry"). Do we wait on their
token, as my most recent revision does, but *without* a pre-check, for
regular inserters? Then I think that our old tuple could keep multiple
other sessions spinning indefinitely.
What I had in mind is that the "winning" inserter waits on the other
inserter's token, without super-deleting. Like all inserts do today. So the
above scenario becomes:
* Session 1 physically inserts, and then checks for a conflict.
* Session 2 physically inserts, and then checks for a conflict.
* Session 1 sees session 2's conflicting TID. Session 1's XID is older, so
it "wins". It waits for session 2's token, without super-deleting.
* Session 2 sees session 1's conflicting TID. It super deletes,
releases token, and sleeps on session 1's token.
* Session 1 wakes up. It looks at session 2's tuple again and sees that it
was super-deleted. There are no further conflicts, so the insertion is
complete, and it releases the token.
* Session 2 wakes up. It looks at session 1's tuple again and sees that it's
still there. It goes back to sleep, this time on session 1's XID.
* Session 1 commits. Session 2 wakes up, sees that the tuple is still there,
and throws a "constraint violation" error.
I think we're actually 100% in agreement, then. I just prefer to have
the second last step (the check without a promise tuple visible to
anyone made by the "loser") occur as part of the pre-check that
happens anyway with ON CONFLICT IGNORE. Otherwise, you'll end up with
some much more complicated control flow that has to care about not
doing that twice for ON CONFLICT IGNORE...and for the benefit of what?
For regular inserters, that we don't actually care about fixing
unprincipled deadlocks for?
Are we on the same page now?
--
Peter Geoghegan
On 02/21/2015 10:41 PM, Peter Geoghegan wrote:
On Sat, Feb 21, 2015 at 11:15 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
What I had in mind is that the "winning" inserter waits on the other
inserter's token, without super-deleting. Like all inserts do today. So the
above scenario becomes:
* Session 1 physically inserts, and then checks for a conflict.
* Session 2 physically inserts, and then checks for a conflict.
* Session 1 sees session 2's conflicting TID. Session 1's XID is older, so
it "wins". It waits for session 2's token, without super-deleting.
* Session 2 sees session 1's conflicting TID. It super deletes,
releases token, and sleeps on session 1's token.
* Session 1 wakes up. It looks at session 2's tuple again and sees that it
was super-deleted. There are no further conflicts, so the insertion is
complete, and it releases the token.
* Session 2 wakes up. It looks at session 1's tuple again and sees that it's
still there. It goes back to sleep, this time on session 1's XID.
* Session 1 commits. Session 2 wakes up, sees that the tuple is still there,
and throws a "constraint violation" error.
I think we're actually 100% in agreement, then. I just prefer to have
the second last step (the check without a promise tuple visible to
anyone made by the "loser") occur as part of the pre-check that
happens anyway with ON CONFLICT IGNORE. Otherwise, you'll end up with
some much more complicated control flow that has to care about not
doing that twice for ON CONFLICT IGNORE...and for the benefit of what?
For regular inserters, that we don't actually care about fixing
unprincipled deadlocks for?
Right. That will allow me to review and test the locking part of the
patch separately.
- Heikki
On 02/17/2015 02:11 AM, Peter Geoghegan wrote:
Whatever works, really. I can't say that the performance implications
of acquiring that hwlock are at the forefront of my mind. I never
found that to be a big problem on an 8 core box, relative to vanilla
INSERTs, FWIW - lock contention is not normal, and may be where any
heavyweight lock costs would really be encountered.
Oh, cool. I guess the fast-path in lmgr.c kicks ass, then :-).
Seems that way. But even if that wasn't true, it wouldn't matter,
since I don't see that we have a choice.
I did some quick performance testing on this. For easy testing, I used a
checkout of git master, and simply added LockAcquire + LockRelease calls
to ExecInsert, around the heap_insert() call. The test case I used was:
psql -c "create table footest (id serial primary key);"
echo "insert into footest select from generate_series(1, 10000);" > inserts.sql
pgbench -n -f inserts.sql postgres -T100 -c4
With the extra lock calls, I got 56 tps on my laptop. With unpatched git
master, I got 60 tps. I also looked at the profile with "perf", and
indeed about 10% of the CPU time was spent in LockAcquire and
LockRelease together.
So the extra locking incurs about 10% overhead. I think this was pretty
much a worst case scenario, but not a hugely unrealistic one - many
real-world tables have only a few columns, and few indexes. With more
CPUs you would probably start to see contention, in addition to just the
extra overhead.
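For reference, the two tps figures quoted above can be turned into percentages directly. The helpers below are only an arithmetic sketch with invented names. Note that the tps numbers alone give a somewhat smaller figure than the roughly 10% of CPU time that the perf profile attributes to LockAcquire and LockRelease, so the two measurements should not be read as the same quantity.

```c
#include <assert.h>

/* Percentage drop in throughput going from baseline_tps to patched_tps. */
static double
throughput_drop_pct(double baseline_tps, double patched_tps)
{
    assert(baseline_tps > 0.0);
    return 100.0 * (baseline_tps - patched_tps) / baseline_tps;
}

/* Percentage increase in average time per transaction (1/tps). */
static double
time_per_xact_increase_pct(double baseline_tps, double patched_tps)
{
    assert(patched_tps > 0.0);
    return 100.0 * (baseline_tps / patched_tps - 1.0);
}
```

With 60 tps unpatched and 56 tps with the extra lock calls, this gives a throughput drop of about 6.7% and about 7.1% more time per transaction.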
Are we OK with a 10% overhead, caused by the locking? That's probably
acceptable if that's what it takes to get UPSERT. But it's not OK just
to solve the deadlock issue with regular insertions into a table with
exclusion constraints. Can we find a scheme to eliminate that overhead?
- Heikki
On Mon, Mar 2, 2015 at 11:20 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Are we OK with a 10% overhead, caused by the locking? That's probably
acceptable if that's what it takes to get UPSERT. But it's not OK just to
solve the deadlock issue with regular insertions into a table with exclusion
constraints. Can we find a scheme to eliminate that overhead?
Looks like you tested a B-Tree index here. That doesn't seem
particularly representative of what you'd see with exclusion
constraints.
--
Peter Geoghegan
On 03/02/2015 09:29 PM, Peter Geoghegan wrote:
On Mon, Mar 2, 2015 at 11:20 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Are we OK with a 10% overhead, caused by the locking? That's probably
acceptable if that's what it takes to get UPSERT. But it's not OK just to
solve the deadlock issue with regular insertions into a table with exclusion
constraints. Can we find a scheme to eliminate that overhead?
Looks like you tested a B-Tree index here. That doesn't seem
particularly representative of what you'd see with exclusion
constraints.
Hmm. I used a b-tree to estimate the effect that the locking would have
in the UPSERT case, for UPSERT into a table with a b-tree index. But
you're right that for the question of whether this is acceptable for the
case of regular insert into a table with exclusion constraints, other
indexams are more interesting. In a very quick test, the overhead with a
single GiST index seems to be about 5%. IMHO that's still a noticeable
slowdown, so let's try to find a way to eliminate it.
- Heikki
On Mon, Mar 2, 2015 at 12:15 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Hmm. I used a b-tree to estimate the effect that the locking would have in
the UPSERT case, for UPSERT into a table with a b-tree index. But you're
right that for the question of whether this is acceptable for the case of
regular insert into a table with exclusion constraints, other indexams are
more interesting. In a very quick test, the overhead with a single GiST
index seems to be about 5%. IMHO that's still a noticeable slowdown, so
let's try to find a way to eliminate it.
Honestly, I don't know why you're so optimistic that this can be
fixed, even with that heavyweight lock (for regular inserters +
exclusion constraints).
My experimental branch, which I showed you privately, shows big
problems when I simulate upserting with exclusion constraints (so we
insert first, handle exclusion violations using the traditional upsert
subxact pattern that does work with B-Trees). It's much harder to back
out of a heap_update() than it is to back out of a heap_insert(),
since we might well be updating a tuple visible to some other MVCC
snapshot. Whereas we can super delete a tuple knowing that only a
dirty snapshot might have seen it, which bounds the complexity (only
wait sites for B-Trees + exclusion constraints need to worry about
speculative insertion tokens and so on).
My experimental branch works just fine (with a variant jjanes_upsert
with subxact looping), until I need to restart an update after a
"failed" heap_update() that still returned HeapTupleMayBeUpdated
(having super deleted within an ExecUpdate() call). There is no good
way to do that for ExecUpdate() that I can see, because an existing,
visible row is affected (unlike with ExecInsert()). Even if it was
possible, it would be hugely invasive to already very complicated code
paths.
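For readers unfamiliar with the "subxact looping" upsert pattern referred to above, here is a minimal sketch of its general shape: insert first inside a subtransaction, and on a constraint violation roll the subtransaction back, try the update, and loop if the update finds no row. This is not the jjanes_upsert test or any code from the patch. It is an illustration only, and it uses Python's built-in sqlite3 module with SAVEPOINT purely so that it is self-contained; the table kv and the helper names make_db() and upsert() are hypothetical.

```python
import sqlite3


def make_db():
    # isolation_level=None: the sqlite3 module issues no implicit BEGIN,
    # so the SAVEPOINT statements below control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE kv (k INTEGER PRIMARY KEY, v TEXT)")
    return conn


def upsert(conn, key, val, max_tries=10):
    """Insert-first upsert with a savepoint per attempt (toy analogue)."""
    for _ in range(max_tries):
        conn.execute("SAVEPOINT upsert_attempt")
        try:
            conn.execute("INSERT INTO kv (k, v) VALUES (?, ?)", (key, val))
        except sqlite3.IntegrityError:
            # Duplicate key: back out only this attempt, then try to update.
            conn.execute("ROLLBACK TO SAVEPOINT upsert_attempt")
            conn.execute("RELEASE SAVEPOINT upsert_attempt")
        else:
            conn.execute("RELEASE SAVEPOINT upsert_attempt")
            return "inserted"
        cur = conn.execute("UPDATE kv SET v = ? WHERE k = ?", (val, key))
        if cur.rowcount == 1:
            return "updated"
        # The conflicting row vanished between the failed insert and the
        # update (possible only with concurrent sessions): loop and retry.
    raise RuntimeError("upsert did not converge")


if __name__ == "__main__":
    conn = make_db()
    print(upsert(conn, 1, "first"))   # inserted
    print(upsert(conn, 1, "second"))  # updated
```

With a single session the retry branch is never taken; it matters only when another session deletes the conflicting row concurrently. The sketch shows only the control flow of the pattern, not the problem of backing out a heap_update() described here.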
I continue to believe that the best way forward is to incrementally
commit the work by committing ON CONFLICT IGNORE first. That way,
speculative tokens will remain strictly the concern of UPSERTers or
sessions doing INSERT ... ON CONFLICT IGNORE. Unless you're happy to
have UPDATEs still deadlock, and only fix unprincipled deadlocks for
the INSERT case. I don't know why you want to make the patch
incremental by "fixing" unprincipled deadlocks in the first place,
since you've said that you don't really care about it as a real world
problem, and because it now appears to have significant additional
difficulties that were not anticipated.
Please, let's focus on getting ON CONFLICT IGNORE into a committable
state - that's the best way of incrementally committing this work.
Fixing unprincipled deadlocks is not a useful way of incrementally
committing this work. I've spent several days producing a prototype
that shows the exact nature of the problem. If you think I'm mistaken,
and that fixing unprincipled deadlocks first is the right thing to do,
please explain why with reference to that prototype. AFAICT, doing
things your way is going to add significant additional complexity for
*no appreciable benefit*. You've already gotten exactly what you were
looking for in every other regard. In particular, ON CONFLICT UPDATE
could work with exclusion constraints without any of these problems,
because of the way we do the auxiliary update there (we lock the row
ahead of updating/qual evaluation). I've bent over backwards to make
sure that is the case. Please, throw me a bone here.
Thank you
--
Peter Geoghegan
On 03/02/2015 11:21 PM, Peter Geoghegan wrote:
On Mon, Mar 2, 2015 at 12:15 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
Hmm. I used a b-tree to estimate the effect that the locking would have in
the UPSERT case, for UPSERT into a table with a b-tree index. But you're
right that for the question of whether this is acceptable for the case of
regular insert into a table with exclusion constraints, other indexams are
more interesting. In a very quick test, the overhead with a single GiST
index seems to be about 5%. IMHO that's still a noticeable slowdown, so
let's try to find a way to eliminate it.
Honestly, I don't know why you're so optimistic that this can be
fixed, even with that heavyweight lock (for regular inserters +
exclusion constraints).
We already concluded that it can be fixed, with the heavyweight lock.
See /messages/by-id/54F4A0E0.4070602@iki.fi. Do you
see some new problem with that that hasn't been discussed yet? To
eliminate the heavy-weight lock, we'll need some new ideas, but it
doesn't seem that hard.
My experimental branch, which I showed you privately, shows big
problems when I simulate upserting with exclusion constraints (so we
insert first, handle exclusion violations using the traditional upsert
subxact pattern that does work with B-Trees). It's much harder to back
out of a heap_update() than it is to back out of a heap_insert(),
since we might well be updating a tuple visible to some other MVCC
snapshot. Whereas we can super delete a tuple knowing that only a
dirty snapshot might have seen it, which bounds the complexity (only
wait sites for B-Trees + exclusion constraints need to worry about
speculative insertion tokens and so on).
My experimental branch works just fine (with a variant jjanes_upsert
with subxact looping), until I need to restart an update after a
"failed" heap_update() that still returned HeapTupleMayBeUpdated
(having super deleted within an ExecUpdate() call). There is no good
way to do that for ExecUpdate() that I can see, because an existing,
visible row is affected (unlike with ExecInsert()). Even if it was
possible, it would be hugely invasive to already very complicated code
paths.
Ah, so we can't easily use super-deletion to back out an UPDATE. I had
not considered that.
I continue to believe that the best way forward is to incrementally
commit the work by committing ON CONFLICT IGNORE first. That way,
speculative tokens will remain strictly the concern of UPSERTers or
sessions doing INSERT ... ON CONFLICT IGNORE.
Ok, let's try that. Can you cut a patch that does just ON CONFLICT
IGNORE, please?
- Heikki
On Tue, Mar 3, 2015 at 12:05 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
My experimental branch works just fine (with a variant jjanes_upsert
with subxact looping), until I need to restart an update after a
"failed" heap_update() that still returned HeapTupleMayBeUpdated
(having super deleted within an ExecUpdate() call). There is no good
way to do that for ExecUpdate() that I can see, because an existing,
visible row is affected (unlike with ExecInsert()). Even if it was
possible, it would be hugely invasive to already very complicated code
paths.
Ah, so we can't easily use super-deletion to back out an UPDATE. I had not
considered that.
Yeah. When I got into considering making EvalPlanQualFetch() look at
speculative tokens, it became abundantly clear that that code would
never be committed, even if I could make it work.
I continue to believe that the best way forward is to incrementally
commit the work by committing ON CONFLICT IGNORE first. That way,
speculative tokens will remain strictly the concern of UPSERTers or
sessions doing INSERT ... ON CONFLICT IGNORE.
Ok, let's try that. Can you cut a patch that does just ON CONFLICT IGNORE,
please?
Of course. I'll have that for you shortly.
Thanks
--
Peter Geoghegan