Re: WIP: Covering + unique indexes

Started by Brad DeJongabout 9 years ago1 messages

Brad DeJong

Brad.Dejong@infor.com

about 9 years ago

Anastasia, et al,

This is a review of including_columns_9.7_v5.patch.

I looked through the commit fest list and this patch was interesting and I wanted to try it.

I have used include columns on Microsoft SQL Server; DB2 for Linux, Unix, Windows; and DB2 for z/OS.

After reading the e-mail discussions, there are still points that I am not clear on.

Given "create table foo (a int, b int, c int, d int)" and "create unique index foo_a_b on foo (a, b) including (c)".

index only? heap tuple needed?
select a, b, c from foo where a = 1 yes no
select a, b, d from foo where a = 1 no yes
select a, b from foo where a = 1 and c = 1 ? ?

It was clear from the discussion that the index scan/access method would not resolve the "c = 1" predicate but it was not clear if "c = 1" could be resolved without accessing the heap tuple.

Are included columns counted against the 32 column and 2712 byte index limits? I did not see either explicitly mentioned in the discussion or the documentation. I only ask because in SQL Server the limits are different for include columns.

1. syntax - on 2016-08-14, Andrey Borodin wrote "I think MS SQL syntax INCLUDE instead of INCLUDING would be better". I would go further than that. This feature is already supported by 2 of the top 5 SQL databases and they both use INCLUDE. Using different syntax because of an internal implementation detail seems short sighted.

2. The patch does not seem to apply cleanly anymore.

git checkout master

Already on 'master'

git pull

From http://git.postgresql.org/git/postgresql
d49cc58..06f5fd2 master -> origin/master
Updating d49cc58..06f5fd2
Fast-forward
src/include/port.h | 2 +-
src/port/dirmod.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

patch -pl < including_columns_9.7_v5.patch

patching file contrib/dblink/dblink.c
...
patching file src/backend/parser/gram.y
...
Hunk #2 FAILED at 375.
...
1 out of 12 hunks FAILED -- saving rejects to file src/backend/parser/gram.y.rej
...
patching file src/bin/pg_dump/pg_dump.c
...
Hunk #8 FAILED at 6399.
Hunk #9 FAILED at 6426.
...
2 out of 13 hunks FAILED -- saving rejects to file src/bin/pg_dump/pg_dump.c.rej
...

3. After "fixing" compilation errors (best guess based on similar change in other location), "make check" failed.

make check

...
parallel group (3 tests): index_including create_view create_index
create_index ... ok
create_view ... ok
index_including ... FAILED
...

4. documentation - minor items (these are not actual diffs)
  create_index.sgml
      -    [ INCLUDING ( <replaceable class="parameter">column_name</replaceable> )]
      +    [ INCLUDING ( <replaceable class="parameter">column_name</replaceable> [, ...] )]

               An optional <literal>INCLUDING</> clause allows a list of columns to be
      -        specified which will be included in the index, in the non-key portion of the index.
      +        specified which will be included in the non-key portion of the index.

The whole paragraph on INCLUDING does not include many of the points raised in the feature discussions.

create_table.sgml (for both UNIQUE and PRIMARY KEY constraints) (similar change could be made to nbtree/README)
- Optional clause <literal>INCLUDING</literal> allows to add into the index
- a portion of columns on which the uniqueness is not enforced upon.
- Note, that althogh constraint is not enforced on included columns, it still
- depends on them. Consequently, some operations on these columns (e.g. <literal>DROP COLUMN</literal>)
- can cause cascade constraint and index deletion.

      +        An optional <literal>INCLUDING</literal> clause allows a list of columns
      +        to be specified which will be included in the non-key portion of the index.
      +        Although uniqueness is not enforced on the included columns, the constraint
      +        still depends on them. Consequently, some operations on the included columns
      +        (e.g. <literal>DROP COLUMN</literal>) can cause cascading constraint and index deletion.

  indexcmds.c
      -        * are key columns, and which are just part of the INCLUDING list by check
      +        * are key columns, and which are just part of the INCLUDING list by checking

    ruleutils.c
      -               * meaningless there, so do not include them into the message.
      +               * meaningless there, so do not include them in the message.

    pg_dump.c (does "if (fout->remoteVersion >= 90600)" also need to change?)
      -               * In 9.6 we add INCLUDING columns functionality
      -               * that requires new fields to be added.
      +               * In 10.0 we added INCLUDING columns functionality
      +               * that required new fields to be added.

5. coding
parse_utilcmd.c
@@ -1334,6 +1334,38 @@ ...
The loop is handling included columns separately.
The loop adds the collation name for each included column if it is not the default.

Q: Given that the create index/create constraint syntax does not allow a collation to be specified for included columns, how can you ever have a non-default collation?

@@ -1776,6 +1816,7 @@
The comment here says "NOTE that exclusion constraints don't support included nonkey attributes". However, the paragraph on INCLUDING in create_index.sgml says "It's the same for the other constraints (PRIMARY KEY and EXCLUDE)".