Indirect indexes

Started by Alvaro Herreraover 9 years ago61 messageshackers
Jump to latest
#1Alvaro Herrera
alvherre@2ndquadrant.com

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]Eventually we can expand this to allow for "normal" datatypes, say bigint, but that's likely to require a much bigger patch in order to change IndexTuple to support it. I would defer that project to a later time.

A new flag is added to the index AM routine, indicating whether it can
support indirect indexes. Initially, only the b-tree AM would support
indirect indexes, but I plan to develop support for GIN indirect soon
afterwards, which seems most valuable.

To create an indirect index, the command
CREATE INDIRECT INDEX
is used. Currently this always uses the defined primary key index[2]It is possible to extend the grammar to allow other UNIQUE indexes to be used, if they are on top of NOT NULL columns. This would allow to extend existing production databases with a new column. A proposed syntax is CREATE INDIRECT INDEX idx ON tab (a, b, c) REFERENCES some_unique_index [optional WHERE clause] ; which Bison accepts. I propose not to implement this yet. However this is an important item because it allows existing databases to simply add an UNIQUE NOT NULL column to their existing big tables to take advantage of the feature, without requiring a lengthy dump/reload of tables that currently only have larger keys..

Implementation-wise, to find a value using an indirect index that index
is scanned first; this produces a PK value, which can be used to scan
the primary key index, the result of which is returned.

There are two big advantages to indirect indexes, both of which are
related to UPDATE's "write amplification":

1. UPDATE is faster. Indirect indexes on column that are not modified
by the update do not need to be updated.
2. HOT can be used more frequently. Columns indexed only by indirect
indexes do not need to be considered for whether an update needs to
be non-HOT, so this further limits "write amplification".

The biggest downside is that in order to find out a heap tuple using the
index we need to descend two indexes (the indirect and the PK) instead
of one, so it's slower. For many use cases the tradeoff is justified.

I measured the benefits with the current prototype implementation. In
two separate schemas, I created a pgbench_accounts table, with 12
"filler" columns, and indexed them all; one schema used regular indexes,
the other used indirect indexes. Filled them both to the equivalent of
scale 50, which results in a table of some 2171 MB; the 12 indexes are
282 MB each, and the PK index is 107 MB). I then ran a pgbench with a
custom script that update a random one of those columns and leave the
others alone on both schemas (not simultaneously). I ran 100k updates
for each case, 5 times:

method │ TPS: min / avg (stddev) / max │ Duration: min / avg / max │ avg_wal
──────────┼───────────────────────────────────┼─────────────────────────────────────────┼─────────
direct │ 601.2 / 1029.9 ( 371.9) / 1520.9 │ 00:01:05.76 / 00:01:48.58 / 00:02:46.39 │ 4841 MB
indirect │ 2165.1 / 3081.6 ( 574.8) / 3616.4 │ 00:00:27.66 / 00:00:33.56 / 00:00:46.2 │ 1194 MB
(2 rows)

This is a pretty small test (not long enough for autovacuum to trigger
decently) but I think this should be compelling enough to present the
case.

Please discuss.

Implementation notes:

Executor-wise, we could have a distinct IndirectIndexScan node, or we
could just hide the second index scan inside a regular IndexScan. I
think from a cleanliness POV there is no reason to have a separate node;
efficiency wise I think a separate node leads to less branches in the
code. (In my toy patch I actually have the second indexscan hidden
inside a separate "ibtree" AM; not what I really propose for commit.)
Additionally, executor will have to keep track of the values in the PK
index so that they can be passed down on insertion to each indirect
index.

Planner-wise, I don't think we need to introduce a distinct indirect
index Path. We can just let the cost estimator attach the true cost of
the two scans to a regular index scan path, and the correct executor
node is produced later if that index is chosen.

In relcache we'll need an additional bitmapset of columns indexed by
indirect indexes. This is used so that heap_update can return an output
bitmapset of such columns that were not modified (this is computed by
HeapSatisfiesHOTandKeyUpdate). The executor uses this to know when to
skip updating each indirect index.

Vacuuming presents an additional challenge: in order to remove index
items from an indirect index, it's critical to scan the PK index first
and collect the PK values that are being removed. Then scan the
indirect index and remove any items that match the PK items removed.
This is a bit problematic because of the additional memory needed to
store the array of PK values. I haven't implemented this yet.

Items I haven't thought about yet:
* UNIQUE INDIRECT? I think these should work, but require some
tinkering.
* Deferred unique indexes? See unique_key_recheck.
* CREATE INDEX CONCURRENTLY.

[1]: Eventually we can expand this to allow for "normal" datatypes, say bigint, but that's likely to require a much bigger patch in order to change IndexTuple to support it. I would defer that project to a later time.
bigint, but that's likely to require a much bigger patch in order to
change IndexTuple to support it. I would defer that project to a later
time.

[2]: It is possible to extend the grammar to allow other UNIQUE indexes to be used, if they are on top of NOT NULL columns. This would allow to extend existing production databases with a new column. A proposed syntax is CREATE INDIRECT INDEX idx ON tab (a, b, c) REFERENCES some_unique_index [optional WHERE clause] ; which Bison accepts. I propose not to implement this yet. However this is an important item because it allows existing databases to simply add an UNIQUE NOT NULL column to their existing big tables to take advantage of the feature, without requiring a lengthy dump/reload of tables that currently only have larger keys.
to be used, if they are on top of NOT NULL columns. This would allow to
extend existing production databases with a new column. A proposed
syntax is
CREATE INDIRECT INDEX idx ON tab (a, b, c)
REFERENCES some_unique_index
[optional WHERE clause] ;
which Bison accepts. I propose not to implement this yet. However this
is an important item because it allows existing databases to simply add
an UNIQUE NOT NULL column to their existing big tables to take advantage
of the feature, without requiring a lengthy dump/reload of tables that
currently only have larger keys.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#1)
Re: Indirect indexes

On 10/18/2016 11:28 AM, Alvaro Herrera wrote:

Vacuuming presents an additional challenge: in order to remove index
items from an indirect index, it's critical to scan the PK index first
and collect the PK values that are being removed. Then scan the
indirect index and remove any items that match the PK items removed.
This is a bit problematic because of the additional memory needed to
store the array of PK values. I haven't implemented this yet.

As a whole I think the idea is interesting but the above scares me. Are
we trading initial performance gains for performance degradation through
maintenance? Since autovacuum is an indeterminate launch we could have a
situation where even a medium level updated laden table becomes a source
of contentions for IO and memory resources. I don't know that we would
see issues on modern bare metal but considering our implementation space
is places like RDS and GCE now, this is a serious consideration.

Sincerely,

JD

--
Command Prompt, Inc. http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Joshua D. Drake (#2)
Re: Indirect indexes

On 18 October 2016 at 21:41, Joshua D. Drake <jd@commandprompt.com> wrote:

Are we
trading initial performance gains for performance degradation through
maintenance?

Eh? That's backwards, so No. The whole point of this is it avoids long
term degradation currently caused by non-HOT updates.

Normal UPDATEs that don't change PKs won't generate any changes to
VACUUM away, so only actions that remove PK values will cause anything
to be collected and removed from indexes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Claudio Freire
klaussfreire@gmail.com
In reply to: Alvaro Herrera (#1)
Re: Indirect indexes

On Tue, Oct 18, 2016 at 3:28 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]

You don't need that limitation (and vacuum will be simpler) if you add
the PK as another key, akin to:

CREATE INDIRECT INDEX idx ON tab (a, b, c);

turns into

CREATE INDEX idx ON tab (a, b, c, pk);

And is queried appropriately (using an index-only scan, extracting the
PK from the index tuple, and then querying the PK index to get the
tids).

In fact, I believe that can work with all index ams supporting index-only scans.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Claudio Freire (#4)
Re: Indirect indexes

On 18 October 2016 at 22:04, Claudio Freire <klaussfreire@gmail.com> wrote:

On Tue, Oct 18, 2016 at 3:28 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]

You don't need that limitation (and vacuum will be simpler) if you add
the PK as another key, akin to:

CREATE INDIRECT INDEX idx ON tab (a, b, c);

turns into

CREATE INDEX idx ON tab (a, b, c, pk);

And is queried appropriately (using an index-only scan, extracting the
PK from the index tuple, and then querying the PK index to get the
tids).

In fact, I believe that can work with all index ams supporting index-only scans.

But if you did that, an UPDATE of a b or c would cause a non-HOT
update, so would defeat the purpose of indirect indexes.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alvaro Herrera (#1)
Re: Indirect indexes

Hi, Alvaro!

Thank you for your proposal. One question about vacuum excites me most.

On Tue, Oct 18, 2016 at 9:28 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Vacuuming presents an additional challenge: in order to remove index
items from an indirect index, it's critical to scan the PK index first
and collect the PK values that are being removed. Then scan the
indirect index and remove any items that match the PK items removed.
This is a bit problematic because of the additional memory needed to
store the array of PK values. I haven't implemented this yet.

Imagine another situation: PK column was not updated, but indirect indexed
column was updated.
Thus, for single heap tuple we would have single PK tuple and two indirect
index tuples (correct me if I'm wrong).
How are we going to delete old indirect index tuple?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#7Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#6)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 12:21 AM, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

On Tue, Oct 18, 2016 at 9:28 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Vacuuming presents an additional challenge: in order to remove index
items from an indirect index, it's critical to scan the PK index first
and collect the PK values that are being removed. Then scan the
indirect index and remove any items that match the PK items removed.
This is a bit problematic because of the additional memory needed to
store the array of PK values. I haven't implemented this yet.

Imagine another situation: PK column was not updated, but indirect indexed
column was updated.
Thus, for single heap tuple we would have single PK tuple and two indirect
index tuples (correct me if I'm wrong).
How are we going to delete old indirect index tuple?

Let me explain it in more details.

There is a table with two columns and indirect index on it.

CREATE TABLE tbl (id integer primary key, val integer);
CREAET INDIRECT INDEX tbl_val_indirect_idx ON tbl (val);

Then do insert and update.

INSERT INTO tbl VALUES (1, 1);
UPDATE tbl SET val = 2 WHERE id = 1;

Then heap would contain two tuples.

ctid | id | val
-------+----+-----
(0;1) | 1 | 1
(0;2) | 1 | 2

tbl_pk_idx would contain another two tuples

id | item_pointer
----+--------------
1 | (0;1)
1 | (0;2)

And tbl_val_indirect_idx would have also two tuples

val | id
-----+----
1 | 1
2 | 1

Then vacuum removes (0;1) from heap, reference to (0;1) from tbl_pk_idx.
But how will it remove (1,1) tuple from tbl_val_indirect_idx? Thus, before
vacuuming tbl_val_indirect_idx we should know not only values of id which
are being removed, but actually (id, val) pairs which are being removed.
Should we collect those paris while scanning heap? But we should also take
into account that multiple heap tuples might have same (id, val) pair
values (assuming there could be other columns being updated). Therefore,
we should take into account when last pair of particular (id, val) pair
value was deleted from heap. That would be very huge change to vacuum, may
be even writing way more complex vacuum algorithm from scratch. Probably,
you see the better solution of this problem.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#8Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#5)
Re: Indirect indexes

On Tue, Oct 18, 2016 at 5:48 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 October 2016 at 22:04, Claudio Freire <klaussfreire@gmail.com> wrote:

On Tue, Oct 18, 2016 at 3:28 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]

You don't need that limitation (and vacuum will be simpler) if you add
the PK as another key, akin to:

CREATE INDIRECT INDEX idx ON tab (a, b, c);

turns into

CREATE INDEX idx ON tab (a, b, c, pk);

And is queried appropriately (using an index-only scan, extracting the
PK from the index tuple, and then querying the PK index to get the
tids).

In fact, I believe that can work with all index ams supporting index-only scans.

But if you did that, an UPDATE of a b or c would cause a non-HOT
update, so would defeat the purpose of indirect indexes.

I meant besides all the other work, omitting the tid from the index
(as only the PK matters), marking them indirect, and all that.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Alexander Korotkov (#7)
Re: Indirect indexes

On 18 October 2016 at 23:46, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Then vacuum removes (0;1) from heap, reference to (0;1) from tbl_pk_idx.
But how will it remove (1,1) tuple from tbl_val_indirect_idx? Thus, before
vacuuming tbl_val_indirect_idx we should know not only values of id which
are being removed, but actually (id, val) pairs which are being removed.
Should we collect those paris while scanning heap? But we should also take
into account that multiple heap tuples might have same (id, val) pair values
(assuming there could be other columns being updated). Therefore, we should
take into account when last pair of particular (id, val) pair value was
deleted from heap. That would be very huge change to vacuum, may be even
writing way more complex vacuum algorithm from scratch. Probably, you see
the better solution of this problem.

The best way to sum up the problem is to consider how we deal with
repeated updates to a single tuple that flip the value from A to B
then back to A then to B then A etc.. Any value in the index can point
to multiple versions of the same tuple and multiple index values can
point to the same tuple (PK value). This problem behaviour was already
known to me from Claudio's earlier analysis of WARM (thanks Claudio).

Yes, VACUUMing that is likely to be a complex issue, as you say. At
the moment I don't have a plan for that, but am not worried.

Indirect indexes produce less index entries in general than current,
so the problem is by-design much smaller than current situation.
Indirect indexes can support killed-tuple interface, so scanning the
index by users will result in natural index maintenance, further
reducing the problem.

So there will be a much reduced need for bulk maintenance. Bulk
maintainence of the index, when needed, can be performed by scanning
the whole table via the index, after the PK index has been vacuumed.
That can be optimized using an index-only scan of the PK to avoid
touching the heap, which should be effective since the VM has been so
recently refreshed. For correctness it would require the index blocks
to be locked against write while checking for removal, so bulk
collection of values to optimize the underlying index doesn't seem
useful. The index scan could also be further optimized by introducing
a visibility map for the index, which is something that would also
optimize normal index VACUUMs as well, but that is a later project and
not something for 10.x

At this stage, the discussion should be 1) can it work? 2) do we want
it? At present, I see that it can work and we just need to be careful
to measure the effectiveness of it to demonstrate that it really is a
better way of doing things in some cases. Indirect indexes are
definitely not a panacea for all problems, for me it is just another
option to add into the rich world of Postgres indexing. Getting
traction can be difficult if people don't understand the pros and cons
of the different index types, but I'm happy we have a wide spread of
knowledge now.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#1)
Re: Indirect indexes

On Tue, Oct 18, 2016 at 2:28 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]

So, I think that this is a really promising direction, but also that
you should try very hard to try to get out from under this 6-byte PK
limitation. That seems really ugly, and in practice it probably means
your PK is probably going to be limited to int4, which is kind of sad
since it leaves people using int8 or text PKs out in the cold. I
believe Claudio Freire is on to something when he suggests storing the
PK in the index tuple; one could try to skip storing the TID, or
always store it as all-zeroes. Simon objected that putting the PK
into the index tuple would disable HOT, but I don't think that's a
valid objection. The whole point of an indirect index is that it
doesn't disable HOT, and the physical location within the index page
you stick the PK value doesn't have any impact on whether that's safe.

The VACUUM problems seem fairly serious. It's true that these indexes
will be less subject to bloat, because they only need updating when
the PK or the indexed columns change, not when other indexed columns
change. On the other hand, there's nothing to prevent a PK from being
recycled for an unrelated tuple. We can guarantee that a TID won't be
recycled until all index references to the TID are gone, but there's
no such guarantee for a PK. AFAICT, that would mean that an indirect
index would have to be viewed as unreliable: after looking up the PK,
you'd *always* have to recheck that it actually matched the index
qual.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#10)
Re: Indirect indexes

On 19 October 2016 at 14:52, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 18, 2016 at 2:28 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

I propose we introduce the concept of "indirect indexes". I have a toy
implementation and before I go further with it, I'd like this assembly's
input on the general direction.

Indirect indexes are similar to regular indexes, except that instead of
carrying a heap TID as payload, they carry the value of the table's
primary key. Because this is laid out on top of existing index support
code, values indexed by the PK can only be six bytes long (the length of
ItemPointerData); in other words, 281,474,976,710,656 rows are
supported, which should be sufficient for most use cases.[1]

So, I think that this is a really promising direction, but also that
you should try very hard to try to get out from under this 6-byte PK
limitation. That seems really ugly, and in practice it probably means
your PK is probably going to be limited to int4, which is kind of sad
since it leaves people using int8 or text PKs out in the cold. I
believe Claudio Freire is on to something when he suggests storing the
PK in the index tuple; one could try to skip storing the TID, or
always store it as all-zeroes.

The main problem IMV is GIN indexes. It's relatively easy to discuss
variable length PKs with btrees, but the GIN format is designed around
use of 6byte values, so expanding beyond that would require
significant redesign/reimplementation. That would be at least a year's
work for not much benefit, so cannot occur for the first release.

A limit 281 trillion rows means the row headers alone will be 9
Petabytes, before we even include block wastage and data payload per
row. So that is more than we'll need for a few years. Setting the max
sequence value to 281474976710656 should be easy enough.

Simon objected that putting the PK
into the index tuple would disable HOT, but I don't think that's a
valid objection.

Just to be clear, that's not what I objected to. Claudio appeared to
be suggesting that an indirect index is the same thing as an index
with PK tacked onto the end, which I re-confirm is not the case since
doing that would not provide the primary objective of indirect
indexes.

The whole point of an indirect index is that it
doesn't disable HOT, and the physical location within the index page
you stick the PK value doesn't have any impact on whether that's safe.

Agreed, though it does have an impact on the length of the index tuple
and thus the size and effectiveness of the index.

Perhaps its best to see the restriction to 6byte PKs as both the first
phase of implementation and an optimization, rather than an ugly wart.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Alexander Korotkov
aekorotkov@gmail.com
In reply to: Robert Haas (#10)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 3:52 PM, Robert Haas <robertmhaas@gmail.com> wrote:

The VACUUM problems seem fairly serious. It's true that these indexes
will be less subject to bloat, because they only need updating when
the PK or the indexed columns change, not when other indexed columns
change. On the other hand, there's nothing to prevent a PK from being
recycled for an unrelated tuple. We can guarantee that a TID won't be
recycled until all index references to the TID are gone, but there's
no such guarantee for a PK. AFAICT, that would mean that an indirect
index would have to be viewed as unreliable: after looking up the PK,
you'd *always* have to recheck that it actually matched the index
qual.

AFAICS, even without considering VACUUM, indirect indexes would be always
used with recheck.
As long as they don't contain visibility information. When indirect
indexed column was updated, indirect index would refer same PK with
different index keys.
There is no direct link between indirect index tuple and heap tuple, only
logical link using PK. Thus, you would anyway have to recheck.

Another approach would be to include visibility information into indirect
indexes themselves. In this case, index should support snapshots by itself
and don't produce false positives.
This approach would require way more intrusive changes in index AMs. We
would probably not able to reuse same index AM and have to make a new one.
But it seems like rather better design for me.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#13Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Alexander Korotkov (#12)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 7:19 PM, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

On Wed, Oct 19, 2016 at 3:52 PM, Robert Haas <robertmhaas@gmail.com>
wrote:

The VACUUM problems seem fairly serious. It's true that these indexes
will be less subject to bloat, because they only need updating when
the PK or the indexed columns change, not when other indexed columns
change. On the other hand, there's nothing to prevent a PK from being
recycled for an unrelated tuple. We can guarantee that a TID won't be
recycled until all index references to the TID are gone, but there's
no such guarantee for a PK. AFAICT, that would mean that an indirect
index would have to be viewed as unreliable: after looking up the PK,
you'd *always* have to recheck that it actually matched the index
qual.

AFAICS, even without considering VACUUM, indirect indexes would be always
used with recheck.
As long as they don't contain visibility information. When indirect
indexed column was updated, indirect index would refer same PK with
different index keys.
There is no direct link between indirect index tuple and heap tuple, only
logical link using PK. Thus, you would anyway have to recheck.

I agree. Also, I think the recheck mechanism will have to be something like
what I wrote for WARM i.e. only checking for index quals won't be enough
and we would actually need to verify that the heap tuple satisfies the key
in the indirect index.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#14Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#11)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 9:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

The main problem IMV is GIN indexes. It's relatively easy to discuss
variable length PKs with btrees, but the GIN format is designed around
use of 6byte values, so expanding beyond that would require
significant redesign/reimplementation. That would be at least a year's
work for not much benefit, so cannot occur for the first release.

That doesn't bother me. We can add an amcanindirect flag or similar.

The whole point of an indirect index is that it
doesn't disable HOT, and the physical location within the index page
you stick the PK value doesn't have any impact on whether that's safe.

Agreed, though it does have an impact on the length of the index tuple
and thus the size and effectiveness of the index.

Perhaps its best to see the restriction to 6byte PKs as both the first
phase of implementation and an optimization, rather than an ugly wart.

Perhaps, but I'm not ready to rule out "ugly wart".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Bruce Momjian
bruce@momjian.us
In reply to: Pavan Deolasee (#13)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 07:23:28PM +0530, Pavan Deolasee wrote:

AFAICS, even without considering VACUUM, indirect indexes would be always
used with recheck.
As long as they don't contain visibility information.� When indirect
indexed column was updated, indirect index would refer same PK with
different index keys.
There is no direct link between indirect index tuple and heap tuple, only
logical link using PK.� Thus, you would anyway have to recheck.

I agree. Also, I think the recheck mechanism will have to be something like
what I wrote for WARM i.e. only checking for index quals won't be enough and we
would actually need to verify that the heap tuple satisfies the key in the
indirect index.�

I personally would like to see how far we get with WARM before adding
this feature that requires a DBA to evaluate and enable it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Claudio Freire
klaussfreire@gmail.com
In reply to: Simon Riggs (#11)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 10:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

Simon objected that putting the PK
into the index tuple would disable HOT, but I don't think that's a
valid objection.

Just to be clear, that's not what I objected to. Claudio appeared to
be suggesting that an indirect index is the same thing as an index
with PK tacked onto the end, which I re-confirm is not the case since
doing that would not provide the primary objective of indirect
indexes.

No, I was suggesting using the storage format of those indexes.
Perhaps I wasn't clear.

CREATE INDEX could be implemented entirely as the rewrite I mention, I
believe. But everything else can't, as you say.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#15)
Re: Indirect indexes

On 19 October 2016 at 18:40, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Oct 19, 2016 at 07:23:28PM +0530, Pavan Deolasee wrote:

AFAICS, even without considering VACUUM, indirect indexes would be always
used with recheck.
As long as they don't contain visibility information. When indirect
indexed column was updated, indirect index would refer same PK with
different index keys.
There is no direct link between indirect index tuple and heap tuple, only
logical link using PK. Thus, you would anyway have to recheck.

I agree. Also, I think the recheck mechanism will have to be something like
what I wrote for WARM i.e. only checking for index quals won't be enough and we
would actually need to verify that the heap tuple satisfies the key in the
indirect index.

I personally would like to see how far we get with WARM before adding
this feature that requires a DBA to evaluate and enable it.

Assuming WARM is accepted, that *might* be fine.

What we should ask is what is the difference between indirect indexes
and WARM and to what extent they overlap.

My current understanding is that WARM won't help you if you update
parts of a JSON document and/or use GIN indexes, but is effective
without needing to add a new index type and will be faster for
retrieval than indirect indexes.

So everybody please chirp in with benefits or comparisons.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#17)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 06:58:05PM +0200, Simon Riggs wrote:

I agree. Also, I think the recheck mechanism will have to be something like
what I wrote for WARM i.e. only checking for index quals won't be enough and we
would actually need to verify that the heap tuple satisfies the key in the
indirect index.

I personally would like to see how far we get with WARM before adding
this feature that requires a DBA to evaluate and enable it.

Assuming WARM is accepted, that *might* be fine.

First, I love WARM because everyone gets the benefits by default. For
example, a feature that improves performance by 10% but is only used by
1% of users has a usefulness of 0.1% --- at least that is how I think of
it.

What we should ask is what is the difference between indirect indexes
and WARM and to what extent they overlap.

My current understanding is that WARM won't help you if you update
parts of a JSON document and/or use GIN indexes, but is effective
without needing to add a new index type and will be faster for
retrieval than indirect indexes.

So everybody please chirp in with benefits or comparisons.

I am not sure we have even explored all the limits of WARM with btree
indexes --- I haven't heard anyone talk about non-btree indexes yet.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Claudio Freire
klaussfreire@gmail.com
In reply to: Bruce Momjian (#18)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 2:04 PM, Bruce Momjian <bruce@momjian.us> wrote:

What we should ask is what is the difference between indirect indexes
and WARM and to what extent they overlap.

My current understanding is that WARM won't help you if you update
parts of a JSON document and/or use GIN indexes, but is effective
without needing to add a new index type and will be faster for
retrieval than indirect indexes.

So everybody please chirp in with benefits or comparisons.

I am not sure we have even explored all the limits of WARM with btree
indexes --- I haven't heard anyone talk about non-btree indexes yet.

AFAIK there's no fundamental reason why it wouldn't work for other
index ams, but it does require quite a bit of legwork to get
everything working there.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Alexander Korotkov
aekorotkov@gmail.com
In reply to: Simon Riggs (#9)
Re: Indirect indexes

On Wed, Oct 19, 2016 at 12:53 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 18 October 2016 at 23:46, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

Then vacuum removes (0;1) from heap, reference to (0;1) from tbl_pk_idx.
But how will it remove (1,1) tuple from tbl_val_indirect_idx? Thus,

before

vacuuming tbl_val_indirect_idx we should know not only values of id which
are being removed, but actually (id, val) pairs which are being removed.
Should we collect those paris while scanning heap? But we should also

take

into account that multiple heap tuples might have same (id, val) pair

values

(assuming there could be other columns being updated). Therefore, we

should

take into account when last pair of particular (id, val) pair value was
deleted from heap. That would be very huge change to vacuum, may be even
writing way more complex vacuum algorithm from scratch. Probably, you

see

the better solution of this problem.

The best way to sum up the problem is to consider how we deal with
repeated updates to a single tuple that flip the value from A to B
then back to A then to B then A etc.. Any value in the index can point
to multiple versions of the same tuple and multiple index values can
point to the same tuple (PK value). This problem behaviour was already
known to me from Claudio's earlier analysis of WARM (thanks Claudio).

Thank you for pointing. I didn't follow details of WARM discussion.

Yes, VACUUMing that is likely to be a complex issue, as you say. At

the moment I don't have a plan for that, but am not worried.

AFAICS, the main goal of indirect indexes is to reduce their maintenance
cost. Indirect indexes are much easier to maintain during UPDATEs and this
is good. But it's harder to VACUUM them. So, we need to figure out how
much maintenance cost would be reduced for indirect indexes. This is why I
think digging into VACUUM problems is justified for now.

Indirect indexes produce less index entries in general than current,

so the problem is by-design much smaller than current situation.
Indirect indexes can support killed-tuple interface, so scanning the
index by users will result in natural index maintenance, further
reducing the problem.

That makes sense. But that is not necessary true for any workload. For
instance, keys, which are frequently updated, are not necessary same that
keys, which are frequently selected. Thus, there is still some risk of
bloat.

So there will be a much reduced need for bulk maintenance. Bulk

maintainence of the index, when needed, can be performed by scanning
the whole table via the index, after the PK index has been vacuumed.

That's possible, but such vacuum is going to be very IO consuming when heap
doesn't fit cache. It's even possible that rebuilding of index would be
cheaper.

That can be optimized using an index-only scan of the PK to avoid
touching the heap, which should be effective since the VM has been so
recently refreshed.

But we can't get which of indirect index keys still persist in heap by
using index only scan by PK, because PK doesn't contain those keys. So, we
still need to scan heap for it.

For correctness it would require the index blocks
to be locked against write while checking for removal, so bulk
collection of values to optimize the underlying index doesn't seem
useful. The index scan could also be further optimized by introducing
a visibility map for the index, which is something that would also
optimize normal index VACUUMs as well, but that is a later project and
not something for 10.x

Visibility map for indexes sounds interesting. And that means including
visibility information into index. It's important property of current MVCC
implementation of PostgreSQL, that while updating heap tuple, we don't have
to find location of old index tuples referring it, we only have to insert
new index tuples. Finding location of old index tuples, even for barely
updating index visibility map, would be a substantial change.

At this stage, the discussion should be 1) can it work? 2) do we want

it?

I think that we definitely need indirect indexes and they might work. The
question is design. PostgreSQL MVCC is designed so that index contain no
visibility information. So, currently we're discussing approach which
implies expanding of this design to indirect indexes. The downsides of
this approach are (at least): 1) we should always recheck results obtained
from index, 2) VACUUM becomes very difficult.

There is also alternative approach: include visibility information into
indirect index. In this approach we should include fields required for
visibility (xmin, xmax, etc) into indirect index tuple and keep them up to
date. Then while updating indexed column we would have to update old index
tuple as well. This is the downside. But we would be able to scan without
recheck and VACUUM will be much more easier. We would be even able to
VACUUM indirect index independently from heap. Implementation of this
approach would be way more intrusive. But in my opinion it's much more
clear design.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#21Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alexander Korotkov (#6)
#22Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#18)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#22)
#24Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#23)
#25Joshua D. Drake
jd@commandprompt.com
In reply to: Alvaro Herrera (#23)
#26Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Joshua D. Drake (#25)
#27Petr Jelinek
petr@2ndquadrant.com
In reply to: Bruce Momjian (#22)
#28Bruce Momjian
bruce@momjian.us
In reply to: Petr Jelinek (#27)
#29Claudio Freire
klaussfreire@gmail.com
In reply to: Petr Jelinek (#27)
#30Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Petr Jelinek (#27)
#31Petr Jelinek
petr@2ndquadrant.com
In reply to: Bruce Momjian (#28)
#32Claudio Freire
klaussfreire@gmail.com
In reply to: Pavan Deolasee (#30)
#33Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Claudio Freire (#32)
#34Claudio Freire
klaussfreire@gmail.com
In reply to: Pavan Deolasee (#33)
#35Pantelis Theodosiou
ypercube@gmail.com
In reply to: Bruce Momjian (#28)
#36Sven R. Kunze
srkunze@mail.de
In reply to: Pantelis Theodosiou (#35)
#37Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Robert Haas (#10)
#38Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Sven R. Kunze (#36)
#39Adam Brusselback
adambrusselback@gmail.com
In reply to: Jim Nasby (#37)
#40Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Adam Brusselback (#39)
#41Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#10)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#41)
#43Sven R. Kunze
srkunze@mail.de
In reply to: Jim Nasby (#38)
#44Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#1)
#45Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#44)
#46Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Andres Freund (#45)
#47Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Haribabu Kommi (#46)
#48Haribabu Kommi
kommi.haribabu@gmail.com
In reply to: Alvaro Herrera (#47)
#49Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#41)
#50Robert Haas
robertmhaas@gmail.com
In reply to: Pavan Deolasee (#30)
#51Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Robert Haas (#50)
#52Robert Haas
robertmhaas@gmail.com
In reply to: Pavan Deolasee (#51)
#53Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#1)
#54Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#53)
#55Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#54)
#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#55)
#57Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alvaro Herrera (#55)
#58Bruce Momjian
bruce@momjian.us
In reply to: Alvaro Herrera (#57)
#59Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#57)
#60Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#57)
#61David Steele
david@pgmasters.net
In reply to: Michael Paquier (#60)