Index Skip Scan

Started by Jesper Pedersenalmost 8 years ago152 messageshackers

jesper.pedersen@redhat.com

almost 8 years ago

Hi all,

I would like to start a discussion on Index Skip Scan referred to as
Loose Index Scan in the wiki [1]https://wiki.postgresql.org/wiki/Loose_indexscan.

My use-case is the simplest form of Index Skip Scan (B-Tree only),
namely going from

CREATE TABLE t1 (a integer PRIMARY KEY, b integer);
CREATE INDEX idx_t1_b ON t1 (b);
INSERT INTO t1 (SELECT i, i % 3 FROM generate_series(1, 10000000) as i);
ANALYZE;
EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
HashAggregate (cost=169247.71..169247.74 rows=3 width=4) (actual
time=4104.099..4104.099 rows=3 loops=1)
Output: b
Group Key: t1.b
Buffers: shared hit=44248
-> Seq Scan on public.t1 (cost=0.00..144247.77 rows=9999977
width=4) (actual time=0.059..1050.376 rows=10000000 loops=1)
Output: a, b
Buffers: shared hit=44248
Planning Time: 0.157 ms
Execution Time: 4104.155 ms
(9 rows)

CREATE TABLE t1 (a integer PRIMARY KEY, b integer);
CREATE INDEX idx_t1_b ON t1 (b);
INSERT INTO t1 (SELECT i, i % 3 FROM generate_series(1, 10000000) as i);
ANALYZE;
EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
Index Skip Scan using idx_t1_b on public.t1 (cost=0.43..1.30 rows=3
width=4) (actual time=0.061..0.137 rows=3 loops=1)
Output: b
Heap Fetches: 3
Buffers: shared hit=13
Planning Time: 0.155 ms
Execution Time: 0.170 ms
(6 rows)

I took Thomas Munro's previous patch [2]/messages/by-id/CADLWmXXbTSBxP-MzJuPAYSsL_2f0iPm5VWPbCvDbVvfX93FKkw@mail.gmail.com on the subject, added a GUC, a
test case, documentation hooks, minor code cleanups, and made the patch
pass an --enable-cassert make check-world run. So, the overall design is
the same.

However, as Robert Haas noted in the thread there are issues with the
patch as is, especially in relationship to the amcanbackward functionality.

A couple of questions to begin with.

Should the patch continue to "piggy-back" on T_IndexOnlyScan, or should
a new node (T_IndexSkipScan) be created ? If latter, then there likely
will be functionality that needs to be refactored into shared code
between the nodes.

Which is the best way to deal with the amcanbackward functionality ? Do
people see another alternative to Robert's idea of adding a flag to the
scan.

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. The patch is based on master/4c8156.

Any feedback, suggestions, design ideas and help with the patch in
general is greatly appreciated.

Thanks in advance !

[1]: https://wiki.postgresql.org/wiki/Loose_indexscan
[2]: /messages/by-id/CADLWmXXbTSBxP-MzJuPAYSsL_2f0iPm5VWPbCvDbVvfX93FKkw@mail.gmail.com
/messages/by-id/CADLWmXXbTSBxP-MzJuPAYSsL_2f0iPm5VWPbCvDbVvfX93FKkw@mail.gmail.com

Best regards,
Jesper

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Jesper Pedersen (#1)

Re: Index Skip Scan

Hi!

On Mon, Jun 18, 2018 at 6:26 PM Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:

I would like to start a discussion on Index Skip Scan referred to as
Loose Index Scan in the wiki [1].

Great, I glad to see you working in this!

However, as Robert Haas noted in the thread there are issues with the
patch as is, especially in relationship to the amcanbackward functionality.

A couple of questions to begin with.

Should the patch continue to "piggy-back" on T_IndexOnlyScan, or should
a new node (T_IndexSkipScan) be created ? If latter, then there likely
will be functionality that needs to be refactored into shared code
between the nodes.

Is skip scan only possible for index-only scan? I guess, that no. We
could also make plain index scan to behave like a skip scan. And it
should be useful for accelerating DISTINCT ON clause. Thus, we might
have 4 kinds of index scan: IndexScan, IndexOnlyScan, IndexSkipScan,
IndexOnlySkipScan. So, I don't think I like index scan nodes to
multiply this way, and it would be probably better to keep number
nodes smaller. But I don't insist on that, and I would like to hear
other opinions too.

Which is the best way to deal with the amcanbackward functionality ? Do
people see another alternative to Robert's idea of adding a flag to the
scan.

Supporting amcanbackward seems to be basically possible, but rather
complicated and not very efficient. So, it seems to not worth
implementing, at least in the initial version. And then the question
should how index access method report that it supports both skip scan
and backward scan, but not both together? What about turning
amcanbackward into a function which takes (bool skipscan) argument?
Therefore, this function will return whether backward scan is
supported depending of whether skip scan is used.

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. The patch is based on master/4c8156.

Please, register it on commitfest. If even there wouldn't be enough
of time for this patch on July commitfest, it's no problem to move it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Andrew Dunstan

andrew@dunslane.net

almost 8 years ago

In reply to: Jesper Pedersen (#1)

Re: Index Skip Scan

On 06/18/2018 11:25 AM, Jesper Pedersen wrote:

Hi all,

I would like to start a discussion on Index Skip Scan referred to as
Loose Index Scan in the wiki [1].

awesome

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. T

New large features are not appropriate for the July CF. September should
be your goal.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Alexander Korotkov

aekorotkov@gmail.com

almost 8 years ago

In reply to: Andrew Dunstan (#3)

Re: Index Skip Scan

On Mon, Jun 18, 2018 at 11:20 PM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

On 06/18/2018 11:25 AM, Jesper Pedersen wrote:

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. T

New large features are not appropriate for the July CF. September should
be your goal.

Assuming this, should we have possibility to register patch to
September CF from now?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Michael Paquier

michael@paquier.xyz

almost 8 years ago

In reply to: Alexander Korotkov (#4)

Re: Index Skip Scan

On Tue, Jun 19, 2018 at 12:06:59AM +0300, Alexander Korotkov wrote:

Assuming this, should we have possibility to register patch to
September CF from now?

There cannot be two commit fests marked as open at the same time as
Magnus mentions here:
/messages/by-id/CABUevEx1k+axZcV2t3wEYf5uLg72YbKSch_hUhFnZq+-KSoJsA@mail.gmail.com

There is no issue in registering new patches in future ones for admins,
but normal users cannot, right? In this case, could you wait that the
next CF is marked as in progress and that the one of September is
opened? You could also add it to the July one if you are not willing to
wait, and that will get bumped by one of the CFMs, but this makes the
whole process unnecessary noisy.
--
Michael

Dmitry Dolgov

9erthalion6@gmail.com

almost 8 years ago

In reply to: Michael Paquier (#5)

Re: Index Skip Scan

On 18 June 2018 at 19:31, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

A couple of questions to begin with.

Should the patch continue to "piggy-back" on T_IndexOnlyScan, or should
a new node (T_IndexSkipScan) be created ? If latter, then there likely
will be functionality that needs to be refactored into shared code
between the nodes.

Is skip scan only possible for index-only scan? I guess, that no. We
could also make plain index scan to behave like a skip scan. And it
should be useful for accelerating DISTINCT ON clause. Thus, we might
have 4 kinds of index scan: IndexScan, IndexOnlyScan, IndexSkipScan,
IndexOnlySkipScan. So, I don't think I like index scan nodes to
multiply this way, and it would be probably better to keep number
nodes smaller. But I don't insist on that, and I would like to hear
other opinions too.

In one of patches I'm working on I had similar situation, when I wanted to
split one node into two similar nodes (before I just extended it) and logically
it made perfect sense. But it turned out to be quite useless and the advantage
I've got wasn't worth it - and just to mention, those nodes had more differences
than in this patch. So I agree that probably it would be better to keep using
IndexOnlyScan.

On 19 June 2018 at 03:40, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Jun 19, 2018 at 12:06:59AM +0300, Alexander Korotkov wrote:

Assuming this, should we have possibility to register patch to
September CF from now?

There cannot be two commit fests marked as open at the same time as
Magnus mentions here:
/messages/by-id/CABUevEx1k+axZcV2t3wEYf5uLg72YbKSch_hUhFnZq+-KSoJsA@mail.gmail.com

In this case, could you wait that the next CF is marked as in progress and
that the one of September is opened?

Yep, since the next CF will start shortly that's the easiest thing to do.

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Alexander Korotkov (#2)

Re: Index Skip Scan

Hi Alexander,

On 06/18/2018 01:31 PM, Alexander Korotkov wrote:

<jesper.pedersen@redhat.com> wrote:

Should the patch continue to "piggy-back" on T_IndexOnlyScan, or should
a new node (T_IndexSkipScan) be created ? If latter, then there likely
will be functionality that needs to be refactored into shared code
between the nodes.

Is skip scan only possible for index-only scan? I guess, that no. We
could also make plain index scan to behave like a skip scan. And it
should be useful for accelerating DISTINCT ON clause. Thus, we might
have 4 kinds of index scan: IndexScan, IndexOnlyScan, IndexSkipScan,
IndexOnlySkipScan. So, I don't think I like index scan nodes to
multiply this way, and it would be probably better to keep number
nodes smaller. But I don't insist on that, and I would like to hear
other opinions too.

Yes, there are likely other use-cases for Index Skip Scan apart from the
simplest form. Which sort of suggests that having dedicated nodes would
be better in the long run.

My goal is to cover the simplest form, which can be handled by extending
the T_IndexOnlyScan node, or by having common functions that both use.
We can always improve the functionality with future patches.

Which is the best way to deal with the amcanbackward functionality ? Do
people see another alternative to Robert's idea of adding a flag to the
scan.

Supporting amcanbackward seems to be basically possible, but rather
complicated and not very efficient. So, it seems to not worth
implementing, at least in the initial version. > And then the question
should how index access method report that it supports both skip scan
and backward scan, but not both together? What about turning
amcanbackward into a function which takes (bool skipscan) argument?
Therefore, this function will return whether backward scan is
supported depending of whether skip scan is used.

The feedback from Robert Haas seems to suggest that it was a requirement
for the patch to be considered.

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. The patch is based on master/4c8156.

Please, register it on commitfest. If even there wouldn't be enough
of time for this patch on July commitfest, it's no problem to move it.

Based on the feedback from Andrew and Michael I won't register this
thread until the September CF.

Thanks for your feedback !

Best regards,
Jesper

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Dmitry Dolgov (#6)

Re: Index Skip Scan

Hi Dmitry,

On 06/19/2018 06:01 AM, Dmitry Dolgov wrote:

On 18 June 2018 at 19:31, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:
Is skip scan only possible for index-only scan? I guess, that no. We
could also make plain index scan to behave like a skip scan. And it
should be useful for accelerating DISTINCT ON clause. Thus, we might
have 4 kinds of index scan: IndexScan, IndexOnlyScan, IndexSkipScan,
IndexOnlySkipScan. So, I don't think I like index scan nodes to
multiply this way, and it would be probably better to keep number
nodes smaller. But I don't insist on that, and I would like to hear
other opinions too.

In one of patches I'm working on I had similar situation, when I wanted to
split one node into two similar nodes (before I just extended it) and logically
it made perfect sense. But it turned out to be quite useless and the advantage
I've got wasn't worth it - and just to mention, those nodes had more differences
than in this patch. So I agree that probably it would be better to keep using
IndexOnlyScan.

I looked at this today, and creating a new node (T_IndexOnlySkipScan)
would make the patch more complex.

The question is if the patch should create such a node such that future
patches didn't have to deal with refactoring to a new node to cover
additional functionality.

Thanks for your feedback !

Best regards,
Jesper

Bhushan Uparkar

bhushan.uparkar@gmail.com

almost 8 years ago

In reply to: Alexander Korotkov (#2)

Re: Index Skip Scan

Hello Jesper,

I was reviewing index-skip patch example and have a comment on it. Example query “select distinct b from t1” is equivalent to “select b from t1 group by b”. When I tried the 2nd form of query it came up with different plan, is it possible that index skip scan can address it as well?

postgres=# explain verbose select b from t1 group by b;
QUERY PLAN
----------------------------------------------------------------------------------------------------
Group (cost=97331.29..97332.01 rows=3 width=4)
Output: b
Group Key: t1.b
-> Gather Merge (cost=97331.29..97331.99 rows=6 width=4)
Output: b
Workers Planned: 2
-> Sort (cost=96331.27..96331.27 rows=3 width=4)
Output: b
Sort Key: t1.b
-> Partial HashAggregate (cost=96331.21..96331.24 rows=3 width=4)
Output: b
Group Key: t1.b
-> Parallel Seq Scan on public.t1 (cost=0.00..85914.57 rows=4166657 width=4)
Output: a, b
(14 rows)

Time: 1.167 ms

— And here is the original example
postgres=# explain verbose SELECT DISTINCT b FROM t1;
QUERY PLAN
-------------------------------------------------------------------------------
Index Skip Scan using idx_t1_b on public.t1 (cost=0.43..1.30 rows=3 width=4)
Output: b
(2 rows)

Time: 0.987 ms

Show quoted text

On Jun 18, 2018, at 10:31 AM, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote:

Hi!

On Mon, Jun 18, 2018 at 6:26 PM Jesper Pedersen
<jesper.pedersen@redhat.com> wrote:

I would like to start a discussion on Index Skip Scan referred to as
Loose Index Scan in the wiki [1].

Great, I glad to see you working in this!

However, as Robert Haas noted in the thread there are issues with the
patch as is, especially in relationship to the amcanbackward functionality.

A couple of questions to begin with.

Should the patch continue to "piggy-back" on T_IndexOnlyScan, or should
a new node (T_IndexSkipScan) be created ? If latter, then there likely
will be functionality that needs to be refactored into shared code
between the nodes.

Is skip scan only possible for index-only scan? I guess, that no. We
could also make plain index scan to behave like a skip scan. And it
should be useful for accelerating DISTINCT ON clause. Thus, we might
have 4 kinds of index scan: IndexScan, IndexOnlyScan, IndexSkipScan,
IndexOnlySkipScan. So, I don't think I like index scan nodes to
multiply this way, and it would be probably better to keep number
nodes smaller. But I don't insist on that, and I would like to hear
other opinions too.

Which is the best way to deal with the amcanbackward functionality ? Do
people see another alternative to Robert's idea of adding a flag to the
scan.

Supporting amcanbackward seems to be basically possible, but rather
complicated and not very efficient. So, it seems to not worth
implementing, at least in the initial version. And then the question
should how index access method report that it supports both skip scan
and backward scan, but not both together? What about turning
amcanbackward into a function which takes (bool skipscan) argument?
Therefore, this function will return whether backward scan is
supported depending of whether skip scan is used.

I wasn't planning on making this a patch submission for the July
CommitFest due to the reasons mentioned above, but can do so if people
thinks it is best. The patch is based on master/4c8156.

Please, register it on commitfest. If even there wouldn't be enough
of time for this patch on July commitfest, it's no problem to move it.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#10

Thomas Munro

thomas.munro@gmail.com

almost 8 years ago

In reply to: Bhushan Uparkar (#9)

Re: Index Skip Scan

On Thu, Aug 16, 2018 at 5:44 PM, Bhushan Uparkar
<bhushan.uparkar@gmail.com> wrote:

I was reviewing index-skip patch example and have a comment on it. Example query “select distinct b from t1” is equivalent to “select b from t1 group by b”. When I tried the 2nd form of query it came up with different plan, is it possible that index skip scan can address it as well?

Yeah, there are a few tricks you can do with "index skip scans"
(Oracle name, or as IBM calls them, "index jump scans"... I was
slightly tempted to suggest we call ours "index hop scans"...). For
example:

* groups and certain aggregates (MIN() and MAX() of suffix index
columns within each group)
* index scans where the scan key doesn't include the leading columns
(but you expect there to be sufficiently few values)
* merge joins (possibly the trickiest and maybe out of range)

You're right that a very simple GROUP BY can be equivalent to a
DISTINCT query, but I'm not sure if it's worth recognising that
directly or trying to implement the more general grouping trick that
can handle MIN/MAX, and whether that should be the same executor
node... The idea of starting with DISTINCT was just that it's
comparatively easy. We should certainly try to look ahead and bear
those features in mind when figuring out the interfaces though. Would
the indexam skip(scan, direction, prefix_size) operation I proposed be
sufficient? Is there a better way?

I'm glad to see this topic come back!

--
Thomas Munro
http://www.enterprisedb.com

#11

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Bhushan Uparkar (#9)

Re: Index Skip Scan

Hi Bhushan,

On 08/16/2018 01:44 AM, Bhushan Uparkar wrote:

I was reviewing index-skip patch example and have a comment on it.

Thanks for your interest, and feedback on this patch !

Example query “select distinct b from t1” is equivalent to “select b from t1 group by b”. When I tried the 2nd form > of query it came up with different plan, is it possible that index skip scan can address it as well?

Like Thomas commented down-thread my goal is to keep this contribution
as simple as possible in order to get to something that can be
committed. Improvements can follow in future CommitFests, which may end
up in the same release.

However, as stated in my original mail my goal is the simplest form of
Index Skip Scan (or whatever we call it). I welcome any help with the patch.

Best regards,
Jesper

#12

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Thomas Munro (#10)

Re: Index Skip Scan

Hi Thomas,

On 08/16/2018 02:22 AM, Thomas Munro wrote:

The idea of starting with DISTINCT was just that it's
comparatively easy. We should certainly try to look ahead and bear
those features in mind when figuring out the interfaces though. Would
the indexam skip(scan, direction, prefix_size) operation I proposed be
sufficient? Is there a better way?

Yeah, I'm hoping that a Committer can provide some feedback on the
direction that this patch needs to take.

One thing to consider is the pluggable storage patch, which is a lot
more important than this patch. I don't want this patch to get in the
way of that work, so it may have to wait a bit in order to see any new
potential requirements.

I'm glad to see this topic come back!

You did the work, and yes hopefully we can get closer to this subject in
12 :)

Best regards,
Jesper

#13

Stephen Frost

sfrost@snowman.net

almost 8 years ago

In reply to: Jesper Pedersen (#12)

Re: Index Skip Scan

Greetings,

* Jesper Pedersen (jesper.pedersen@redhat.com) wrote:

On 08/16/2018 02:22 AM, Thomas Munro wrote:

The idea of starting with DISTINCT was just that it's
comparatively easy. We should certainly try to look ahead and bear
those features in mind when figuring out the interfaces though. Would
the indexam skip(scan, direction, prefix_size) operation I proposed be
sufficient? Is there a better way?

Yeah, I'm hoping that a Committer can provide some feedback on the direction
that this patch needs to take.

Thomas is one these days. :)

At least on first glance, that indexam seems to make sense to me, but
I've not spent a lot of time thinking about it. Might be interesting to
ask Peter G about it though.

One thing to consider is the pluggable storage patch, which is a lot more
important than this patch. I don't want this patch to get in the way of that
work, so it may have to wait a bit in order to see any new potential
requirements.

Not sure where this came from, but I don't think it's particularly good
to be suggesting that one feature is more important than another or that
we need to have one wait for another as this seems to imply. I'd
certainly really like to see PG finally have skipping scans, for one
thing, and it seems like with some effort that might be able to happen
for v12. I'm not convinced that we're going to get pluggable storage to
happen in v12 and I don't really agree with recommending that people
hold off on making improvements to things because it's coming.

Thanks!

Stephen

#14

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Stephen Frost (#13)

Re: Index Skip Scan

Hi Stephen,

On 08/16/2018 02:36 PM, Stephen Frost wrote:

Yeah, I'm hoping that a Committer can provide some feedback on the direction
that this patch needs to take.

Thomas is one these days. :)

I know :) However, there are some open questions from Thomas' original
submission that still needs to be ironed out.

At least on first glance, that indexam seems to make sense to me, but
I've not spent a lot of time thinking about it. Might be interesting to
ask Peter G about it though.

Yes, or Anastasia who also have done a lot of work on nbtree/.

One thing to consider is the pluggable storage patch, which is a lot more
important than this patch. I don't want this patch to get in the way of that
work, so it may have to wait a bit in order to see any new potential
requirements.

Not sure where this came from, but I don't think it's particularly good
to be suggesting that one feature is more important than another or that
we need to have one wait for another as this seems to imply. I'd
certainly really like to see PG finally have skipping scans, for one
thing, and it seems like with some effort that might be able to happen
for v12. I'm not convinced that we're going to get pluggable storage to
happen in v12 and I don't really agree with recommending that people
hold off on making improvements to things because it's coming.

My point was that I know this patch needs work, so any feedback that get
it closer to a solution will help.

Pluggable storage may / may not add new requirements, but it is up to
the people working on that, some of which are Committers, to take time
"off" to provide feedback for this patch in order steer me in right
direction.

Work can happen in parallel, and I'm definitely not recommending that
people hold off on any patches that they want to provide feedback for,
or submit for a CommitFest.

Best regards,
Jesper

#15

Stephen Frost

sfrost@snowman.net

almost 8 years ago

In reply to: Jesper Pedersen (#14)

Re: Index Skip Scan

Greetings,

* Jesper Pedersen (jesper.pedersen@redhat.com) wrote:

On 08/16/2018 02:36 PM, Stephen Frost wrote:

Not sure where this came from, but I don't think it's particularly good
to be suggesting that one feature is more important than another or that
we need to have one wait for another as this seems to imply. I'd
certainly really like to see PG finally have skipping scans, for one
thing, and it seems like with some effort that might be able to happen
for v12. I'm not convinced that we're going to get pluggable storage to
happen in v12 and I don't really agree with recommending that people
hold off on making improvements to things because it's coming.

My point was that I know this patch needs work, so any feedback that get it
closer to a solution will help.

Pluggable storage may / may not add new requirements, but it is up to the
people working on that, some of which are Committers, to take time "off" to
provide feedback for this patch in order steer me in right direction.

I don't think it's really necessary for this work to be suffering under
some concern that pluggable storage will make it have to change. Sure,
it might, but it also very well might not. For my 2c, anyway, this
seems likely to get into the tree before pluggable storage does and it's
pretty unlikely to be the only thing that that work will need to be
prepared to address when it happens.

Work can happen in parallel, and I'm definitely not recommending that people
hold off on any patches that they want to provide feedback for, or submit
for a CommitFest.

Yes, work can happen in parallel, and I don't really think there needs
to be a concern about some other patch set when it comes to getting this
patch committed.

Thanks!

Stephen

#16

Peter Geoghegan

pg@bowt.ie

almost 8 years ago

In reply to: Thomas Munro (#10)

Re: Index Skip Scan

On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Yeah, there are a few tricks you can do with "index skip scans"
(Oracle name, or as IBM calls them, "index jump scans"... I was
slightly tempted to suggest we call ours "index hop scans"...).

Hopscotch scans?

* groups and certain aggregates (MIN() and MAX() of suffix index
columns within each group)
* index scans where the scan key doesn't include the leading columns
(but you expect there to be sufficiently few values)
* merge joins (possibly the trickiest and maybe out of range)

FWIW, I suspect that we're going to have the biggest problems in the
optimizer. It's not as if ndistinct is in any way reliable. That may
matter more on average than it has with other path types.

--
Peter Geoghegan

#17

Andres Freund

andres@anarazel.de

almost 8 years ago

In reply to: Jesper Pedersen (#12)

Re: Index Skip Scan

On August 16, 2018 8:28:45 PM GMT+02:00, Jesper Pedersen <jesper.pedersen@redhat.com> wrote:

One thing to consider is the pluggable storage patch, which is a lot
more important than this patch. I don't want this patch to get in the
way of that work, so it may have to wait a bit in order to see any new
potential requirements.

I don't think this would be a meaningful, relative to the size of the patch sets, amount of conflict between the two. So I don't think we have to consider relative importance (which I don't think is that easy to assess in this case).

Fwiw, I've a significantly further revised version of the tableam patch that I plan to send in a few days. Ported the current zheap patch as a separate AM which helped weed out a lot of issues.

Andres

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

#18

Thomas Munro

thomas.munro@gmail.com

almost 8 years ago

In reply to: Peter Geoghegan (#16)

Re: Index Skip Scan

On Fri, Aug 17, 2018 at 7:48 AM, Peter Geoghegan <pg@bowt.ie> wrote:

On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

* groups and certain aggregates (MIN() and MAX() of suffix index
columns within each group)
* index scans where the scan key doesn't include the leading columns
(but you expect there to be sufficiently few values)
* merge joins (possibly the trickiest and maybe out of range)

FWIW, I suspect that we're going to have the biggest problems in the
optimizer. It's not as if ndistinct is in any way reliable. That may
matter more on average than it has with other path types.

Can you give an example of problematic ndistinct underestimation?

I suppose you might be able to defend against that in the executor: if
you find that you've done an unexpectedly high number of skips, you
could fall back to regular next-tuple mode. Unfortunately that's
require the parent plan node to tolerate non-unique results.

I noticed that the current patch doesn't care about restrictions on
the range (SELECT DISTINCT a FROM t WHERE a BETWEEN 500 and 600), but
that causes it to overestimate the number of btree searches, which is
a less serious problem (it might not chose a skip scan when it would
have been better).

--
Thomas Munro
http://www.enterprisedb.com

#19

Jesper Pedersen

jesper.pedersen@redhat.com

almost 8 years ago

In reply to: Peter Geoghegan (#16)

Re: Index Skip Scan

Hi Peter,

On 08/16/2018 03:48 PM, Peter Geoghegan wrote:

On Wed, Aug 15, 2018 at 11:22 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

* groups and certain aggregates (MIN() and MAX() of suffix index
columns within each group)
* index scans where the scan key doesn't include the leading columns
(but you expect there to be sufficiently few values)
* merge joins (possibly the trickiest and maybe out of range)

FWIW, I suspect that we're going to have the biggest problems in the
optimizer. It's not as if ndistinct is in any way reliable. That may
matter more on average than it has with other path types.

Thanks for sharing this; it is very useful to know.

Best regards,
Jesper

#20

Peter Geoghegan

pg@bowt.ie

almost 8 years ago

In reply to: Thomas Munro (#18)

Re: Index Skip Scan

On Thu, Aug 16, 2018 at 4:10 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:

Can you give an example of problematic ndistinct underestimation?

Yes. See /messages/by-id/CAKuK5J12QokFh88tQz-oJMSiBg2QyjM7K7HLnbYi3Ze+Y5BtWQ@mail.gmail.com,
for example. That's a complaint about an underestimation specifically.

This seems to come up about once every 3 years, at least from my
perspective. I'm always surprised that ndistinct doesn't get
implicated in bad query plans more frequently.

I suppose you might be able to defend against that in the executor: if
you find that you've done an unexpectedly high number of skips, you
could fall back to regular next-tuple mode. Unfortunately that's
require the parent plan node to tolerate non-unique results.

I like the idea of dynamic fallback in certain situations, but the
details always seem complicated.

--
Peter Geoghegan

#21

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Peter Geoghegan (#20)

#22

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Dmitry Dolgov (#21)

#23

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 7 years ago

In reply to: Jesper Pedersen (#22)

#24

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Alexander Kuzmenkov (#23)

#25

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 7 years ago

In reply to: Jesper Pedersen (#24)

#26

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Alexander Kuzmenkov (#25)

#27

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Dmitry Dolgov (#26)

#28

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Jesper Pedersen (#27)

#29

Pavel Stehule

pavel.stehule@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#28)

#30

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Pavel Stehule (#29)

#31

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Pavel Stehule (#29)

#32

Pavel Stehule

pavel.stehule@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#31)

#33

Pavel Stehule

pavel.stehule@gmail.com

over 7 years ago

In reply to: Pavel Stehule (#32)

#34

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Pavel Stehule (#33)

#35

Robert Haas

robertmhaas@gmail.com

over 7 years ago

In reply to: Jesper Pedersen (#24)

#36

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Robert Haas (#35)

#37

Sergei Kornilov

sk@zsrv.org

over 7 years ago

In reply to: Dmitry Dolgov (#36)

#38

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Sergei Kornilov (#37)

#39

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#38)

#40

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 7 years ago

In reply to: Jesper Pedersen (#27)

#41

James Coleman

jtc331@gmail.com

over 7 years ago

In reply to: Alexander Kuzmenkov (#40)

#42

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Alexander Kuzmenkov (#40)

#43

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Jesper Pedersen (#42)

#44

Alexander Kuzmenkov

a.kuzmenkov@postgrespro.ru

over 7 years ago

In reply to: Dmitry Dolgov (#43)

#45

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Alexander Kuzmenkov (#44)

#46

Peter Geoghegan

pg@bowt.ie

over 7 years ago

In reply to: Dmitry Dolgov (#45)

#47

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Peter Geoghegan (#46)

#48

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#47)

#49

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#48)

#50

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#49)

#51

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 7 years ago

In reply to: Dmitry Dolgov (#50)

#52

Jesper Pedersen

jesper.pedersen@redhat.com

over 7 years ago

In reply to: Kyotaro Horiguchi (#51)

#53

James Coleman

jtc331@gmail.com

over 7 years ago

In reply to: Kyotaro Horiguchi (#51)

#54

Andres Freund

andres@anarazel.de

over 7 years ago

In reply to: James Coleman (#53)

#55

Dmitry Dolgov

9erthalion6@gmail.com

over 7 years ago

In reply to: Andres Freund (#54)

#56

Jeff Janes

jeff.janes@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#55)

#57

Jeff Janes

jeff.janes@gmail.com

about 7 years ago

In reply to: Kyotaro Horiguchi (#51)

#58

Thomas Munro

thomas.munro@gmail.com

about 7 years ago

In reply to: Jeff Janes (#57)

#59

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 7 years ago

In reply to: Thomas Munro (#58)

#60

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Tomas Vondra (#59)

#61

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#60)

#62

Kyotaro Horiguchi

horikyota.ntt@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#61)

#63

Kyotaro Horiguchi

horikyota.ntt@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#61)

#64

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Kyotaro Horiguchi (#63)

#65

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#64)

#66

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#65)

#67

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#66)

#68

Dmitry Dolgov

9erthalion6@gmail.com

about 7 years ago

In reply to: Dmitry Dolgov (#67)

#69

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#68)

#70

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Floris Van Nee (#69)

#71

Rafia Sabih

rafia.pghackers@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#70)

#72

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#70)

#73

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#72)

#74

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#73)

#75

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Floris Van Nee (#70)

#76

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Rafia Sabih (#71)

#77

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#76)

#78

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Tomas Vondra (#59)

#79

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Floris Van Nee (#78)

#80

James Coleman

jtc331@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#79)

#81

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#79)

#82

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: James Coleman (#80)

#83

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: David Rowley (#81)

#84

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: David Rowley (#81)

#85

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Dmitry Dolgov (#84)

#86

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Dmitry Dolgov (#85)

#87

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Jesper Pedersen (#86)

#88

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#87)

#89

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#88)

#90

Peter Geoghegan

pg@bowt.ie

almost 7 years ago

In reply to: Floris Van Nee (#89)

#91

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Peter Geoghegan (#90)

#92

Tomas Vondra

tomas.vondra@2ndquadrant.com

almost 7 years ago

In reply to: Dmitry Dolgov (#85)

#93

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Tomas Vondra (#92)

#94

Tomas Vondra

tomas.vondra@2ndquadrant.com

almost 7 years ago

In reply to: Dmitry Dolgov (#93)

#95

Thomas Munro

thomas.munro@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#86)

#96

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Thomas Munro (#95)

#97

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: David Rowley (#96)

#98

David Fetter

david@fetter.org

almost 7 years ago

In reply to: David Rowley (#96)

#99

James Coleman

jtc331@gmail.com

almost 7 years ago

In reply to: David Fetter (#98)

#100

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: James Coleman (#99)

#101

David Fetter

david@fetter.org

almost 7 years ago

In reply to: David Rowley (#100)

#102

Thomas Munro

thomas.munro@gmail.com

almost 7 years ago

In reply to: David Rowley (#96)

#103

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Thomas Munro (#102)

#104

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Thomas Munro (#102)

#105

Thomas Munro

thomas.munro@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#104)

#106

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Thomas Munro (#105)

#107

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#103)

#108

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#107)

#109

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#108)

#110

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#109)

#111

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Floris Van Nee (#109)

#112

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#111)

#113

Thomas Munro

thomas.munro@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#107)

#114

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Thomas Munro (#113)

#115

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: David Rowley (#114)

#116

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#115)

#117

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Dmitry Dolgov (#108)

#118

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: David Rowley (#117)

#119

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: David Rowley (#117)

#120

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#119)

#121

David Rowley

dgrowleyml@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#119)

#122

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: David Rowley (#121)

#123

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Jesper Pedersen (#122)

#124

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 7 years ago

In reply to: Dmitry Dolgov (#123)

#125

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 7 years ago

In reply to: Kyotaro Horiguchi (#124)

#126

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Kyotaro Horiguchi (#125)

#127

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 7 years ago

In reply to: Dmitry Dolgov (#126)

#128

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Dmitry Dolgov (#126)

#129

Jesper Pedersen

jesper.pedersen@redhat.com

almost 7 years ago

In reply to: Dmitry Dolgov (#128)

#130

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Jesper Pedersen (#129)

#131

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#130)

#132

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#131)

#133

Dmitry Dolgov

9erthalion6@gmail.com

almost 7 years ago

In reply to: Floris Van Nee (#132)

#134

Floris Van Nee

florisvannee@Optiver.com

almost 7 years ago

In reply to: Dmitry Dolgov (#133)

#135

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Floris Van Nee (#134)

#136

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Dmitry Dolgov (#133)

#137

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Alvaro Herrera (#136)

#138

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Dmitry Dolgov (#137)

#139

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Alvaro Herrera (#138)

#140

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Dmitry Dolgov (#139)

#141

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 6 years ago

In reply to: Alvaro Herrera (#140)

#142

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 6 years ago

In reply to: Kyotaro Horiguchi (#141)

#143

Alvaro Herrera

alvherre@2ndquadrant.com

over 6 years ago

In reply to: Kyotaro Horiguchi (#142)

#144

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 6 years ago

In reply to: Alvaro Herrera (#143)

#145

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Kyotaro Horiguchi (#144)

#146

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Dmitry Dolgov (#145)

#147

Peter Geoghegan

pg@bowt.ie

over 6 years ago

In reply to: Peter Geoghegan (#146)

#148

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Peter Geoghegan (#147)

#149

Dmitry Dolgov

9erthalion6@gmail.com

over 6 years ago

In reply to: Dmitry Dolgov (#148)

#150

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 6 years ago

In reply to: Dmitry Dolgov (#149)

#151

Jesper Pedersen

jesper.pedersen@redhat.com

over 6 years ago

In reply to: Tomas Vondra (#150)

#152

Jesper Pedersen

jesper.pedersen@redhat.com

over 6 years ago

In reply to: Jesper Pedersen (#151)

Index Skip Scan

Attachments: