BRIN indexes and ORDER BY

Started by Darren Lafreniereover 9 years ago8 messagesgeneral
Jump to latest
#1Darren Lafreniere
dlafreniere@onezero.com

Hello,

We're curious about the current behavior in 9.5.4, and possible future
enhancements, of BRIN indexes with respect to ordering.

In the docs, section 11.4. "Indexes and ORDER BY" (
https://www.postgresql.org/docs/9.5/static/indexes-ordering.html) is clear
that anything other than B-tree indexes have unspecified ordering:

"In addition to simply finding the rows to be returned by a query, an index
may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree can
produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Our current test shows that ordering by a BRIN indexed column still
performs an unoptimized sort:

SELECT generate_series(1, 10000000) AS id INTO test;
CREATE INDEX idx_test_id ON test USING BRIN (id);
EXPLAIN SELECT id FROM test ORDER BY id DESC LIMIT 20;

Limit (cost=410344.40..410344.45 rows=20 width=4)
-> Sort (cost=410344.40..435344.40 rows=1000000 width=4)"
Sort Key: id DESC
-> Seq Scan on test (cost=0.00..144248.00 rows=10000000 width=4)

Is there anything we're missing to speed this up? Or is it still a future
feature?

Thank you,
Darren Lafreniere

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Darren Lafreniere (#1)
Re: BRIN indexes and ORDER BY

Darren Lafreniere wrote:

"In addition to simply finding the rows to be returned by a query, an index
may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree can
produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Darren Lafreniere
dlafreniere@onezero.com
In reply to: Alvaro Herrera (#2)
Re: BRIN indexes and ORDER BY

Ahh, yes. I misread that. Thank you for the clarification.

On Wed, Oct 5, 2016 at 2:27 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Show quoted text

Darren Lafreniere wrote:

"In addition to simply finding the rows to be returned by a query, an

index

may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree

can

produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#2)
Re: BRIN indexes and ORDER BY

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Darren Lafreniere wrote:

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

Alvaro's reading is correct; I wasn't planning to work on any such thing,
and still am not.

Looking again at the original thread:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in working
on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Darren Lafreniere
dlafreniere@onezero.com
In reply to: Tom Lane (#4)
Re: BRIN indexes and ORDER BY

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

regards, tom lane

Thanks Tom,

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

Darren Lafreniere

#6Stephen Frost
sfrost@snowman.net
In reply to: Darren Lafreniere (#5)
Re: BRIN indexes and ORDER BY

Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

[...]

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Thanks!

Stephen

#7Darren Lafreniere
dlafreniere@onezero.com
In reply to: Stephen Frost (#6)
Re: BRIN indexes and ORDER BY

Stephen Frost <sfrost@snowman.net> wrote:

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Thanks Stephen, this is exactly our use case.

#8Merlin Moncure
mmoncure@gmail.com
In reply to: Stephen Frost (#6)
Re: BRIN indexes and ORDER BY

On Wed, Oct 5, 2016 at 3:27 PM, Stephen Frost <sfrost@snowman.net> wrote:

Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

[...]

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Yeah. If the brin average page overlap and % dead tuple coefficients
are low it absolutely makes sense to drive top N with brin. It will
never beat a btree but typically brin is used when the btree index is
no good for various reasons.

brin indexes are pretty neat; they can provide stupefying amounts of
optimization in many common warehousing workloads. They even beat
out index only scans for a tiny fraction of the storage. Of course,
you have to work around the limitations... :-)

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general