BRIN indexes and ORDER BY

Started by Darren Lafreniereover 9 years ago8 messagesgeneral

dlafreniere@onezero.com

over 9 years ago

Hello,

We're curious about the current behavior in 9.5.4, and possible future
enhancements, of BRIN indexes with respect to ordering.

In the docs, section 11.4. "Indexes and ORDER BY" (
https://www.postgresql.org/docs/9.5/static/indexes-ordering.html) is clear
that anything other than B-tree indexes have unspecified ordering:

"In addition to simply finding the rows to be returned by a query, an index
may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree can
produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Our current test shows that ordering by a BRIN indexed column still
performs an unoptimized sort:

SELECT generate_series(1, 10000000) AS id INTO test;
CREATE INDEX idx_test_id ON test USING BRIN (id);
EXPLAIN SELECT id FROM test ORDER BY id DESC LIMIT 20;

Limit (cost=410344.40..410344.45 rows=20 width=4)
-> Sort (cost=410344.40..435344.40 rows=1000000 width=4)"
Sort Key: id DESC
-> Seq Scan on test (cost=0.00..144248.00 rows=10000000 width=4)

Is there anything we're missing to speed this up? Or is it still a future
feature?

Thank you,
Darren Lafreniere

Alvaro Herrera

alvherre@2ndquadrant.com

over 9 years ago

In reply to: Darren Lafreniere (#1)

Re: BRIN indexes and ORDER BY

Darren Lafreniere wrote:

"In addition to simply finding the rows to be returned by a query, an index
may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree can
produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Darren Lafreniere

dlafreniere@onezero.com

over 9 years ago

In reply to: Alvaro Herrera (#2)

Re: BRIN indexes and ORDER BY

Ahh, yes. I misread that. Thank you for the clarification.

On Wed, Oct 5, 2016 at 2:27 PM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

Show quoted text

Darren Lafreniere wrote:

"In addition to simply finding the rows to be returned by a query, an

index

may be able to deliver them in a specific sorted order. This allows a
query's ORDER BY specification to be honored without a separate sorting
step. Of the index types currently supported by PostgreSQL, only B-tree

can

produce sorted output — the other index types return matching rows in an
unspecified, implementation-dependent order."

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Alvaro Herrera (#2)

Re: BRIN indexes and ORDER BY

Alvaro Herrera <alvherre@2ndquadrant.com> writes:

Darren Lafreniere wrote:

We found a pgsql-hackers thread from about a year ago about optimizing
ORDER BY for BRIN indexes. Tom Lane suggested that he was working on it:
/messages/by-id/11881.1443393360@sss.pgh.pa.us

Tom said he was working on some infrastructure planner changes
("upper-planner path-ification"), not that he was working on improving
usage of BRIN indexes. As far as I know, nobody has worked on that.

Alvaro's reading is correct; I wasn't planning to work on any such thing,
and still am not.

Looking again at the original thread:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in working
on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Darren Lafreniere

dlafreniere@onezero.com

over 9 years ago

In reply to: Tom Lane (#4)

Re: BRIN indexes and ORDER BY

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

regards, tom lane

Thanks Tom,

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

Darren Lafreniere

Stephen Frost

sfrost@snowman.net

over 9 years ago

In reply to: Darren Lafreniere (#5)

Re: BRIN indexes and ORDER BY

Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

[...]

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Thanks!

Stephen

Darren Lafreniere

dlafreniere@onezero.com

over 9 years ago

In reply to: Stephen Frost (#6)

Re: BRIN indexes and ORDER BY

Stephen Frost <sfrost@snowman.net> wrote:

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Thanks Stephen, this is exactly our use case.

Merlin Moncure

mmoncure@gmail.com

over 9 years ago

In reply to: Stephen Frost (#6)

Re: BRIN indexes and ORDER BY

On Wed, Oct 5, 2016 at 3:27 PM, Stephen Frost <sfrost@snowman.net> wrote:

Darren,

* Darren Lafreniere (dlafreniere@onezero.com) wrote:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Gavin Wahl wrote:

It seems trivial to accelerate a MAX or MIN query with a BRIN index. You
just find the page range with the largest/smallest value, and then only
scan that one. Would that be hard to implement? I'm interested in

working

on it if someone can give me some pointers.

I think this proposal is fairly broken anyway. The page range with the
largest max-value may once have contained the largest live row, but
there's no guarantee that it still does. It might even be completely
empty. You could imagine an algorithm like this:

1. Find page-range with largest max. Scan it to identify live row with
largest value. If *no* live values, find page-range with next largest
max, repeat until no page ranges remain (whereupon return NULL).

2. For each remaining page-range whose indexed max exceeds the value
currently in hand, scan that page-range to see if any value exceeds
the one in hand, replacing the value if so.

This'd probably allow you to omit scanning some of the page-ranges
in the table, but in a lot of cases you'd end up scanning many of them;
and you'd need a lot of working state to remember which ranges you'd
already looked at. It'd certainly always be a lot more expensive than
answering the same question with a btree index, because in no case do
you get to avoid scanning the entire contents of the index.

[...]

A b-tree index would certainly be faster for ordering. But in scenarios
where you have huge datasets that can't afford the space or update time
required for b-tree, could such a BRIN-accelerated ordering algorithm at
least be faster than ordering with no index?

For at least some of the common BRIN use-cases, where the rows are
inserted in-order and never/very-rarely modified or deleted, this
approach would work very well.

Certainly, using this would be much cheaper than a seqscan/top-N sort,
for small values of 'N', relative to the number of rows in the table,
in those cases.

In general, I like the idea of supporting this as BRIN indexes strike me
as very good for very large tables which have highly clumped data in
them and being able to do a top-N query on those can be very useful at
times.

Yeah. If the brin average page overlap and % dead tuple coefficients
are low it absolutely makes sense to drive top N with brin. It will
never beat a btree but typically brin is used when the btree index is
no good for various reasons.

brin indexes are pretty neat; they can provide stupefying amounts of
optimization in many common warehousing workloads. They even beat
out index only scans for a tiny fraction of the storage. Of course,
you have to work around the limitations... :-)

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general