ctid ranges

Started by Thomas Munroabout 14 years ago7 messagesgeneral

Jump to latest

Thomas Munro

thomas.munro@gmail.com

about 14 years ago

In 9.1.3, this is fast, handled with a tid scan using the physical address:

SELECT ... FROM ... WHERE ctid = ...;

This is slow, handled with a seq scan (as are various rephrasing with
<, <=, etc):

SELECT ... FROM ... WHERE ctid BETWEEN ... AND ...;

Is there a way to retrieve the rows in a physical range quickly?

(I realise this is a pretty odd thing to want to do... I was
experimenting with a crackpot idea for storing some data in a known
physical order and finding the beginning of ends ranges by binary
chop, instead of using a btree.)

Thanks
Thomas Munro

Jeff Davis

pgsql@j-davis.com

about 14 years ago

In reply to: Thomas Munro (#1)

Re: ctid ranges

On Fri, 2012-06-08 at 22:27 +0100, Thomas Munro wrote:

This is slow, handled with a seq scan (as are various rephrasing with
<, <=, etc):

SELECT ... FROM ... WHERE ctid BETWEEN ... AND ...;

Is there a way to retrieve the rows in a physical range quickly?

Interesting idea. However, as far as I know, there is no such support.

Regards,
Jeff Davis

Merlin Moncure

mmoncure@gmail.com

about 14 years ago

In reply to: Jeff Davis (#2)

Re: ctid ranges

On Mon, Jun 11, 2012 at 7:57 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2012-06-08 at 22:27 +0100, Thomas Munro wrote:

This is slow, handled with a seq scan (as are various rephrasing with
<, <=, etc):

SELECT ... FROM ... WHERE ctid BETWEEN ... AND ...;

Is there a way to retrieve the rows in a physical range quickly?

Interesting idea. However, as far as I know, there is no such support.

yeah -- and I think it's a great thing to want to be able to do. it
could be used in parallelizing tricks for example: divide up a table
into N approximately equal parts and hand each one off to a work
thread.

merlin

Bruce Momjian

bruce@momjian.us

about 14 years ago

In reply to: Merlin Moncure (#3)

Re: ctid ranges

On Wed, Jun 13, 2012 at 03:15:14PM -0500, Merlin Moncure wrote:

On Mon, Jun 11, 2012 at 7:57 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, 2012-06-08 at 22:27 +0100, Thomas Munro wrote:

This is slow, handled with a seq scan (as are various rephrasing with
<, <=, etc):

SELECT ... FROM ... WHERE ctid BETWEEN ... AND ...;

Is there a way to retrieve the rows in a physical range quickly?

Interesting idea. However, as far as I know, there is no such support.

yeah -- and I think it's a great thing to want to be able to do. it
could be used in parallelizing tricks for example: divide up a table
into N approximately equal parts and hand each one off to a work
thread.

Can we add this as a TODO? It would basically be adding
less/greater-than comparisons for the 'tid' data type.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Merlin Moncure

mmoncure@gmail.com

about 14 years ago

In reply to: Bruce Momjian (#4)

Re: ctid ranges

On Wed, Jun 13, 2012 at 3:18 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Jun 13, 2012 at 03:15:14PM -0500, Merlin Moncure wrote:

yeah -- and I think it's a great thing to want to be able to do. it
could be used in parallelizing tricks for example: divide up a table
into N approximately equal parts and hand each one off to a work
thread.

Can we add this as a TODO? It would basically be adding
less/greater-than comparisons for the 'tid' data type.

IMNSHO, it's a no-brainer for the todo (but I think it's more
complicated than adding some comparisons -- which are working now):

postgres=# explain select ctid from foo where ctid >= '(3786,67)'::tid limit 1;
QUERY PLAN
------------------------------------------------------------------
Limit (cost=0.00..0.05 rows=1 width=6)
-> Seq Scan on foo (cost=0.00..16422.00 rows=333333 width=6)
Filter: (ctid >= '(3786,67)'::tid)
(3 rows)

merlin

Bruce Momjian

bruce@momjian.us

about 14 years ago

In reply to: Merlin Moncure (#5)

Re: ctid ranges

On Wed, Jun 13, 2012 at 03:21:17PM -0500, Merlin Moncure wrote:

On Wed, Jun 13, 2012 at 3:18 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Jun 13, 2012 at 03:15:14PM -0500, Merlin Moncure wrote:

yeah -- and I think it's a great thing to want to be able to do. ï¿½it
could be used in parallelizing tricks for example: divide up a table
into N approximately equal parts and hand each one off to a work
thread.

Can we add this as a TODO? ï¿½It would basically be adding
less/greater-than comparisons for the 'tid' data type.

IMNSHO, it's a no-brainer for the todo (but I think it's more
complicated than adding some comparisons -- which are working now):

postgres=# explain select ctid from foo where ctid >= '(3786,67)'::tid limit 1;
QUERY PLAN
------------------------------------------------------------------
Limit (cost=0.00..0.05 rows=1 width=6)
-> Seq Scan on foo (cost=0.00..16422.00 rows=333333 width=6)
Filter: (ctid >= '(3786,67)'::tid)
(3 rows)

I see. Seems we have to add index smarts to those comparisons. That
might be complicated.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Tom Lane

tgl@sss.pgh.pa.us

about 14 years ago

In reply to: Bruce Momjian (#6)

Re: ctid ranges

Bruce Momjian <bruce@momjian.us> writes:

On Wed, Jun 13, 2012 at 03:21:17PM -0500, Merlin Moncure wrote:

IMNSHO, it's a no-brainer for the todo (but I think it's more
complicated than adding some comparisons -- which are working now):

I see. Seems we have to add index smarts to those comparisons. That
might be complicated.

Uh, the whole point of a TID scan is to *not* need an index.

What would be needed is for tidpath.c to let through more kinds of TID
comparison quals than it does now, and then for nodeTidscan.c to know
what to do with them. The latter logic might well look something like
btree indexscan qual preparation, but it wouldn't be the same code.

regards, tom lane