CLUSTER and clustered indices
When a table has been CLUSTERed on a particular index AND that index
values is monotonically increasing, then it would be a bad move to use
blocks from the FSM since this would tend to destroy the natural
clustering sequence.
The index values will be monotonically increasing if a datatype is
defined as SERIAL or if the default value is defined as the nextval of a
sequence.
Does anybody think it would be a good idea to not use the FSM if
- we have a CLUSTER defined on an index
- for the indexed column we have default value set of nextval()
Best Regards, Simon Riggs
Simon Riggs wrote:
When a table has been CLUSTERed on a particular index AND that index
values is monotonically increasing, then it would be a bad move to use
blocks from the FSM since this would tend to destroy the natural
clustering sequence.The index values will be monotonically increasing if a datatype is
defined as SERIAL or if the default value is defined as the nextval of a
sequence.Does anybody think it would be a good idea to not use the FSM if
- we have a CLUSTER defined on an index
- for the indexed column we have default value set of nextval()
That's a nice idea, but what's the cost? You will have to check every
insert: does the table has indexes? Is any of them clustered? Is the
clustered index attached to a sequence? It seems quite an expensive
check to be making.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Simon Riggs <simon@2ndquadrant.com> writes:
When a table has been CLUSTERed on a particular index AND that index
values is monotonically increasing, then it would be a bad move to use
blocks from the FSM since this would tend to destroy the natural
clustering sequence.
By the time there are any blocks in FSM to take, the original clean
index page sequence is doubtless history. The pure-increasing-key
scenario you are thinking of never will have any FSM entries, so it's
moot.
regards, tom lane
On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
When a table has been CLUSTERed on a particular index AND that index
values is monotonically increasing, then it would be a bad move to use
blocks from the FSM since this would tend to destroy the natural
clustering sequence.By the time there are any blocks in FSM to take, the original clean
index page sequence is doubtless history. The pure-increasing-key
scenario you are thinking of never will have any FSM entries, so it's
moot.
Well, just because its a pure-increasing-key doesn't imply rows don't
get deleted. Look at the NewOrder table in DBT-2/TPC-C. Order numbers
increase, but mostly they get delivered in the end.
In the case I describe, tuples are added always to the rightmost edge of
the index, so it seems worthwhile to always add tuples to the top of the
heap only so that the order is consistent.
On Thu, 2005-11-17 at 12:45 -0300, Alvaro Herrera wrote:
That's a nice idea, but what's the cost? You will have to check every
insert: does the table has indexes? Is any of them clustered? Is the
clustered index attached to a sequence? It seems quite an expensive
check to be making.
I'd make the check when the relcache is loaded and store it there as a
boolean. heap_insert already has a boolean on it for use_fsm, so it
would transfer very simply at insert time with almost no overhead.
The use case exists and the technique is low overhead, but the main
question is: Does anybody think this behaviour would be beneficial for
them? (I'm actually in two minds myself, but once the idea has arisen,
it seems sensible to discuss this for everybody's sake).
The trade-off is a table that keeps growing in size, even though you
VACUUM it, with the benefit that the clustering is maintained.
So how would you maintain it? Looks like you'd still have to use regular
CLUSTER commands, but at least it would stay good in between.
Best Regards, Simon Riggs
Simon Riggs wrote:
On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
The use case exists and the technique is low overhead, but the main
question is: Does anybody think this behaviour would be beneficial for
them? (I'm actually in two minds myself, but once the idea has arisen,
it seems sensible to discuss this for everybody's sake).
I have no use for it but I see it would be beneficial in some cases.
The trade-off is a table that keeps growing in size, even though you
VACUUM it, with the benefit that the clustering is maintained.So how would you maintain it? Looks like you'd still have to use regular
CLUSTER commands, but at least it would stay good in between.
Yeah, this is a problem. The growth is unbounded. Even if there's a
completely empty page somewhere, it can't be used because all tuples
will go to the last page. The problem with using CLUSTER for
maintenance is that it takes an exclusive lock on the table, which is a
thing we've been running away from. You are right in that it's much
cheaper than CLUSTERing a table that isn't ordered, because there's much
more locality. But I don't think it's a big enough win.
Because of the drawbacks (unbounded growth being the most prominent one)
this would have to be an optional thing. This means we would need an
additional system catalog column to keep whether it's active or not.
And a user command to activate it. So it's starting to be a more
invasive thing. Not that these things matter a whole lot, but anyway.
Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
I agree, keeping it clustered would be very nice.
Show quoted text
On 11/17/05, Alvaro Herrera <alvherre@commandprompt.com> wrote:
Simon Riggs wrote:
On Thu, 2005-11-17 at 10:58 -0500, Tom Lane wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
The use case exists and the technique is low overhead, but the main
question is: Does anybody think this behaviour would be beneficial for
them? (I'm actually in two minds myself, but once the idea has arisen,
it seems sensible to discuss this for everybody's sake).I have no use for it but I see it would be beneficial in some cases.
The trade-off is a table that keeps growing in size, even though you
VACUUM it, with the benefit that the clustering is maintained.So how would you maintain it? Looks like you'd still have to use regular
CLUSTER commands, but at least it would stay good in between.Yeah, this is a problem. The growth is unbounded. Even if there's a
completely empty page somewhere, it can't be used because all tuples
will go to the last page. The problem with using CLUSTER for
maintenance is that it takes an exclusive lock on the table, which is a
thing we've been running away from. You are right in that it's much
cheaper than CLUSTERing a table that isn't ordered, because there's much
more locality. But I don't think it's a big enough win.Because of the drawbacks (unbounded growth being the most prominent one)
this would have to be an optional thing. This means we would need an
additional system catalog column to keep whether it's active or not.
And a user command to activate it. So it's starting to be a more
invasive thing. Not that these things matter a whole lot, but anyway.Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match
On Thu, 2005-11-17 at 21:57 -0300, Alvaro Herrera wrote:
Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.
Agreed. (I think thats case-closed on the previous proposal.)
As an aside, Index Organized Tables (IOTs) isn't just an Oracle term.
They first used the term, but the concept had already been implemented
in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
as well as numerous OLAP systems. The concept doesn't look to be
patented.
If anybody is looking for a justification for IOTs, the reduction in
table volume for large tables is very high. IOTs are the equivalent of
removing all of the leaf blocks of the clustered index.
Best Regards, Simon Riggs
That sounds very much like a CLUSTERED INDEX under Sybase ASE
(or the derivative Microsoft SQL Server). In those products, when you
create a clustered index, the data pages are sorted according to the
index sequence, and are used as the leaf pages in the index. A
clustered index does not have another leaf level.
Simon Riggs <simon@2ndquadrant.com> >>>
As an aside, Index Organized Tables (IOTs) isn't just an Oracle term.
They first used the term, but the concept had already been implemented
in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
as well as numerous OLAP systems. The concept doesn't look to be
patented.
If anybody is looking for a justification for IOTs, the reduction in
table volume for large tables is very high. IOTs are the equivalent of
removing all of the leaf blocks of the clustered index.
Import Notes
Resolved by subject fallback
+1, and I know Sybase had this in 11.0.3, which IIRC is over 10 years
old now.
BTW,
http://archives.postgresql.org/pgsql-performance/2004-08/msg00492.php is
one discussion about this from the past. I seem to recall that there was
an objection to true Index Organized Tables because it would be too
dificult to make that work with MVCC. If that's the case then what I
laid out in that email might get some of the benefit without the
difficulty. But hopefully it's easy to just store heap values in the
leaf nodes of an index.
FWIW, I know that Sybase required that an IOT be clustered on a unique
index. I think Oracle has the same requirement as well.
On Fri, Nov 18, 2005 at 08:30:14AM +0000, Simon Riggs wrote:
On Thu, 2005-11-17 at 21:57 -0300, Alvaro Herrera wrote:
Personally I'd prefer to see index-ordered heaps, where the heap is
itself an index, so the ordering it automatically kept.Agreed. (I think thats case-closed on the previous proposal.)
As an aside, Index Organized Tables (IOTs) isn't just an Oracle term.
They first used the term, but the concept had already been implemented
in both Tandem (value-ordered) and Teradata (hash-ordered) before this,
as well as numerous OLAP systems. The concept doesn't look to be
patented.If anybody is looking for a justification for IOTs, the reduction in
table volume for large tables is very high. IOTs are the equivalent of
removing all of the leaf blocks of the clustered index.Best Regards, Simon Riggs
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461