partial index

Started by Bruce Momjianover 27 years ago7 messages
#1Bruce Momjian
maillist@candle.pha.pa.us

What is a partial index? I have never known.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#2Noname
aoki@CS.Berkeley.EDU
In reply to: Bruce Momjian (#1)
Re: partial index

Bruce Momjian <maillist@candle.pha.pa.us> writes:

What is a partial index? I have never known.

it's an index built over a subset of a table; the subset is defined by
a predicate. postgres supported partial indices with arbitrary
predicates. i believe ibm's db2 for as/400 supports partial indices
using single-clause predicates.

the main motivation is: if all of the queries you ask that can
profitably use an index fall into a certain range, why build an index
over the whole table and suffer the associated space/time costs?
(there are other reasons; see the first paper referenced below.)

the machinery to build, update and query partial indices isn't too
bad. the hairy parts are index selection (which indices do i build)
and query optimization (which indices do i use) -- i.e., the parts
that involve deciding what predicate(s) match the workload/query in
some useful way. for those who are into database theory, the problems
are basically analogous to the corresponding materialized view
problems, albeit with different cost parameters and formulae. these
are, in the general case, hard problems for the standard ordinal sql
types; they're super-hard problems with black-box extension types,
because the selectivity estimation technology is so crude.

1. Stonebraker, M.
The case for partial indexes (DBMS).
SIGMOD Record, Dec. 1989, vol.18, (no.4):4-11.
http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/papers/ERL-M89-17.pdf

2. Olson, Nels Edward.
Partial indexing in POSTGRES : research project / by Nels Edward Olson.
1993.
UCB Engin T7.49.1993 O676

1. CONFERENCE PAPER
Seshadri, P.; Swami, A.
Generalized partial indexes.
IN: Proceedings of the Eleventh International Conference on Data
Engineering (Cat. No.95CH35724). (Proceedings of the Eleventh International
Conference on Data Engineering (Cat. No.95CH35724)Proceedings of the
Eleventh International Conference on Data Engineering, Taipei, Taiwan, 6-10
March 1995). Edited by: Yu, P.S.; Chen, A.L.P. Los Alamitos, CA, USA: IEEE
Comput. Soc. Press, 1995. p. 420-7.
http://simon.cs.cornell.edu/home/praveen/papers/partindex.de95.ps.Z
--
Paul M. Aoki | University of California at Berkeley
aoki@CS.Berkeley.EDU | Dept. of EECS, Computer Science Division #1776
| Berkeley, CA 94720-1776

#3Bruce Momjian
maillist@candle.pha.pa.us
In reply to: Noname (#2)
Re: [HACKERS] Re: partial index

Bruce Momjian <maillist@candle.pha.pa.us> writes:

What is a partial index? I have never known.

it's an index built over a subset of a table; the subset is defined by
a predicate. postgres supported partial indices with arbitrary
predicates. i believe ibm's db2 for as/400 supports partial indices
using single-clause predicates.

the main motivation is: if all of the queries you ask that can
profitably use an index fall into a certain range, why build an index
over the whole table and suffer the associated space/time costs?
(there are other reasons; see the first paper referenced below.)

I had suspected that's what they were, but never really was sure. Now
the next question, "Should we rip them out?" No one uses them, and
they seem to be of very limited usefulness.

I am inclinded to keep them, but I am not sure.

the machinery to build, update and query partial indices isn't too
bad. the hairy parts are index selection (which indices do i build)
and query optimization (which indices do i use) -- i.e., the parts
that involve deciding what predicate(s) match the workload/query in
some useful way. for those who are into database theory, the problems
are basically analogous to the corresponding materialized view
problems, albeit with different cost parameters and formulae. these
are, in the general case, hard problems for the standard ordinal sql
types; they're super-hard problems with black-box extension types,
because the selectivity estimation technology is so crude.

1. Stonebraker, M.
The case for partial indexes (DBMS).
SIGMOD Record, Dec. 1989, vol.18, (no.4):4-11.
http://s2k-ftp.CS.Berkeley.EDU:8000/postgres/papers/ERL-M89-17.pdf

2. Olson, Nels Edward.
Partial indexing in POSTGRES : research project / by Nels Edward Olson.
1993.
UCB Engin T7.49.1993 O676

1. CONFERENCE PAPER
Seshadri, P.; Swami, A.
Generalized partial indexes.
IN: Proceedings of the Eleventh International Conference on Data
Engineering (Cat. No.95CH35724). (Proceedings of the Eleventh International
Conference on Data Engineering (Cat. No.95CH35724)Proceedings of the
Eleventh International Conference on Data Engineering, Taipei, Taiwan, 6-10
March 1995). Edited by: Yu, P.S.; Chen, A.L.P. Los Alamitos, CA, USA: IEEE
Comput. Soc. Press, 1995. p. 420-7.
http://simon.cs.cornell.edu/home/praveen/papers/partindex.de95.ps.Z
--
Paul M. Aoki | University of California at Berkeley
aoki@CS.Berkeley.EDU | Dept. of EECS, Computer Science Division #1776
| Berkeley, CA 94720-1776

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
#4Andreas Zeugswetter
andreas.zeugswetter@telecom.at
In reply to: Bruce Momjian (#3)
AW: [HACKERS] partial index

Bruce wrote:

What is a partial index? I have never known.

A partial index is one that only indexes a subset of a table (e.g. restricted by a where clause).

This can be useful in many cases, one typical case is where all rows that are not null
in the index fields are to be indexed:
create index people_idx on people (childname) where childname is not null;

Note that I just made up the syntax (or is it postgresql syntax?). Some code for it lurks around.

Informix uses another syntax with a complete select statement, so you can also index a join:

CREATE GK INDEX gki
(SELECT A.col1, A.col2 FROM A, B, C
WHERE A.col1 = B.col1 AND B.col1 = C.col1)

Andreas

#5Jackson, DeJuan
djackson@cpsgroup.com
In reply to: Andreas Zeugswetter (#4)
RE: [HACKERS] Re: partial index

I had suspected that's what they were, but never really was sure. Now
the next question, "Should we rip them out?" No one uses them, and
they seem to be of very limited usefulness.

I am inclinded to keep them, but I am not sure.

Do we have syntax for their creation and is it in the docs?
If not I say just take them out, unless someone can think of a use that
wouldn't be served by normal indexes.
-DEJ

#6Noname
jwieck@debis.com
In reply to: Jackson, DeJuan (#5)
Re: [HACKERS] Re: partial index

I had suspected that's what they were, but never really was sure. Now
the next question, "Should we rip them out?" No one uses them, and
they seem to be of very limited usefulness.

I am inclinded to keep them, but I am not sure.

Do we have syntax for their creation and is it in the docs?
If not I say just take them out, unless someone can think of a use that
wouldn't be served by normal indexes.
-DEJ

I can think of a situation where this is useful (even if very
seldom).

Have a table with many rows, indexed by a char(80) field. In
99% of the selects the same 50 rows are searched (all having
an 'A' as first character of the key).

If you could only have these 50 in the index for fast access,
the complete index would fit into a few blocks and can be
searched faster. All other rows will be searched by a
seqscan.

Well, there are limits where this all gets useless. The
speedup by having a small index (=faster index) is eaten up
by the longer time needed by seqscans very quickly. And it's
a hard job to keep the predicates for the partial indices
appropriate, so some overall speedup is gained.

The possible speedup from this compared against the danger of
having a dramatic slowdown on the other side is a clear
drawback from my point of view.

So the only argument for having a partial index can be saved
disk space. A bad argument when looking at the actual pricing
of disks.

Don't force it - use a bigger hammer!

Result: Kick the partial indices out.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #

#7Thomas G. Lockhart
lockhart@alumni.caltech.edu
In reply to: Noname (#6)
Re: [HACKERS] Re: partial index

I had suspected that's what they were, but never really was sure. > > > Now the next question, "Should we rip them out?" No one uses
them, and they seem to be of very limited usefulness.
I am inclined to keep them, but I am not sure.

Do we have syntax for their creation and is it in the docs?
If not I say just take them out, unless someone can think of a use
that wouldn't be served by normal indexes.

So the only argument for having a partial index can be saved
disk space. A bad argument when looking at the actual pricing
of disks.
Don't force it - use a bigger hammer!
Result: Kick the partial indices out.

???

Why remove another feature from Postgres when there isn't a clear
benefit to removing it? It's yet another discriminator separating
Postgres from ordinary database systems.

Now that we know what they are, we should figure out how to use them,
and document it as DeJuan suggests.

- Tom