pgstatindex

Started by Tatsuo Ishiiover 23 years ago5 messages
#1Tatsuo Ishii
t-ishii@sra.co.jp
1 attachment(s)

Here is a new contrib function called "pgstatindex", similar to
pgstattuple but different in that it returns the percentage of the
dead tuples of an index. I am posting this for review purpose.

Installation of pgstatindex is pretty easy:

unpack the tar package in contrib directory.
cd into pgstatindex directory.
make
make install
psql -f /usr/local/pgsql/share/contrib/pgstatindex.sql your_database

Note:

(1) I think I have adopted to the recent Tom's changes to index access
routines, but if you find anything is wrong, plese let me know.

(2) pgstatindex probably does not work with rtree and gist indexes.
--
Tatsuo Ishii

Attachments:

pgstatindex.tar.gzapplication/octet-streamDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#1)
Re: pgstatindex

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

Here is a new contrib function called "pgstatindex", similar to
pgstattuple but different in that it returns the percentage of the
dead tuples of an index. I am posting this for review purpose.

Um ... what's the point? Isn't this always the same as the percentage
for the underlying table?

regards, tom lane

#3Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#2)
Re: pgstatindex

Um ... what's the point? Isn't this always the same as the percentage
for the underlying table?

Sure. In my understanding, unlike tables "free/reusable space" is
actually not reused in index. pgstatindex would be usefull to judge if
REINDEX is needed by showing the growth of physical length and
"free/reusable space".

Maybe "free/reusable space" is not appropriate wording, "dead space"
is better?
--
Tatsuo Ishii

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#3)
Re: pgstatindex

Tatsuo Ishii <t-ishii@sra.co.jp> writes:

Sure. In my understanding, unlike tables "free/reusable space" is
actually not reused in index. pgstatindex would be usefull to judge if
REINDEX is needed by showing the growth of physical length and
"free/reusable space".

Oh. Hmm, if that's what you want then I do not think an indexscan is
the way to go about it. The indexscan will only visit leaf pages
(and not, for example, internal nodes of a btree). Also the
free-space-counting code you're using seems pretty unworkable since the
indexscan is unlikely to visit leaf pages in anything like sequential
order.

I think the only reasonable way to get useful statistics would be to
read the index directly --- page by page, no indexscan, distinguishing
leaf pages, internal pages, and overhead pages for yourself. This would
require index-AM-specific knowledge about how to tell which type each
page is, but I believe all the index AMs make that possible.

Also, I'd suggest that visiting the heap is just useless overhead. A
person who wants to know whether the heap needs to be vacuumed can get
that data from pgstattuple. Reading the heap to check tuple state will
make this function orders of magnitude slower, while not producing much
useful info that I can see.

Something else to think about is how to present the results. As soon
as you release this we will have people bleating about how come their
btrees always show at least 1/3rd free space :-( unless we can think
of a way to highlight the fact that that's the expected loading factor
for a btree...

regards, tom lane

#5Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Tom Lane (#4)
Re: pgstatindex

Oh. Hmm, if that's what you want then I do not think an indexscan is
the way to go about it. The indexscan will only visit leaf pages
(and not, for example, internal nodes of a btree). Also the
free-space-counting code you're using seems pretty unworkable since the
indexscan is unlikely to visit leaf pages in anything like sequential
order.

Oh I was not aware of this.

I think the only reasonable way to get useful statistics would be to
read the index directly --- page by page, no indexscan, distinguishing
leaf pages, internal pages, and overhead pages for yourself. This would
require index-AM-specific knowledge about how to tell which type each
page is, but I believe all the index AMs make that possible.

That's what I'm afraid of.

Also, I'd suggest that visiting the heap is just useless overhead. A
person who wants to know whether the heap needs to be vacuumed can get
that data from pgstattuple. Reading the heap to check tuple state will
make this function orders of magnitude slower, while not producing much
useful info that I can see.

Ok let me think about this. Thank you for the suggestion!
--
Tatsuo Ishii