tsearch2 and gist index bloat

Started by George Essigover 22 years ago6 messagesgeneral
Jump to latest
#1George Essig
george_essig@yahoo.com

I have installed tsearch2 and have noticed that the gist index used to do searches grows and grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details:

PostgreSQL 7.4RC1
Red Hat 9

Table "public.series"
Column | Type | Modifiers
---------------+-------------------+--------------------------------------------------------
id | integer | not null default nextval('public.series_id_seq'::text)
database | character varying | not null
name | character varying | not null
description | character varying |
documentation | character varying |
ts_vec | tsvector |
Indexes:
"series_pkey" primary key, btree (id)
"series_db_name_un" unique, btree ("database", name)
"ts_in" gist (ts_vec)
Triggers:
ts_update BEFORE INSERT OR UPDATE ON series FOR EACH ROW EXECUTE PROCEDURE tsearch2('ts_vec',
'description', 'documentation')

There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name, I
monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Below is some output from VACUUM FULL ANALYZE VERBOSE after the 40,422 rows were deleted.

INFO: index "ts_in" now contains 70451 row versions in 2969 pages
DETAIL: 40422 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.27s/0.44u sec elapsed 1.03 sec.

As the index size grows, search performance slows to a crawl because it's too to fit in RAM. Is
the only solution to drop and recreate the index after large updates?

Thanks,

George Essig

#2George Essig
george_essig@yahoo.com
In reply to: George Essig (#1)
Re: tsearch2 and gist index bloat
--- George Essig <george_essig@yahoo.com> wrote:

I have installed tsearch2 and have noticed that the gist index used to do searches grows and
grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details:

....

There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name, I
monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Sorry, I mis-reported the index sizes. They are about 1/10 the size:

15 MB After the index was created.
19 MB After updating 40,422 rows.
24 MB After VACUUM FULL
27 MB After deleting 40,422 rows & again VACUUM FULL

I still have a problem that the index size grows and grows and eventually searches slow to a
crawl.

George Essig

#3Joshua D. Drake
jd@commandprompt.com
In reply to: George Essig (#2)
Re: tsearch2 and gist index bloat

Hello,

I don't know if you can do this with a gist index but try using the
REINDEX command.

J

George Essig wrote:

Show quoted text
--- George Essig <george_essig@yahoo.com> wrote:

I have installed tsearch2 and have noticed that the gist index used to do searches grows and
grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details:

....

There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name, I
monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Sorry, I mis-reported the index sizes. They are about 1/10 the size:

15 MB After the index was created.
19 MB After updating 40,422 rows.
24 MB After VACUUM FULL
27 MB After deleting 40,422 rows & again VACUUM FULL

I still have a problem that the index size grows and grows and eventually searches slow to a
crawl.

George Essig

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#4George Essig
george_essig@yahoo.com
In reply to: Joshua D. Drake (#3)
Re: tsearch2 and gist index bloat

Thanks for the reply. For this project, I can update the data and reindex during off-peak hours.
I was just surprised to see the size of the index double after heavy write activity.

George Essig

--- "Joshua D. Drake" <jd@commandprompt.com> wrote:
Show quoted text

Hello,

I don't know if you can do this with a gist index but try using the
REINDEX command.

J

George Essig wrote:

--- George Essig <george_essig@yahoo.com> wrote:

I have installed tsearch2 and have noticed that the gist index used to do searches grows and
grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details:

....

There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name,

I

monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Sorry, I mis-reported the index sizes. They are about 1/10 the size:

15 MB After the index was created.
19 MB After updating 40,422 rows.
24 MB After VACUUM FULL
27 MB After deleting 40,422 rows & again VACUUM FULL

I still have a problem that the index size grows and grows and eventually searches slow to a
crawl.

George Essig

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

#5Oleg Bartunov
oleg@sai.msu.su
In reply to: George Essig (#4)
Re: tsearch2 and gist index bloat

On Thu, 6 Nov 2003, George Essig wrote:

Thanks for the reply. For this project, I can update the data and reindex during off-peak hours.
I was just surprised to see the size of the index double after heavy write activity.

This is not tsearch specific problem. It was discussed several times, ]
see index bloat subject in archives.

Oleg

George Essig

--- "Joshua D. Drake" <jd@commandprompt.com> wrote:

Hello,

I don't know if you can do this with a gist index but try using the
REINDEX command.

J

George Essig wrote:

--- George Essig <george_essig@yahoo.com> wrote:

I have installed tsearch2 and have noticed that the gist index used to do searches grows and
grows
as I update rows, delete rows, or run VACUUM FULL ANALYZE. Below are some details:

....

There are 110,873 rows in this table and 13398 unique words indexed by ts_in. Using oid2name,

I

monitored the size of the index ts_in as I performed different operations:

154 MB After the index was created.
190 MB After updating 40,422 rows.
243 MB After VACUUM FULL
275 MB After deleting 40,422 rows & again VACUUM FULL

Sorry, I mis-reported the index sizes. They are about 1/10 the size:

15 MB After the index was created.
19 MB After updating 40,422 rows.
24 MB After VACUUM FULL
27 MB After deleting 40,422 rows & again VACUUM FULL

I still have a problem that the index size grows and grows and eventually searches slow to a
crawl.

George Essig

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#6George Essig
george_essig@yahoo.com
In reply to: Oleg Bartunov (#5)
Re: tsearch2 and gist index bloat
--- Oleg Bartunov <oleg@sai.msu.su> wrote:

On Thu, 6 Nov 2003, George Essig wrote:

Thanks for the reply. For this project, I can update the data and reindex during off-peak

hours.

I was just surprised to see the size of the index double after heavy write activity.

This is not tsearch specific problem. It was discussed several times, ]
see index bloat subject in archives.

Oleg

I understand that index bloat is a general problem, but is this particular problem more severe
because of a gist index? By the way, I'm running PostgreSQL 7.4 Release Candidate 1.

Thanks,
George Essig