Postgres 9.1: Adding rows to table causing too much latency in other queries

Started by Sushant Sinhaabout 14 years ago8 messages
#1Sushant Sinha
sushant354@gmail.com

I recently upgraded my postgres server from 9.0 to 9.1.2 and I am
finding a peculiar problem.I have a program that periodically adds rows
to this table using INSERT. Typically the number of rows is just 1-2
thousand when the table already has 500K rows. Whenever the program is
adding rows, the performance of the search query on the same table is
very bad. The query uses the gin index and the tsearch ranking function
ts_rank_cd.

This never happened earlier with postgres 9.0 Is there a known issue
with Postgres 9.1? Or how to report this problem?

-Sushant.

In reply to: Sushant Sinha (#1)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

On 19-12-2011 12:30, Sushant Sinha wrote:

I recently upgraded my postgres server from 9.0 to 9.1.2 and I am
finding a peculiar problem.I have a program that periodically adds rows
to this table using INSERT. Typically the number of rows is just 1-2
thousand when the table already has 500K rows. Whenever the program is
adding rows, the performance of the search query on the same table is
very bad. The query uses the gin index and the tsearch ranking function
ts_rank_cd.

How bad is bad? It seems you are suffering from don't-fit-on-cache problem, no?

This never happened earlier with postgres 9.0 Is there a known issue
with Postgres 9.1? Or how to report this problem?

Test case? Query times? Query plans? Are you sure you the compile options are
the same? What about the configuration parameters? What is the exact version
of both installations?

--
Euler Taveira de Oliveira - Timbira http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

#3Marti Raudsepp
marti@juffo.org
In reply to: Sushant Sinha (#1)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

On Mon, Dec 19, 2011 at 17:30, Sushant Sinha <sushant354@gmail.com> wrote:

This never happened earlier with postgres 9.0 Is there a known issue
with Postgres 9.1? Or how to report this problem?

What *did* change in 9.1 is that there's new GIN cost estimation in
the planner. If you do EXPLAIN ANALYZE for your queries, do the plans
differ for 9.0 or 9.1? E.g. doing an index scan instead of a seq scan
or vice versa.

The query uses the gin index and the tsearch ranking function
ts_rank_cd.

Another thought -- have you read about the GIN "fast updates" feature?
This existed in 9.0 too. Instead of updating the index directly, GIN
appends all changes to a sequential list, which needs to be scanned in
whole for read queries. The periodic autovacuum process has to merge
these values back into the index.

Maybe the solution is to tune autovacuum to run more often on the table.

http://www.postgresql.org/docs/9.1/static/gin-implementation.html

Regards,
Marti

#4Sushant Sinha
sushant354@gmail.com
In reply to: Marti Raudsepp (#3)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

On Mon, 2011-12-19 at 19:08 +0200, Marti Raudsepp wrote:

Another thought -- have you read about the GIN "fast updates" feature?
This existed in 9.0 too. Instead of updating the index directly, GIN
appends all changes to a sequential list, which needs to be scanned in
whole for read queries. The periodic autovacuum process has to merge
these values back into the index.

Maybe the solution is to tune autovacuum to run more often on the
table.

http://www.postgresql.org/docs/9.1/static/gin-implementation.html

Regards,
Marti

Probably this is the problem. Is running "vacuum analyze" under psql is
the same as "autovacuum"?

-Sushant.

#5Sushant Sinha
sushant354@gmail.com
In reply to: Euler Taveira de Oliveira (#2)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

On Mon, 2011-12-19 at 12:41 -0300, Euler Taveira de Oliveira wrote:

On 19-12-2011 12:30, Sushant Sinha wrote:

I recently upgraded my postgres server from 9.0 to 9.1.2 and I am
finding a peculiar problem.I have a program that periodically adds

rows

to this table using INSERT. Typically the number of rows is just 1-2
thousand when the table already has 500K rows. Whenever the program

is

adding rows, the performance of the search query on the same table

is

very bad. The query uses the gin index and the tsearch ranking

function

ts_rank_cd.

How bad is bad? It seems you are suffering from don't-fit-on-cache
problem, no?

The memory is 32GB and the entire database is just 22GB. Even "vmstat 1"
does not show any disk activity.

I was not able to isolate the performance numbers since I have observed
this only on the production box where the number of requests keep
increasing as the box gets loaded. But a query that takes 1sec normally
is taking more than 10secs (not sure whether it got the same number of
CPU cycles). Is there a way to find that?

-Sushant.

#6Jesper Krogh
jesper@krogh.cc
In reply to: Marti Raudsepp (#3)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

On 2011-12-19 18:08, Marti Raudsepp wrote:

The query uses the gin index and the tsearch ranking function
ts_rank_cd.

Another thought -- have you read about the GIN "fast updates" feature?
This existed in 9.0 too. Instead of updating the index directly, GIN
appends all changes to a sequential list, which needs to be scanned in
whole for read queries. The periodic autovacuum process has to merge
these values back into the index.

Maybe the solution is to tune autovacuum to run more often on the table.

http://www.postgresql.org/docs/9.1/static/gin-implementation.html

I have to say that I consistently have to turn "fastupdate" off for
our heavily updated gin-indexes. The overall performance gain
may be measurable, but its not intolerable without. The spikes seen
from the applications, when cleanup happens. Either in the foreground
or in the background are not tolerable. (multiple seconds).

I may just not have experienced suffienctly with the various sizes of
work_mem,
but I would indeed love to have a "connection only fastupdate", so
within a single
transaction it could use the fastupdate technique, but not stuffing
index-updates
onto unreleated queries by random.

Jesper
--
Jesper

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Marti Raudsepp (#3)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

Marti Raudsepp <marti@juffo.org> writes:

On Mon, Dec 19, 2011 at 17:30, Sushant Sinha <sushant354@gmail.com> wrote:

This never happened earlier with postgres 9.0 Is there a known issue
with Postgres 9.1? Or how to report this problem?

What *did* change in 9.1 is that there's new GIN cost estimation in
the planner. If you do EXPLAIN ANALYZE for your queries, do the plans
differ for 9.0 or 9.1?

I trolled the commit log a bit, and AFAICS the only significant GIN
changes between 9.1 and reasonably late-model 9.0 are the cost
estimation patch and this one:
http://git.postgresql.org/gitweb/?p=postgresql.git&amp;a=commitdiff&amp;h=73912e7fbd1b52c51d914214abbec1cda64595f2

which makes me wonder if maybe the OP has a very large fraction of empty
or null entries in his data. Previously those would have resulted in no
insertion traffic on a GIN index, but now they do.

Another thought -- have you read about the GIN "fast updates" feature?
This existed in 9.0 too.

Yeah, so it seems unlikely to be that, or at least not that by itself.

regards, tom lane

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jesper Krogh (#6)
Re: Postgres 9.1: Adding rows to table causing too much latency in other queries

Jesper Krogh <jesper@krogh.cc> writes:

I have to say that I consistently have to turn "fastupdate" off for
our heavily updated gin-indexes. The overall performance gain
may be measurable, but its not intolerable without. The spikes seen
from the applications, when cleanup happens. Either in the foreground
or in the background are not tolerable. (multiple seconds).

Well, that's why there's a provision to turn it off: if response time
spikes are a bigger deal to you than overall performance, you probably
don't want bulk updates.

The theory is that you should be able to tune things so that the bulk
updates are done by autovacuum, but if you can't get that to work
sufficiently reliably, fastupdate=off is the best answer.

regards, tom lane