VACUUM ANALYZE extremely slow

Started by Sergei Shelukhin · almost 19 years ago · 12 messages · general
#1Sergei Shelukhin
realgeek@gmail.com

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem. I am actually pondering scrapping half the work on the
conversion and stuff and going back to mysql but I wonder if there's
some way to improve it.

#2Chris Browne
cbbrowne@acm.org
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

A long time ago, in a galaxy far, far away, Sergei Shelukhin <realgeek@gmail.com> wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem. I am actually pondering scrapping half the work on the
conversion and stuff and going back to mysql but I wonder if there's
some way to improve it.

That does seem surprisingly slow for that sort of hardware.

One thing I would suggest: Try JUST doing ANALYZE, with no VACUUM
involved.

That will merely do statistical sampling on the tables, and should
complete quite quickly, even for large tables.

That ought to fix the statistics problem.

Those tables that see frequent UPDATE/DELETE requests will still need
to be vacuumed, but fixing the stats doesn't require this.
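In psql that advice boils down to a few one-liners (the table name in the last command is just an example):

```sql
-- Statistics only: samples the tables, fast even on large ones
ANALYZE;

-- Statistics plus dead-row cleanup: slower, but does not lock out readers or writers
VACUUM ANALYZE;

-- A single heavily-updated table can be handled on its own
VACUUM ANALYZE orders;
```
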
--
let name="cbbrowne" and tld="gmail.com" in String.concat "@" [name;tld];;
http://linuxdatabases.info/info/postgresql.html
"Learning is weightless, a treasure you can always carry easily."
-- Chinese Proverb

#3Michael Fuhr
mike@fuhr.org
In reply to: Chris Browne (#2)
Re: VACUUM ANALYZE extremely slow

Sergei Shelukhin <realgeek@gmail.com> wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.

Please post an example query and its EXPLAIN ANALYZE output. The
pgsql-performance mailing list is a good place to discuss performance
problems.

This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

What other non-default configuration settings do you have? What
version of PostgreSQL are you using and on what OS? What kind of
disks and controllers do you have?

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

Why does the system have to go down? Are you running VACUUM FULL
ANALYZE? If so then drop the FULL and do an ordinary VACUUM ANALYZE
instead -- it should run faster and it doesn't require exclusive
access to the table.

As Christopher Browne mentioned, a bare ANALYZE (without VACUUM)
should be fast even on large tables so if necessary you could run
ANALYZE more often than VACUUM ANALYZE.

Have you enabled autovacuum (or contrib/pg_autovacuum in 8.0 and
earlier)? I sometimes prefer to run VACUUM ANALYZE manually but
for many databases autovacuum is a good way to maintain statistics
and clean up dead rows automatically.
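
On 8.1 the built-in autovacuum daemon is a few lines in postgresql.conf; this is only a sketch of the switches involved (check the 8.1 docs for the full list of autovacuum_* knobs):

```ini
# postgresql.conf -- PostgreSQL 8.1
stats_start_collector = on    # required by autovacuum
stats_row_level = on          # required by autovacuum, off by default
autovacuum = on               # off by default in 8.1
autovacuum_naptime = 60       # seconds between autovacuum runs
```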

--
Michael Fuhr

#4Steve Atkins
steve@blighty.com
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

On Jun 17, 2007, at 2:15 PM, Sergei Shelukhin wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

There's no need to stop the database to run vacuum or analyze (heck,
with autovacuum in modern versions of postgresql you can get away
without running them at all, sometimes).

I suspect you're doing the wrong thing (perhaps running vacuum full,
rather than plain vacuum). Can you tell us what version of postgresql
you're running and how you're vacuuming it (what commands you're
running)?

Cheers,
Steve

#5Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

Sergei Shelukhin wrote:

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem. I am actually pondering scrapping half the work on the
conversion and stuff and going back to mysql but I wonder if there's
some way to improve it.

Why waste time on it? Just move it to mysql since it's so good, and be
done with it.

--
Alvaro Herrera Valdivia, Chile ICBM: S 39° 49' 18.1", W 73° 13' 56.4"
"Someone said that it is at least an order of magnitude more work to do
production software than a prototype. I think he is wrong by at least
an order of magnitude." (Brian Kernighan)

#6David Wall
d.wall@computer.org
In reply to: Steve Atkins (#4)
Re: VACUUM ANALYZE extremely slow

On Jun 17, 2007, at 2:15 PM, Sergei Shelukhin wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

I have a 23 GB PG 8.1 db running on a 3 year old dual processor Dell
running Red Hat Linux, 2GB RAM. My tests ran on our production database
while it was running (albeit not under a heavy load) with no obvious
slowdown for users using our application. PG was using 1.2% of the CPU,
and I noted the server had been running for 322 days. I'm not sure if
there's a way to make vacuum use more processor time to speed it up or not.

"ANALYZE" took 1 minute 14 seconds.

"VACUUM" took significantly longer at 24 minutes 22 seconds.

I noted that we hadn't set maintenance_work_mem so it presumably is
using the default of 16384 (16MB).
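
Raising that is a one-line change in postgresql.conf (8.1 takes the value in kB; 256 MB here is only an example):

```ini
# postgresql.conf
maintenance_work_mem = 262144   # 256 MB, in kB; the default is 16384 (16 MB)
```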

I'm sure much depends on how many tables, how many deleted rows, etc.
are in your db, but "several hours" certainly is nothing like what we see.

David

#7Bill Moran
wmoran@potentialtech.com
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

In response to Sergei Shelukhin <realgeek@gmail.com>:

This is my first (and, by the love of the God, last) project w/pgsql

One has to ask, are you actually looking for help, or trolling?

If you honestly want help, I would suggest you work on your communication
skills first. If you're a troll, go away.

--
Bill Moran
http://www.potentialtech.com

#8Greg Smith
gsmith@gregsmith.com
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

On Sun, 17 Jun 2007, Sergei Shelukhin wrote:

Is there any way to speed up ANALYZE? Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

I've found I cry a lot less if I actually spend a minute educating myself
about things instead of complaining to a mailing list in a fashion
offensive to the people who might help me. You should start with
http://www.postgresql.org/docs/current/static/sql-vacuum.html where you'll
discover the amazing fact that VACUUM ANALYZE doesn't require taking down
the system at all. The way you get it to take less time is to do it more
often. In fact, if you're using a PostgreSQL version where you can set up
auto-vacuum correctly, you might not ever need to run it manually at all.

Here are the things to consider if you actually want some help here:

0) You might as well keep this thread going, but next time, post to the
performance list instead of the general one; it's more appropriate and
you'll get a better mix of people familiar with this sort of topic.

1) Give some more details about the non-default values in your
postgresql.conf file.
http://www.westnet.com/~gsmith/content/postgresql/pg-5minute.htm will get
you started on the most important ones to worry about and pointers to
additional resources. You say you've increased the maintenance working
mem; that's a good start, but if everything else is at the default you're
not taking advantage of all the memory in your system. MySQL manages
memory very differently, and those differences may be the root cause of
your issue.

2) State what version of PostgreSQL you're running. If it's older than a
recent 8.1 release, there are many known and unresolvable performance
issues and you may have to upgrade to get what you're looking for.

3) Give some information about your disk configuration, and some tests
results to confirm they're working normally if possible.
http://www.westnet.com/~gsmith/content/postgresql/pg-disktesting.htm has
an outline of how to do those tests.
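
As a quick first look before running the tests on that page, something along these lines (an illustrative sketch, not the procedure from that page) gives a rough sequential-write number using only dd:

```shell
#!/bin/sh
# Rough sequential-write throughput check: write 8 MB in 8 kB blocks
# (PostgreSQL's page size) and force the data to disk before dd reports.
# Illustrative only -- bonnie++ or the tests on the page above are more thorough.
TESTFILE=/tmp/pg_disktest.$$
dd if=/dev/zero of="$TESTFILE" bs=8192 count=1024 conv=fdatasync 2>&1 | tail -n 1
rm -f "$TESTFILE"
```

If the reported MB/s is far below what the drive should sustain, the slow VACUUM has a likely culprit.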

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#9Sergei Shelukhin
realgeek@gmail.com
In reply to: Greg Smith (#8)
Re: VACUUM ANALYZE extremely slow

Hi. Sorry for being a bit emotional, I was pretty constructive in my
earlier posts (the earlier, the more constructive if you care to
search) but I am progressively getting pissed off :(

Thanks for the initial tip, running ANALYZE w/o vacuum is faster. Are
frequent vacuums even necessary if there are virtually no deletes in
the system and updates are much less frequent compared to inserts?

Now for the answers.
* What other non-default configuration settings do you have?
I played with shared_buffers, setting it between 16k and 32k; ~24k
seems to be the best but the difference is minimal. The work_mem
setting is 256kb, and I increased effective_cache_size to ~700Mb (~35%
of RAM) based on a tip from some article.
max_fsm_* is increased too, to accommodate vacuum analyze runs.

* What version of PostgreSQL are you using and on what OS?
8.1 and Debian

What kind of disks and controllers do you have?
Umm... Unfortunately I do not know, and do not have access to check :)
The development server probably has a fast consumer-grade HDD, not a
RAID I suppose.

* My "system has to go down" remark was probably a bit exaggerated;
the system is noticeably slower when the vacuum is running and that is
basically without workload; it will probably become unusable under
load.

* Specific queries that were slow: there are generally no specific
queries; everything runs slowly, mostly due to estimates being grossly
inaccurate (like an 800k cost for a complex query based on EXPLAIN, and
then 5, the actual cost based on EXPLAIN ANALYZE)

I was not able to find any special bottlenecks in the queries. I come
from an MSSQL (and a bit of MySQL) background, and the only thing I
noticed is that there are a lot of seqscans in places where indexes
should (imo) be used. When I turn seqscans off, the indexes are
ridiculously slow (I have already posted about it), presumably due to
the random access used to build a big result set; however, the result
set is not in fact big in real queries, and postgres supposedly doesn't
do some magic that MSSQL does (join order or something) that makes the
join that should utilize the index narrower and thus faster.

There are also some weird development-stage-specific bottlenecks like
adding a column to 34m row table and updating it to default value (I
wasn't able to wait till the end of that query, it probably runs for
days); I think I understand why it is slow based on forum posts, but
again, MSSQL and MySQL do it better.

Also there's a general impression of everything being slow even after
MySQL on the same server, starting from seqscans to aggregate
operations; I won't be able to supply concrete evidence though (not
yet anyway), it's just a gut feeling.

#10Sergei Shelukhin
realgeek@gmail.com
In reply to: Sergei Shelukhin (#9)
Re: VACUUM ANALYZE extremely slow

errr... workmem is 256Mb of course, and 5m for explain analyze costs.

#11Scott Marlowe
smarlowe@g2switchworks.com
In reply to: Sergei Shelukhin (#1)
Re: VACUUM ANALYZE extremely slow

Sergei Shelukhin wrote:

This is my first (and, by the love of the God, last) project w/pgsql
and everything but the simplest selects is so slow I want to cry.
This is especially bad with vacuum analyze - it takes several hours
for a database of mere 15 Gb on a fast double-core server w/2Gb of RAM
and virtually no workload at the moment. Maintenance work mem is set
to 512 Mb.

I noticed you didn't mention your disk subsystem. PostgreSQL tends to
use a fair bit of disk I/O when running vacuum and / or analyze. If you
check with top / iostat while vacuum analyze is running, I'm betting
you'll see a lot of waiting on I/O going on.

You do know those two commands (vacuum and analyze) aren't married
anymore, right? You can run analyze all by itself if you want?

And I hope you're not running vacuum analyze full all the time, cause
there's usually no need for that.

Look up pg_autovacuum. Saves a lot of hair pulling.

Is there any way to speed up ANALYZE?

Analyze is usually REALLY fast. Even on my rather pokey workstation,
with a single SATA hard drive and other things to do, I can run analyze
on a 31 Gig database in ...

Without it all the queries run
so slow that I want to cry after a couple of hours of operation and
with it system has to go down for hours per day and that is
unacceptable.

You should only need to run analyze every so often. You should only
need vacuum after lots of updates / deletes. You should not need to
take the system down to vacuum, as vacuum doesn't block. Vacuum full
does block, but if you need that you either aren't vacuuming often
enough or you don't have the autovacuum daemon configured.

The same database running on mysql on basically the same server used
to run optimize table on every table every half an hour without any
problem. I am actually pondering scrapping half the work on the
conversion and stuff and going back to mysql but I wonder if there's
some way to improve it.

And when you ran optimize on those tables, were they not locked for
regular users the whole time?

There may be a way to improve it. Tell us, what OS are you running,
what are your non-default postgresql.conf settings, what ACTUAL commands
are you running here? Vacuum, vacuum analyze, vacuum full analyze? Are
you inserting / deleting / updating tons of rows between vacuums and /
or analyzes?



#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Sergei Shelukhin (#9)
Re: VACUUM ANALYZE extremely slow

Sergei Shelukhin wrote:

* What other non-default configuration settings do you have?
I played with shared_buffers, setting it between 16k and 32k; ~24k
seems to be the best but the difference is minimal. The work_mem
setting is 256kb, and I increased effective_cache_size to ~700Mb (~35%
of RAM) based on a tip from some article.
max_fsm_* is increased too, to accommodate vacuum analyze runs.

work_mem = 256 MB is probably too much; you might be filling your RAM
with it, causing the system to swap. Try decreasing it. This would be
particularly noticeable under load. Keep in mind that work_mem is per
sort/hash and per backend, i.e. if you have a query which does 3 sorts
and is being executed by 5 processes in parallel, you will have 3 * 5 *
256 MB = 3840 MB of RAM in active use.
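
The usual pattern is a modest global default with per-session raises only where a big sort genuinely needs it; a sketch (the query and values are hypothetical, and 8.1 takes the setting in kB):

```sql
-- postgresql.conf keeps a small default, e.g. work_mem = 4096 (4 MB)
SET work_mem = 262144;                       -- 256 MB, for this session only
SELECT * FROM big_table ORDER BY some_col;   -- the one query that needs it
RESET work_mem;                              -- back to the server default
```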

* My "system has to go down" remark was probably a bit exaggerated;
the system is noticeably slower when the vacuum is running and that is
basically without workload; it will probably become unusable under
load.

You can set the vacuum_cost_* settings in order to reduce the impact of
vacuum on the available I/O.
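
In postgresql.conf that is roughly the following (illustrative values; tune against your own I/O):

```ini
# postgresql.conf -- throttle vacuum's I/O
vacuum_cost_delay = 20    # ms to sleep once the cost limit is reached (0 = off)
vacuum_cost_limit = 200   # accumulated page-cost that triggers the sleep
```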

* Specific queries that were slow: there are generally no specific
queries, everything runs slowly, mostly due to estimates being grossly
inaccurate (like 800k cost for a complex query based on Explain and
then 5, actual cost based on e.analyze)

You might want to try increasing the statistics targets for the
problematic columns. It would be useful to choose one of these queries
and send it along for dissection here on the lists, to investigate the
actual problems.
Misestimations are a problem at times, but most of the time there are
workarounds.
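
For a column with a skewed distribution that looks like this (table and column names hypothetical; 8.1's default statistics target is 10):

```sql
-- Raise the sample size for one column so the planner sees finer-grained stats
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 100;
-- Re-sample the table so the new target takes effect
ANALYZE orders;
```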

--
Alvaro Herrera http://www.advogato.org/person/alvherre
"Oh, great altar of passive entertainment, bestow upon me thy discordant images
at such speed as to render linear thought impossible" (Calvin a la TV)