7.2.1 optimises very badly against 7.2

Started by Sam Liddicottalmost 24 years ago4 messagesgeneral
Jump to latest
#1Sam Liddicott
sam.liddicott@ananova.com

I have a 10GB database which serves up customised tv listings selections at
www.ananova.com/tv <http://www.ananova.com/tv&gt; (see "Personalise Listings"
in the left column)

We had it running very well under postgres 7.2 on a 4xPentium 700mhz with
8GB RAM

For a personalised selection from "start" to "end" of 7 channels [url1
below]
takes a fraction of a second to do the query and if I'm lucky enough to spot
it on "top" it uses low CPU percentage.

Under 7.2.1 it takes 99% CPU for between 5-9 seconds.
Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
back, then it is very fast again.
[We normally vacuum analyse every 24 hours]

We have max connections=128, 4520 shared buffers 91268 sortmem and collect
row level and block level stats.

I also have this anecodotal information;
we installed 7.2.1 on a fresh box and then restored a pg_dump of the tv
database and found it always very slow, stopping postgres and transferring
the binary DB files and restarting was very fast but degraded slowly over a
few days.

So I imagine there is something about stats gathering and use changed with
7.2.1 (I hear it has a new optimiser).

The query we do is complex and really does need a good optimiser (why we
don't use mysql) as it has to join programmes against broadcasts (restricted
by time) to channels (restricted to interested channels). It has to be
careful not to initially start with all broadcasts of a interested channel
as well as not all broadcasts on the interested channels.

[url1, 7 channels from midnight to about 6:00am day after]
http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1
<http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1&amp;start=Start
&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1>
&start=Start&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1

_____

Samuel Liddicott
Support Consultant
sam@ananova.com <mailto:sam@ananova.com>
Direct Dial: +44 (0)113 367 4523
Fax: +44 (0)113 367 4680
Switchboard: +44 (0)113 367 4600

Ananova Limited
Marshall Mill
Marshall Street
Leeds
LS11 9YJ

http://www.ananova.com

Registered Office:
St James Court
Great Park Road
Almondsbury Park
Bradley Stoke
Bristol BS32 4QJ
Registered in England No.2858918

The information transmitted is intended only for the person or entity to
which it is addressed and may contain confidential and/or privileged
material. Any review, retransmission, dissemination or other use of, or
taking of any action in reliance upon, this information by persons or
entities other than the intended recipient is prohibited. If you receive
this in error, please contact the sender and delete the material from any
computer.

#2Nigel J. Andrews
nandrews@investsystems.co.uk
In reply to: Sam Liddicott (#1)
Re: 7.2.1 optimises very badly against 7.2

On Fri, 28 Jun 2002, Sam Liddicott wrote:

I have a 10GB database which serves up customised tv listings selections at
www.ananova.com/tv <http://www.ananova.com/tv&gt; (see "Personalise Listings"
in the left column)

We had it running very well under postgres 7.2 on a 4xPentium 700mhz with
8GB RAM

For a personalised selection from "start" to "end" of 7 channels [url1
below]
takes a fraction of a second to do the query and if I'm lucky enough to spot
it on "top" it uses low CPU percentage.

Under 7.2.1 it takes 99% CPU for between 5-9 seconds.
Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
back, then it is very fast again.
[We normally vacuum analyse every 24 hours]

We have max connections=128, 4520 shared buffers 91268 sortmem and collect
row level and block level stats.

I'm thinking that you need to run some queries with explain analyze ... to see
what the planner thinks it should do and how it's estimates differ from
reality under both 7.2 and 7.2.1.

I also have this anecodotal information;
we installed 7.2.1 on a fresh box and then restored a pg_dump of the tv
database and found it always very slow, stopping postgres and transferring
the binary DB files and restarting was very fast but degraded slowly over a
few days.

I know nothing of the filesystem structure but that sounds odd. Like there's
something introduced into 7.2.1 that's inherently slow. I assume there's also
data loads over those few days.

So I imagine there is something about stats gathering and use changed with
7.2.1 (I hear it has a new optimiser).

Well you'll be able to see if it's the stats. that are causing this by doing
explains and comparing pg_stats for the 7.2 and pg_restored 7.2.1.

You could also do the binary transfer from 7.2 to 7.2.1 again and check the
explains and pg_stats immediately after and then after it's slowed down. This
is probably most useful if there are _no_ data loads in the meantime.

The query we do is complex and really does need a good optimiser (why we
don't use mysql) as it has to join programmes against broadcasts (restricted
by time) to channels (restricted to interested channels). It has to be
careful not to initially start with all broadcasts of a interested channel
as well as not all broadcasts on the interested channels.

[url1, 7 channels from midnight to about 6:00am day after]
http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1
<http://www.ananova.com/tv_listings/tv_mainlisting.html?day=day1&amp;start=Start
&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1>
&start=Start&end=End&tvregion=n_14&i=1&p=0&h=&c=12.25.36.53.54.86.33&S=1

Interesting that your stuff completes so quick normally. I worked on a large TV
listings site that had complex queries and ran with Oracle. It's queries took
ages to run mostly. I didn't design the system btw although I did have a hand
in some lucky chap winning two or three short breaks one week when we tweaked a
slow query and it turned out to be slightly more complex than we thought when
we tweaked it.

--
Nigel J. Andrews
Director

---
Logictree Systems Limited
Computer Consultants

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Sam Liddicott (#1)
Re: 7.2.1 optimises very badly against 7.2

"Sam Liddicott" <sam.liddicott@ananova.com> writes:

Under 7.2.1 it takes 99% CPU for between 5-9 seconds.
Rolling back to 7.2 is also very slow unless we vacuum analyse after rolling
back, then it is very fast again.
[We normally vacuum analyse every 24 hours]

AFAIK, the only change from 7.2 to 7.2.1 that would be likely to have
anything to do with this was a rework of the code in ANALYZE that
estimates the number of distinct values in a column. (See pghackers
archives from around 18-Feb.) Although the revised version did better
on some test cases, it sounds like it's doing worse for you.

As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE
results. I'd also like to see the pg_stats entries for the table(s)
used by the query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE.

regards, tom lane

#4Sam Liddicott
sam.liddicott@ananova.com
In reply to: Tom Lane (#3)
Re: 7.2.1 optimises very badly against 7.2

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: 28 June 2002 15:13
To: Sam Liddicott
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] 7.2.1 optimises very badly against 7.2

As Nigel commented, we can't do much without seeing EXPLAIN ANALYZE
results. I'd also like to see the pg_stats entries for the table(s)
used by the query, as produced by both 7.2 ANALYZE and 7.2.1 ANALYZE.

Thanks for your comments (and everyone else).
I'll get these stats and explain analyse out on Monday and we can have a
good luck.
Just doing some swig-java work for my boss.

Sam