High-CPU consumption on information_schema (only) query

Started by Robins Tharakanover 9 years ago7 messages
#1Robins Tharakan
tharakan@gmail.com

Hi,

An SQL (with only information_schema related JOINS) when triggered, runs
with max CPU (and never ends - killed after 2 days).
- It runs similarly (very slow) on a replicated server that acts as a
read-only slave.
- Top shows only postgres as hitting max CPU (nothing else). When query
killed, CPU near 0%.
- When the DB is restored on a separate test server (with the exact
postgresql.conf) the same query works fine.
- There is no concurrent usage on the replicated / test server (although
the primary is a Production server and has concurrent users).

Questions:
- If this was a postgres bug or a configuration issue, query on the
restored DB should have been slow too. Is there something very basic I am
missing here?

If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
here. I amn't looking for a specific solution but what else should I be
looking for here?

p.s.: All postgres servers are running the v9.3.10

-
robins
--

-
robins

#2Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Robins Tharakan (#1)
Re: High-CPU consumption on information_schema (only) query

On 8 Sep. 2016 7:38 am, "Robins Tharakan" <tharakan@gmail.com> wrote:

Hi,

An SQL (with only information_schema related JOINS) when triggered, runs

with max CPU (and never ends - killed after 2 days).

- It runs similarly (very slow) on a replicated server that acts as a

read-only slave.

- Top shows only postgres as hitting max CPU (nothing else). When query

killed, CPU near 0%.

- When the DB is restored on a separate test server (with the exact

postgresql.conf) the same query works fine.

- There is no concurrent usage on the replicated / test server (although

the primary is a Production server and has concurrent users).

Questions:
- If this was a postgres bug or a configuration issue, query on the

restored DB should have been slow too. Is there something very basic I am
missing here?

If someone asks for I could provide SQL + EXPLAIN, but it feels

irrelevant here. I amn't looking for a specific solution but what else
should I be looking for here?

Get a series of stack traces.

Perf with stack output would be good too.

You need debug info for both.

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Robins Tharakan (#1)
Re: High-CPU consumption on information_schema (only) query

On Wed, Sep 7, 2016 at 4:37 PM, Robins Tharakan <tharakan@gmail.com> wrote:

Hi,

An SQL (with only information_schema related JOINS) when triggered, runs

with max CPU (and never ends - killed after 2 days).

- It runs similarly (very slow) on a replicated server that acts as a

read-only slave.

- Top shows only postgres as hitting max CPU (nothing else). When query

killed, CPU near 0%.

- When the DB is restored on a separate test server (with the exact

postgresql.conf) the same query works fine.

- There is no concurrent usage on the replicated / test server (although

the primary is a Production server and has concurrent users).

Questions:
- If this was a postgres bug or a configuration issue, query on the

restored DB should have been slow too. Is there something very basic I am
missing here?

If someone asks for I could provide SQL + EXPLAIN, but it feels

irrelevant here. I amn't looking for a specific solution but what else
should I be looking for here?

strace -ttt -T -y the process to see what system calls it is making. If it
is not doing many systme calls, or they are uninformative, then attach the
gdb debugger to it and periodically interrupt the process (ctrl c) and take
a back trace (bt), then restart it (c) and repeat. If all the stack traces
look similar, you will know where the time is going.

Cheers,

Jeff

#4Andres Freund
andres@anarazel.de
In reply to: Robins Tharakan (#1)
Re: High-CPU consumption on information_schema (only) query

On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote:

If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
here.

Why is that? information_schema are normal sql queries, and some of them
far from trivial.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robins Tharakan
tharakan@gmail.com
In reply to: Andres Freund (#4)
Re: High-CPU consumption on information_schema (only) query

On Fri, 9 Sep 2016 at 09:39 Andres Freund <andres@anarazel.de> wrote:

On 2016-09-07 23:37:31 +0000, Robins Tharakan wrote:

If someone asks for I could provide SQL + EXPLAIN, but it feels

irrelevant

here.

Why is that? information_schema are normal sql queries, and some of them
far from trivial.

Andres

Hi Andres,

I completely agree. With 'irrelevant' I was only trying to imply that
irrespective of the complexity of the query, a replicated box was seeing
similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
isn't to blame here...

In effect, I was trying to ask if I am forgetting / missing something very
obvious / important that could cause such an observation.

As others recommended, I am unable to have direct access to the production
(master / slave) instances and so GDB / stack trace options are out of
bounds at this time. I'll revert if I am able to do that.

-
thanks
robins

--

-
robins

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robins Tharakan (#5)
Re: High-CPU consumption on information_schema (only) query

Robins Tharakan <tharakan@gmail.com> writes:

I completely agree. With 'irrelevant' I was only trying to imply that
irrespective of the complexity of the query, a replicated box was seeing
similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
isn't to blame here...

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Robins Tharakan
tharakan@gmail.com
In reply to: Tom Lane (#6)
Re: High-CPU consumption on information_schema (only) query

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.

regards, tom lane

Thanks. EXPLAIN plans were different but (don't have them now and) didn't
know system catalogs were so severely affected by outdated statistics as
well (which is why I was looking for any other reasons I might be missing). I
reckon an ANALYSE; should solve this? ... Would get back if I have
something else to offer.

-
robins
--

-
robins