Everlasting SQL query

Started by Joost Kraaijeveldover 21 years ago5 messagesgeneral
Jump to latest
#1Joost Kraaijeveld
J.Kraaijeveld@Askesis.nl

Hi everyone,

I have a customer table (17518 records) and an orders table (88393 records). One of the columns of orders is customerid, containing the customerid (what else, but it is not a foreign key as this table is imported from a database that did not support foreign keys).

If I do this query (with pgadmin III):

select customer.id, customer.name, orders.id
from customers, orders
order by customer.id, orders.id
limit 25

The query runs forever (the longest I let it run is 500 seconds).

Explain gives me this (why 7 rows?):

QUERY PLAN
-------------------------------------------------------------------------------------
Limit (cost=722506879.16..722506879.22 rows=25 width=44)
-> Sort (cost=722506879.16..726378050.59 rows=1548468574 width=44)
Sort Key: klt_alg.klantnummer, orders.ordernummer
-> Nested Loop (cost=9408.93..36288661.59 rows=1548468574 width=44)
-> Seq Scan on klt_alg (cost=0.00..1927.18 rows=17518 width=40)
-> Materialize (cost=9408.93..10595.86 rows=88393 width=4)
-> Seq Scan on orders (cost=0.00..9105.93 rows=88393 width=4)
(7 rows)

If I only order by customer.id or by orders.is the query return within a second.

Can anyone give me a reason why this all happens?

Groeten,

Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: J.Kraaijeveld@Askesis.nl
web: www.askesis.nl

#2Csaba Nagy
nagy@ecircle-ag.com
In reply to: Joost Kraaijeveld (#1)
Re: Everlasting SQL query

Joost,

The spelling mistake from below will cause you an unwanted join with the
"customers" table if you have one, which is unconstrained by any where
clause, so it will take a carthesian product of the rest and the
customers table.
If this spelling mistake is just in the mail, please disregard.

Cheers,
Csaba.

select customer.id, customer.name, orders.id
from customers, orders

^^^^^^^^^

Show quoted text

order by customer.id, orders.id
limit 25

#3Michal Taborsky
michal@taborsky.cz
In reply to: Joost Kraaijeveld (#1)
Re: Everlasting SQL query

Hi Joost.

Joost Kraaijeveld wrote:

I have a customer table (17518 records) and an orders table (88393
records). One of the columns of orders is customerid, containing the
customerid (what else, but it is not a foreign key as this table is
imported from a database that did not support foreign keys).

If I do this query (with pgadmin III):

select customer.id, customer.name, orders.id from customers, orders
order by customer.id, orders.id limit 25

The query runs forever (the longest I let it run is 500 seconds).

No wonder. You are retrieving 1548468574 rows. You are trying to perform
a JOIN, but without specifying which fields to join on. So the query
works with cartesian product of these two table (all possible
combinantions), which is 17518 * 88393 = 1548468574 rows.

You want:

select customer.id, customer.name, orders.id
from customers JOIN orders ON customers.id=orders.customerid
order by customer.id, orders.id
limit 25

or alternatively:

select customer.id, customer.name, orders.id
from customers, orders
where customers.id=orders.customerid
order by customer.id, orders.id
limit 25

I prefer the first notation, though.

--
Michal Taborsky
http://www.taborsky.cz

#4Michael Glaesemann
grzm@seespotcode.net
In reply to: Joost Kraaijeveld (#1)
Re: Everlasting SQL query

On Jul 28, 2004, at 7:08 PM, Joost Kraaijeveld wrote:

select customer.id, customer.name, orders.id
from customers, orders
order by customer.id, orders.id
limit 25

The query runs forever (the longest I let it run is 500 seconds).

You have no join condition, so it's doing a full cartesian join (17518
x 88393 = 1,548,468,574 rows before the limit). Try this:

select customer.id, customer.name, orders.id
from customers c, orders o
where c.id = o.customerid
order by customer.id, orders.id
limit 25

Michael Glaesemann
grzm myrealbox com

#5Chris Mair
list@1006.org
In reply to: Joost Kraaijeveld (#1)
Re: Everlasting SQL query

On Wed, 2004-07-28 at 12:08, Joost Kraaijeveld wrote:

Hi everyone,

I have a customer table (17518 records) and an orders table (88393 records). One of the columns of orders is customerid, containing the customerid (what else, but it is not a foreign key as this table is imported from a database that did not support foreign keys).

If I do this query (with pgadmin III):

select customer.id, customer.name, orders.id
from customers, orders
order by customer.id, orders.id
limit 25

The query runs forever (the longest I let it run is 500 seconds).
[...]

If I only order by customer.id or by orders.is the query return within a second.

Can anyone give me a reason why this all happens?

This is an inner join without a where clause.
It gives the crsoo product of 17518*88393 = 1548468574 results.

If you order by just one, PG can manage to give you the first 25 results
out of the 1.5 billion (!). If you order by both there's no other
way than to (try) computing everything - which PG does.

You most likely don't want this. Add a where clause:
where order.customer_id = customer.id
or something like that (I'm just guessing your scheme).

Bye, Chris.