good experience with performance in 8.2 for multi column indexes

Started by Michael Enkeover 18 years ago4 messagesgeneral

michael.enke@wincor-nixdorf.com

over 18 years ago

Hi lists,
I want to let you take part in my experience of performance boost for delete operations
where more than one column is part of a primary key.

For my setup, in 8.1 a delete query which deletes 200000 entries depending on rows in another table
runs about 7h, in 8.2 (and later) it runs 9s!

I have two tables looking exactly the same, with two columns in the pk, one varchar(20) and one char(1).
Both tables contain the same contents.

Explain produces the following difference:
8.1:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in (select plunmbr,pluexttype from pluext2);
QUERY PLAN
----------------------------------------------------------------------------------------
Hash Join (cost=24267.10..155886.35 rows=48236 width=6)
Hash Cond: ("outer".pluexttype = "inner".pluexttype)
Join Filter: ("outer".plunmbr = "inner".plunmbr)
-> Seq Scan on pluext1 (cost=0.00..6945.00 rows=138900 width=46)
-> Hash (cost=24116.37..24116.37 rows=13891 width=40)
-> Unique (cost=23074.62..24116.37 rows=13891 width=40)
-> Sort (cost=23074.62..23421.87 rows=138900 width=40)
Sort Key: pluext2.plunmbr, pluext2.pluexttype
-> Seq Scan on pluext2 (cost=0.00..6945.00 rows=138900 width=40)
(9 rows)

(this delete took 7h)

8.2:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in (select plunmbr,pluexttype from pluext2);
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Nested Loop IN Join (cost=0.00..13362.14 rows=41106 width=6)
-> Seq Scan on pluext1 (cost=0.00..6411.25 rows=128225 width=46)
-> Index Scan using pluext2_pk on pluext2 (cost=0.00..0.50 rows=3 width=40)
Index Cond: ((pluext1.plunmbr = pluext2.plunmbr) AND (pluext2.pluexttype = pluext1.pluexttype))
(4 rows)

(this delete took 9s)

I could not find an explanation for this in the release notes for 8.2,
I thought it was much earlier that multi column indexes could be used.
Anyway, it saved my life that new version is fast. Many thanks to the developer!

BTW I do not understand the output of the 8.2 explain:
From my understanding it should do a seq scan on pluext2 (and not pluext1)
and an index scan on pluext1 (and not pluext2).

Regards,
Michael

--
Wincor Nixdorf International GmbH
Sitz der Gesellschaft: Paderborn
Registergericht Paderborn HRB 3507
Gesch��ftsf��hrer: Eckard Heidloff (Vorsitzender), Stefan Auerbach, Dr. J��rgen Wunram
Vorsitzender des Aufsichtsrats: Karl-Heinz Stiller
Steuernummer: 339/5884/0020 - Ust-ID Nr.: DE812927716 - WEEE-Reg.-Nr. DE44477193

Diese E-Mail enth��lt vertrauliche Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail irrt��mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser E-Mail ist nicht gestattet.

This e-mail may contain confidential information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

Thomas Markus

t.markus@proventis.net

over 18 years ago

In reply to: Michael Enke (#1)

Re: good experience with performance in 8.2 for multi column indexes

Hi,

your query cant perform well on 8.1
better use a query like

delete from pluext1 using pluext2 where pluext1.plunmbr =
pluext2.plunmbr and pluext1.pluexttype = pluext2.pluexttype

it should perform much faster. be sure to use indizes

regards
thomas

Michael Enke schrieb:

Show quoted text

Hi lists,
I want to let you take part in my experience of performance boost for
delete operations
where more than one column is part of a primary key.

For my setup, in 8.1 a delete query which deletes 200000 entries
depending on rows in another table
runs about 7h, in 8.2 (and later) it runs 9s!

I have two tables looking exactly the same, with two columns in the
pk, one varchar(20) and one char(1).
Both tables contain the same contents.

Explain produces the following difference:
8.1:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in
(select plunmbr,pluexttype from pluext2);
QUERY PLAN
----------------------------------------------------------------------------------------

Hash Join (cost=24267.10..155886.35 rows=48236 width=6)
Hash Cond: ("outer".pluexttype = "inner".pluexttype)
Join Filter: ("outer".plunmbr = "inner".plunmbr)
-> Seq Scan on pluext1 (cost=0.00..6945.00 rows=138900 width=46)
-> Hash (cost=24116.37..24116.37 rows=13891 width=40)
-> Unique (cost=23074.62..24116.37 rows=13891 width=40)
-> Sort (cost=23074.62..23421.87 rows=138900 width=40)
Sort Key: pluext2.plunmbr, pluext2.pluexttype
-> Seq Scan on pluext2 (cost=0.00..6945.00
rows=138900 width=40)
(9 rows)

(this delete took 7h)

8.2:

tplinux=> explain delete from pluext1 where (plunmbr,pluexttype) in
(select plunmbr,pluexttype from pluext2);
QUERY PLAN
---------------------------------------------------------------------------------------------------------

Nested Loop IN Join (cost=0.00..13362.14 rows=41106 width=6)
-> Seq Scan on pluext1 (cost=0.00..6411.25 rows=128225 width=46)
-> Index Scan using pluext2_pk on pluext2 (cost=0.00..0.50 rows=3
width=40)
Index Cond: ((pluext1.plunmbr = pluext2.plunmbr) AND
(pluext2.pluexttype = pluext1.pluexttype))
(4 rows)

(this delete took 9s)

I could not find an explanation for this in the release notes for 8.2,
I thought it was much earlier that multi column indexes could be used.
Anyway, it saved my life that new version is fast. Many thanks to the
developer!

BTW I do not understand the output of the 8.2 explain:
From my understanding it should do a seq scan on pluext2 (and not
pluext1)
and an index scan on pluext1 (and not pluext2).

Regards,
Michael

Tom Lane

tgl@sss.pgh.pa.us

over 18 years ago

In reply to: Michael Enke (#1)

Re: good experience with performance in 8.2 for multi column indexes

Michael Enke <michael.enke@wincor-nixdorf.com> writes:

For my setup, in 8.1 a delete query which deletes 200000 entries depending on rows in another table
runs about 7h, in 8.2 (and later) it runs 9s!

If those rowcount estimates are even close to accurate, there's no
reason for the hash plan to be so much faster than the indexscan.
I think the real issue is somewhere else. Do you have any triggers
or foreign keys on this table?

regards, tom lane

Michael Enke

michael.enke@wincor-nixdorf.com

over 18 years ago

In reply to: Tom Lane (#3)

Re: good experience with performance in 8.2 for multi column indexes

No, both negative.

Michael

Tom Lane wrote:

Michael Enke <michael.enke@wincor-nixdorf.com> writes:

For my setup, in 8.1 a delete query which deletes 200000 entries depending on rows in another table
runs about 7h, in 8.2 (and later) it runs 9s!

If those rowcount estimates are even close to accurate, there's no
reason for the hash plan to be so much faster than the indexscan.
I think the real issue is somewhere else. Do you have any triggers
or foreign keys on this table?

regards, tom lane