performance advice needed: join vs explicit subselect

Started by Karsten Hilbertabout 17 years ago11 messagesgeneral
Jump to latest
#1Karsten Hilbert
Karsten.Hilbert@gmx.net

Hello all,

maybe some general advice can be had on this:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

Is there general advice as to which of the alternatives is
worse under most if not all circumstances ?

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#2Justin
justin@emproshunts.com
In reply to: Karsten Hilbert (#1)
Re: performance advice needed: join vs explicit subselect

Karsten Hilbert wrote:

Hello all,

maybe some general advice can be had on this:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

Select testresults.*, Modifer.Name, Intended.name, Actual.name from
testresults
left join (Select pk, name from staff) Modifer
on Modifer.pk = testresults.modified_by
left join (Select pk, name from staff) Intended
on Reviewer.pk = testresults.intended_reviewer
left join (Select pk, name from staff) Actual
on pk = testresults.actual_reviewer

This is what i think you are after. You can do this via nested queries
also for each name

#3Justin
justin@emproshunts.com
In reply to: Justin (#2)
Re: performance advice needed: join vs explicit subselect

typo sorry

justin wrote:

Show quoted text

Karsten Hilbert wrote:

Hello all,

maybe some general advice can be had on this:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

Select testresults.*, Modifer.Name, Intended.name, Actual.name from
testresults
left join (Select pk, name from staff) Modifer
on Modifer.pk = testresults.modified_by
left join (Select pk, name from staff) Intended
on Inteded.pk = testresults.intended_reviewer
left join (Select pk, name from staff) Actual
on Actual.pk = testresults.actual_reviewer

This is what i think you are after. You can do this via nested
queries also for each name

#4Sam Mason
sam@samason.me.uk
In reply to: Karsten Hilbert (#1)
Re: performance advice needed: join vs explicit subselect

On Tue, Jan 27, 2009 at 07:12:05PM +0100, Karsten Hilbert wrote:

Hello all,

maybe some general advice can be had on this:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

if you've only got three columns it shouldn't be too bad should it?

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

This would look a bit prettier, but PG tends not to optimize at all. It
always executes it as a subplan and hence will only work nicely when
you've got a very small subset of the test_results coming back. PG will
*sometimes* remove subexpressions, but doesn't seem very predictable
about it:

SELECT id
FROM (
SELECT a.id, (SELECT b.name FROM bar b WHERE a.tid = b.tid)
FROM foo a) x;

PG seems to recognize that it can remove the subselect in the above
which is nice, but in other situations it doesn't seem to.

--
Sam http://samason.me.uk/

#5Alban Hertroys
dalroi@solfertje.student.utwente.nl
In reply to: Karsten Hilbert (#1)
Re: performance advice needed: join vs explicit subselect

On Jan 27, 2009, at 7:12 PM, Karsten Hilbert wrote:

Hello all,

maybe some general advice can be had on this:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

Is there general advice as to which of the alternatives is
worse under most if not all circumstances ?

I did something similar once using expression logic for my aggregates:

SELECT
SUM(CASE WHEN modified_by = pk THEN 1 ELSE 0 END) AS modified_by_count,
SUM(CASE WHEN intended_reviewer = pk THEN 1 ELSE 0 END) AS
intended_reviewer_count,
SUM(CASE WHEN actual_reviewer = pk THEN 1 ELSE 0 END) AS
actual_reviewer_count
FROM test_results, staff
WHERE pk IN (modified_by, intended_reviewer, actual_reviewer)

Mind, this will very probably do a sequential scan over the product of
both tables, but at least now the staff table is in that product only
once.

In actuality I didn't use CASE statements but cast the boolean results
of the expressions directly to integer, something like
SUM((modified_by = pk)::int), but that cast may no longer work since
8.3.

I no longer have access to the project that I used this on, so I can't
verify unfortunately.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,497f5aa8747035160810079!

#6Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Sam Mason (#4)
Re: performance advice needed: join vs explicit subselect

On Tue, Jan 27, 2009 at 06:48:11PM +0000, Sam Mason wrote:

table test_results
modified_by integer foreign key staff(pk),
intended_reviewer integer foreign key staff(pk),
actual_reviewer integer foreign key staff(pk)

(this table will contain millions of rows)

table staff
pk integer
name text

(this table will contain at most 50 rows)

Now I want to set up a view which aggregates test results
with staff names for all three foreign keys. This would mean
I would either have to

- join test_results to staff three times, once for each
of the foreign keys, this is going to be messy with
tracking table aliases, duplicate column names etc

if you've only got three columns it shouldn't be too bad should it?

This is what one deserves for thinking to be able to distill
the essence of a problem :-)

The view in question is in fact a lot more complicated. This
is the best I've been able to come up with so far (and it is
still slow - slow as in 3-4 seconds for 20 records out of
(currently only) 50 !):

create view clin.v_test_results as

select
cenc.fk_patient
as pk_patient,
-- test_result
tr.pk as pk_test_result,
tr.clin_when,
-- unified
vttu.unified_code,
vttu.unified_name,
case when coalesce(trim(both from tr.val_alpha), '') = ''
then tr.val_num::text
else case when tr.val_num is null
then tr.val_alpha
else tr.val_num::text || ' (' || tr.val_alpha || ')'
end
end as unified_val,
coalesce(tr.val_target_min, tr.val_normal_min)
as unified_target_min,
coalesce(tr.val_target_max, tr.val_normal_max)
as unified_target_max,
coalesce(tr.val_target_range, tr.val_normal_range)
as unified_target_range,
tr.soap_cat,
tr.narrative
as comment,
-- test result data
tr.val_num,
tr.val_alpha,
tr.val_unit,
vttu.conversion_unit,
tr.val_normal_min,
tr.val_normal_max,
tr.val_normal_range,
tr.val_target_min,
tr.val_target_max,
tr.val_target_range,
tr.abnormality_indicator,
tr.norm_ref_group,
tr.note_test_org,
tr.material,
tr.material_detail,
-- test type data
vttu.code_tt,
vttu.name_tt,
vttu.coding_system_tt,
vttu.comment_tt,
vttu.code_unified,
vttu.name_unified,
vttu.coding_system_unified,
vttu.comment_unified,

-- episode/issue data
epi.description
as episode,

-- status of last review
coalesce(rtr.fk_reviewed_row, 0)::bool
as reviewed,
rtr.is_technically_abnormal
as is_technically_abnormal,
rtr.clinically_relevant
as is_clinically_relevant,
rtr.comment
as review_comment,

(select
short_alias || ' (' ||
coalesce(title || ' ', '') ||
coalesce(firstnames || ' ', '') ||
coalesce(lastnames, '') ||
')'
from dem.v_staff
where pk_staff = rtr.fk_reviewer
) as last_reviewer,

rtr.modified_when
as last_reviewed,

coalesce (
(rtr.fk_reviewer = (select pk from dem.staff where db_user = current_user)),
False
)
as review_by_you,

coalesce (
(tr.fk_intended_reviewer = rtr.fk_reviewer),
False
)
as review_by_responsible_reviewer,

-- potential review status
(select
short_alias || ' (' ||
coalesce(title || ' ', '') ||
coalesce(firstnames || ' ', '') ||
coalesce(lastnames, '') ||
')'
from dem.v_staff
where pk_staff = tr.fk_intended_reviewer
) as responsible_reviewer,

coalesce (
(tr.fk_intended_reviewer = (select pk from dem.staff where db_user = current_user)),
False
)
as you_are_responsible,

case when ((select 1 from dem.staff where db_user = tr.modified_by) is null)
then '<' || tr.modified_by || '>'
else (select short_alias from dem.staff where db_user = tr.modified_by)
end
as modified_by,

tr.modified_when,
tr.row_version as row_version,

-- management keys
-- clin.clin_root_item
tr.pk_item,
tr.fk_encounter as pk_encounter,
tr.fk_episode as pk_episode,
-- test_result
tr.fk_type as pk_test_type,
tr.fk_intended_reviewer as pk_intended_reviewer,
tr.xmin as xmin_test_result,
-- v_unified_test_types
vttu.pk_test_org,
vttu.pk_test_type_unified,
-- v_pat_episodes
epi.fk_health_issue
as pk_health_issue,
-- reviewed_test_results
rtr.fk_reviewer as pk_last_reviewer
from
clin.test_result tr
left join clin.encounter cenc on (tr.fk_encounter = cenc.pk)
left join clin.episode epi on (tr.fk_episode = epi.pk)
left join clin.reviewed_test_results rtr on (tr.pk = rtr.fk_reviewed_row)
,
clin.v_unified_test_types vttu
where
tr.fk_type = vttu.pk_test_type
;

- write three explicit sub-selects for the columns I want
to denormalize into the view definition

This would look a bit prettier, but PG tends not to optimize at all. It
always executes it as a subplan and hence will only work nicely when
you've got a very small subset of the test_results coming back.

Potentially in the low hundreds.

Thanks !
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Karsten Hilbert (#6)
Re: performance advice needed: join vs explicit subselect

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:

The view in question is in fact a lot more complicated. This
is the best I've been able to come up with so far (and it is
still slow - slow as in 3-4 seconds for 20 records out of
(currently only) 50 !):

What does EXPLAIN ANALYZE say about it? Also, what is the use-case
you are concerned about --- selecting the whole view contents, or
selecting WHERE something-or-other?

regards, tom lane

#8Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Tom Lane (#7)
Re: performance advice needed: join vs explicit subselect

On Tue, Jan 27, 2009 at 04:42:02PM -0500, Tom Lane wrote:

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:

The view in question is in fact a lot more complicated. This
is the best I've been able to come up with so far (and it is
still slow - slow as in 3-4 seconds for 20 records out of
(currently only) 50 !):

What does EXPLAIN ANALYZE say about it? Also, what is the use-case
you are concerned about --- selecting the whole view contents, or
selecting WHERE something-or-other?

The query that's run by my application (wiki.gnumed.de) is

select *, xmin_test_result from clin.v_test_results
where pk_patient = 138 <--- this is a variable
order by clin_when desc, pk_episode, unified_name
;

the explain analyze of which is (I've actually gotten it to
work better in the meantime as you can see):

SET
BEGIN
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=8512.91..8512.92 rows=1 width=721) (actual time=2039.771..2039.787 rows=14 loops=1)
Sort Key: tr.clin_when, tr.fk_episode, (COALESCE(ttu.name, tt1.name))
Sort Method: quicksort Memory: 22kB
InitPlan
-> Seq Scan on staff (cost=0.00..1.07 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
Filter: (db_user = "current_user"())
-> Seq Scan on staff (cost=0.00..1.07 rows=1 width=4) (actual time=0.014..0.014 rows=0 loops=1)
Filter: (db_user = "current_user"())
-> Nested Loop Left Join (cost=3.29..8510.75 rows=1 width=721) (actual time=145.824..2039.427 rows=14 loops=1)
-> Nested Loop Left Join (cost=3.29..27.82 rows=1 width=671) (actual time=4.230..5.298 rows=14 loops=1)
-> Nested Loop Left Join (cost=3.29..23.66 rows=1 width=646) (actual time=4.209..5.061 rows=14 loops=1)
Join Filter: (tt1.pk = ltt2ut.fk_test_type)
-> Nested Loop (cost=2.20..21.42 rows=1 width=565) (actual time=4.089..4.444 rows=14 loops=1)
-> Merge Join (cost=2.20..20.79 rows=1 width=469) (actual time=4.069..4.201 rows=14 loops=1)
Merge Cond: (cenc.pk = tr.fk_encounter)
-> Index Scan using encounter_pkey on encounter cenc (cost=0.00..294.43 rows=16 width=8) (actual time=1.470..3.691 rows=29 loops=1)
Filter: (fk_patient = 138)
-> Sort (cost=2.20..2.29 rows=34 width=465) (actual time=0.279..0.330 rows=34 loops=1)
Sort Key: tr.fk_encounter
Sort Method: quicksort Memory: 25kB
-> Seq Scan on test_result tr (cost=0.00..1.34 rows=34 width=465) (actual time=0.027..0.141 rows=34 loops=1)
-> Index Scan using test_type_pkey on test_type tt1 (cost=0.00..0.62 rows=1 width=96) (actual time=0.007..0.009 rows=1 loops=14)
Index Cond: (tt1.pk = tr.fk_type)
-> Hash Join (cost=1.09..2.19 rows=4 width=89) (actual time=0.012..0.031 rows=4 loops=14)
Hash Cond: (ttu.pk = ltt2ut.fk_test_type_unified)
-> Seq Scan on test_type_unified ttu (cost=0.00..1.04 rows=4 width=85) (actual time=0.003..0.008 rows=4 loops=14)
-> Hash (cost=1.04..1.04 rows=4 width=8) (actual time=0.023..0.023 rows=4 loops=1)
-> Seq Scan on lnk_ttype2unified_type ltt2ut (cost=0.00..1.04 rows=4 width=8) (actual time=0.006..0.013 rows=4 loops=1)
-> Index Scan using episode_pkey on episode epi (cost=0.00..4.15 rows=1 width=29) (actual time=0.009..0.011 rows=1 loops=14)
Index Cond: (tr.fk_episode = epi.pk)
-> Index Scan using unique_review_per_row on reviewed_test_results rtr (cost=0.00..0.62 rows=1 width=50) (actual time=0.005..0.008 rows=1 loops=14)
Index Cond: (tr.pk = rtr.fk_reviewed_row)
SubPlan
-> Seq Scan on staff (cost=0.00..1.06 rows=1 width=4) (actual time=0.005..0.006 rows=1 loops=14)
Filter: (db_user = $20)
-> Seq Scan on staff (cost=0.00..1.06 rows=1 width=0) (actual time=0.006..0.007 rows=1 loops=14)
Filter: (db_user = $20)
-> Subquery Scan v_staff (cost=15.73..4240.07 rows=1 width=128) (actual time=29.739..74.520 rows=1 loops=14)
-> Nested Loop (cost=15.73..4240.04 rows=1 width=279) (actual time=29.731..74.510 rows=1 loops=14)
Join Filter: (s.fk_identity = i.pk)
-> Nested Loop (cost=0.00..2.31 rows=1 width=143) (actual time=0.015..0.051 rows=1 loops=14)
Join Filter: (s.fk_role = sr.pk)
-> Seq Scan on staff s (cost=0.00..1.06 rows=1 width=131) (actual time=0.006..0.008 rows=1 loops=14)
Filter: (pk = $12)
-> Seq Scan on staff_role sr (cost=0.00..1.11 rows=11 width=16) (actual time=0.002..0.018 rows=11 loops=14)
-> Hash Join (cost=15.73..4210.50 rows=207 width=120) (actual time=0.409..73.865 rows=209 loops=14)
Hash Cond: (n.id_identity = i.pk)
-> Seq Scan on names n (cost=0.00..4.27 rows=210 width=29) (actual time=0.007..0.387 rows=211 loops=14)
Filter: active
-> Hash (cost=13.12..13.12 rows=209 width=95) (actual time=0.883..0.883 rows=209 loops=1)
-> Seq Scan on identity i (cost=0.00..13.12 rows=209 width=95) (actual time=0.007..0.510 rows=209 loops=1)
Filter: ((deleted IS FALSE) AND (deceased IS NULL))
SubPlan
-> Nested Loop (cost=0.00..9.86 rows=1 width=8) (never executed)
-> Seq Scan on marital_status ms1 (cost=0.00..1.07 rows=1 width=8) (never executed)
Filter: (pk = $9)
-> Index Scan using identity_pkey on identity i1 (cost=0.00..8.27 rows=1 width=0) (never executed)
Index Cond: (i1.pk = $10)
-> Nested Loop (cost=0.00..9.35 rows=1 width=8) (never executed)
-> Seq Scan on marital_status ms (cost=0.00..1.07 rows=1 width=8) (never executed)
Filter: (pk = $9)
-> Index Scan using identity_pkey on identity i1 (cost=0.00..8.27 rows=1 width=0) (never executed)
Index Cond: (i1.pk = $10)
SubPlan
-> Result (cost=22.06..22.07 rows=1 width=0) (actual time=0.140..0.141 rows=1 loops=14)
InitPlan
-> Result (cost=11.02..11.03 rows=1 width=0) (actual time=0.068..0.069 rows=1 loops=14)
InitPlan
-> Index Scan using pg_authid_rolname_index on pg_authid (cost=1.25..11.02 rows=1 width=0) (actual time=0.062..0.062 rows=1 loops=14)
Index Cond: (rolname = 'gnumed_v10'::name)
Filter: ((NOT rolcanlogin) AND ($13 = ANY ((subplan))))
InitPlan
-> Seq Scan on pg_authid (cost=0.00..1.25 rows=1 width=4) (actual time=0.011..0.015 rows=1 loops=14)
Filter: (rolcanlogin AND (rolname = $1))
SubPlan
-> Seq Scan on pg_auth_members (cost=0.00..1.49 rows=4 width=4) (actual time=0.015..0.021 rows=4 loops=14)
Filter: (roleid = $3)
-> Result (cost=11.02..11.03 rows=1 width=0) (actual time=0.060..0.061 rows=1 loops=14)
InitPlan
-> Index Scan using pg_authid_rolname_index on pg_authid (cost=1.25..11.02 rows=1 width=0) (actual time=0.056..0.056 rows=1 loops=14)
Index Cond: (rolname = 'gm-logins'::name)
Filter: ((NOT rolcanlogin) AND ($16 = ANY ((subplan))))
InitPlan
-> Seq Scan on pg_authid (cost=0.00..1.25 rows=1 width=4) (actual time=0.010..0.013 rows=1 loops=14)
Filter: (rolcanlogin AND (rolname = $1))
SubPlan
-> Seq Scan on pg_auth_members (cost=0.00..1.49 rows=4 width=4) (actual time=0.005..0.021 rows=5 loops=14)
Filter: (roleid = $3)
-> Subquery Scan v_staff (cost=15.73..4240.07 rows=1 width=128) (actual time=29.500..70.712 rows=1 loops=14)
-> Nested Loop (cost=15.73..4240.04 rows=1 width=279) (actual time=29.492..70.701 rows=1 loops=14)
Join Filter: (s.fk_identity = i.pk)
-> Nested Loop (cost=0.00..2.31 rows=1 width=143) (actual time=0.016..0.048 rows=1 loops=14)
Join Filter: (s.fk_role = sr.pk)
-> Seq Scan on staff s (cost=0.00..1.06 rows=1 width=131) (actual time=0.007..0.008 rows=1 loops=14)
Filter: (pk = $0)
-> Seq Scan on staff_role sr (cost=0.00..1.11 rows=11 width=16) (actual time=0.003..0.016 rows=11 loops=14)
-> Hash Join (cost=15.73..4210.50 rows=207 width=120) (actual time=0.562..70.059 rows=209 loops=14)
Hash Cond: (n.id_identity = i.pk)
-> Seq Scan on names n (cost=0.00..4.27 rows=210 width=29) (actual time=0.009..0.356 rows=211 loops=14)
Filter: active
-> Hash (cost=13.12..13.12 rows=209 width=95) (actual time=0.925..0.925 rows=209 loops=1)
-> Seq Scan on identity i (cost=0.00..13.12 rows=209 width=95) (actual time=0.012..0.554 rows=209 loops=1)
Filter: ((deleted IS FALSE) AND (deceased IS NULL))
SubPlan
-> Nested Loop (cost=0.00..9.86 rows=1 width=8) (never executed)
-> Seq Scan on marital_status ms1 (cost=0.00..1.07 rows=1 width=8) (never executed)
Filter: (pk = $9)
-> Index Scan using identity_pkey on identity i1 (cost=0.00..8.27 rows=1 width=0) (never executed)
Index Cond: (i1.pk = $10)
-> Nested Loop (cost=0.00..9.35 rows=1 width=8) (never executed)
-> Seq Scan on marital_status ms (cost=0.00..1.07 rows=1 width=8) (never executed)
Filter: (pk = $9)
-> Index Scan using identity_pkey on identity i1 (cost=0.00..8.27 rows=1 width=0) (never executed)
Index Cond: (i1.pk = $10)
SubPlan
-> Result (cost=22.06..22.07 rows=1 width=0) (actual time=0.144..0.145 rows=1 loops=14)
InitPlan
-> Result (cost=11.02..11.03 rows=1 width=0) (actual time=0.070..0.071 rows=1 loops=14)
InitPlan
-> Index Scan using pg_authid_rolname_index on pg_authid (cost=1.25..11.02 rows=1 width=0) (actual time=0.064..0.064 rows=1 loops=14)
Index Cond: (rolname = 'gnumed_v10'::name)
Filter: ((NOT rolcanlogin) AND ($2 = ANY ((subplan))))
InitPlan
-> Seq Scan on pg_authid (cost=0.00..1.25 rows=1 width=4) (actual time=0.011..0.015 rows=1 loops=14)
Filter: (rolcanlogin AND (rolname = $1))
SubPlan
-> Seq Scan on pg_auth_members (cost=0.00..1.49 rows=4 width=4) (actual time=0.016..0.022 rows=4 loops=14)
Filter: (roleid = $3)
-> Result (cost=11.02..11.03 rows=1 width=0) (actual time=0.062..0.063 rows=1 loops=14)
InitPlan
-> Index Scan using pg_authid_rolname_index on pg_authid (cost=1.25..11.02 rows=1 width=0) (actual time=0.057..0.057 rows=1 loops=14)
Index Cond: (rolname = 'gm-logins'::name)
Filter: ((NOT rolcanlogin) AND ($6 = ANY ((subplan))))
InitPlan
-> Seq Scan on pg_authid (cost=0.00..1.25 rows=1 width=4) (actual time=0.010..0.013 rows=1 loops=14)
Filter: (rolcanlogin AND (rolname = $1))
SubPlan
-> Seq Scan on pg_auth_members (cost=0.00..1.49 rows=4 width=4) (actual time=0.006..0.022 rows=5 loops=14)
Filter: (roleid = $3)
Total runtime: 2041.314 ms
(140 Zeilen)

count
-------
14
(1 Zeile)

ROLLBACK

(the count is simply there to verify the view selects the
same number of rows as I am expecting from the base table,
the rollback is there because I have been experimenting
with additional indices)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#9Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Tom Lane (#7)
Re: performance advice needed: join vs explicit subselect

On Tue, Jan 27, 2009 at 04:42:02PM -0500, Tom Lane wrote:

What does EXPLAIN ANALYZE say about it? Also, what is the use-case
you are concerned about --- selecting the whole view contents, or
selecting WHERE something-or-other?

Oh, and the use case is to select all the test_results which
belong to a certain patient:

where pk_patient = <scalar>

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Karsten Hilbert (#8)
Re: performance advice needed: join vs explicit subselect

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:

the explain analyze of which is (I've actually gotten it to
work better in the meantime as you can see):

Looks like most of the problem is in the subquery scans on v_staff,
which seems to be a rather expensive view :-(. Maybe you can
simplify that a bit.

regards, tom lane

#11Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Tom Lane (#10)
Re: performance advice needed: join vs explicit subselect

On Tue, Jan 27, 2009 at 05:30:23PM -0500, Tom Lane wrote:

Karsten Hilbert <Karsten.Hilbert@gmx.net> writes:

the explain analyze of which is (I've actually gotten it to
work better in the meantime as you can see):

Looks like most of the problem is in the subquery scans on v_staff,
which seems to be a rather expensive view :-(. Maybe you can
simplify that a bit.

Thanks so much. I wasn't quite sure how to correlate the
seemingly expensive parts of the explain with the view/query
parts. Will experiment with that...

Well, going directly to the dem.staff table below
dem.v_staff forces me to forego the actual name of the staff
entry - but the alias will need to suffice ;-)

This brings down query time from 2000ms to 7ms.

Our doctors won't complain about slow lab data retrieval
anymore ... ;-)

Thanks,
Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346