Surprised by index choice for count(*)

Started by Rob Sargentabout 8 years ago4 messagesgeneral

robjsargent@gmail.com

about 8 years ago

Should I be? I would have thought the pk would have been chosen v.
function index?

explain analyse select count(*) from bc.segment s;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4428009.24..4428009.25 rows=1 width=8)
(actual time=14786.395..14786.395 rows=1 loops=1)
   -> Gather (cost=4428009.03..4428009.24 rows=2 width=8) (actual
time=14786.358..14786.386 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         -> Partial Aggregate (cost=4427009.03..4427009.04 rows=1
width=8) (actual time=14782.167..14782.167 rows=1 loops=3)
               -> Parallel Index Only Scan using fpv on segment s
(cost=0.57..4210177.14 rows=86732753 width=0) (actual
time=0.061..11352.855 rows=69386204 loops=3)
                     Heap Fetches: 1780
Planning time: 0.221 ms
Execution time: 14815.939 ms
(9 rows)

create or replace function public.pv(l bigint, e bigint, g bigint, o int)
returns float
as
$$
select 1.0*(g+e+o)/(l+e+g+o)::float;
$$
language sql
;

David G. Johnston

david.g.johnston@gmail.com

about 8 years ago

In reply to: Rob Sargent (#1)

Re: Surprised by index choice for count(*)

On Tue, May 1, 2018 at 8:46 AM, Rob Sargent <robjsargent@gmail.com> wrote:

Should I be? I would have thought the pk would have been chosen v.
function index?
Indexes:
"segment_pkey" PRIMARY KEY, btree (id)
"useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
"fpv" btree (pv(events_less, events_equal, events_greater, 0))
"segment_markerset_id_probandset_id_idx" btree (markerset_id,
probandset_id)

Haven't looked at the code but among those 4 indexes the one that was
chosen is the one that comes first alphabetically. Ideally it would avoid
multi-column indexes since they are larger; and I believe that a float is
smaller than a uuid so not only alphabetically but the fpv index wins on
size too.

That the index is functional is immaterial here.

David J.

Tom Lane

tgl@sss.pgh.pa.us

about 8 years ago

In reply to: Rob Sargent (#1)

Re: Surprised by index choice for count(*)

Rob Sargent <robjsargent@gmail.com> writes:

Should I be? I would have thought the pk would have been chosen v.
function index?

If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size. So scanning it is a lot cheaper, at least according to
the planner's cost model.

regards, tom lane

Rob Sargent

robjsargent@gmail.com

about 8 years ago

In reply to: Tom Lane (#3)

Re: Surprised by index choice for count(*)

Thank you both. Simple, as expected. And I’m easily surprised.
Version 10 (perhaps obviously) for those scoring at home.

Show quoted text

On May 1, 2018, at 10:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Rob Sargent <robjsargent@gmail.com> writes:

Should I be? I would have thought the pk would have been chosen v.
function index?

If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size. So scanning it is a lot cheaper, at least according to
the planner's cost model.

regards, tom lane