Surprised by index choice for count(*)
Should I be? I would have thought the pk would have been chosen v.
function index?
explain analyse select count(*) from bc.segment s;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4428009.24..4428009.25 rows=1 width=8)
(actual time=14786.395..14786.395 rows=1 loops=1)
-> Gather (cost=4428009.03..4428009.24 rows=2 width=8) (actual
time=14786.358..14786.386 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=4427009.03..4427009.04 rows=1
width=8) (actual time=14782.167..14782.167 rows=1 loops=3)
-> Parallel Index Only Scan using fpv on segment s
(cost=0.57..4210177.14 rows=86732753 width=0) (actual
time=0.061..11352.855 rows=69386204 loops=3)
Heap Fetches: 1780
Planning time: 0.221 ms
Execution time: 14815.939 ms
(9 rows)
\d bc.segment
Table "bc.segment"
Column | Type | Modifiers
----------------+---------+--------------------
id | uuid | not null
chrom | integer | not null
markerset_id | uuid | not null
probandset_id | uuid | not null
startbase | integer | not null
endbase | integer | not null
firstmarker | integer | not null
lastmarker | integer | not null
events_less | bigint | not null default 0
events_equal | bigint | not null default 0
events_greater | bigint | not null default 0
Indexes:
"segment_pkey" PRIMARY KEY, btree (id)
"useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
"fpv" btree (pv(events_less, events_equal, events_greater, 0))
"segment_markerset_id_probandset_id_idx" btree (markerset_id,
probandset_id)
create or replace function public.pv(l bigint, e bigint, g bigint, o int)
returns float
as
$$
select 1.0*(g+e+o)/(l+e+g+o)::float;
$$
language sql
;
On Tue, May 1, 2018 at 8:46 AM, Rob Sargent <robjsargent@gmail.com> wrote:
Should I be? I would have thought the pk would have been chosen v.
function index?
Indexes:
"segment_pkey" PRIMARY KEY, btree (id)
"useg" UNIQUE, btree (probandset_id, chrom, startbase, endbase)
"fpv" btree (pv(events_less, events_equal, events_greater, 0))
"segment_markerset_id_probandset_id_idx" btree (markerset_id,
probandset_id)
Haven't looked at the code but among those 4 indexes the one that was
chosen is the one that comes first alphabetically. Ideally it would avoid
multi-column indexes since they are larger; and I believe that a float is
smaller than a uuid so not only alphabetically but the fpv index wins on
size too.
That the index is functional is immaterial here.
David J.
Rob Sargent <robjsargent@gmail.com> writes:
Should I be? I would have thought the pk would have been chosen v.
function index?
If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size. So scanning it is a lot cheaper, at least according to
the planner's cost model.
regards, tom lane
Thank you both. Simple, as expected. And I’m easily surprised.
Version 10 (perhaps obviously) for those scoring at home.
Show quoted text
On May 1, 2018, at 10:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Rob Sargent <robjsargent@gmail.com> writes:
Should I be? I would have thought the pk would have been chosen v.
function index?If I'm reading this correctly, the PK index contains uuids while
the fpv index contains float4s, meaning the latter is probably half
the size. So scanning it is a lot cheaper, at least according to
the planner's cost model.regards, tom lane