Matching indexe for timestamp

Started by Jobabout 9 years ago5 messagesgeneral

Job@colliniconsulting.it

about 9 years ago

Hello,

on this table:

Table "public.gruorari_tmp"
Column | Type | Modifiers
-----------+------------------------+--------------------------------------------------------------------------
id | numeric(1000,1) | not null default function_get_next_sequence('gruorari_tmp_id_seq'::text)
idgrucate | numeric(1000,1) |
dalle | time without time zone |
alle | time without time zone |
gg_sett | integer |
azione | character varying |
Indexes:
"keygruorari_tmp" PRIMARY KEY, btree (id)
"gruorari_tmp_alle_idx" btree (alle)
"gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
"gruorari_tmp_dalle_idx" btree (dalle)
"gruorari_tmp_gg_sett_idx" btree (gg_sett)
"gruorari_tmp_idgrucate_idx" btree (idgrucate)

i have a specific condition (i report example value):
"and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time between gruorari_tmp.dalle and gruorari_tmp.alle ) )"

But in the query planner, at that point, Postgresql 9.6.1 seems not to use any index (single on dalle / alle field and combindex index on dalle+alle) but it use seqscan:

Seq Scan on gruorari_tmp (cost=0.00..5.90 rows=290 width=68) (actual time=0.014..0.062 rows=290 loops=1)
-> Hash (cost=164.06..164.06 rows=1 width=29) (actual time=0.770..0.770 rows=1 loops=1)
And it seems to be the main cost for the whole query.
Which kind of index should i use for that condition/fields?

Thank you!

Adrian Klaver

adrian.klaver@aklaver.com

about 9 years ago

In reply to: Job (#1)

Re: Matching indexe for timestamp

On 01/09/2017 01:58 PM, Job wrote:

Hello,

on this table:

Table "public.gruorari_tmp"
Column | Type |
Modifiers
-----------+------------------------+--------------------------------------------------------------------------
id | numeric(1000,1) | not null default
function_get_next_sequence('gruorari_tmp_id_seq'::text)
idgrucate | numeric(1000,1) |
dalle | time without time zone |
alle | time without time zone |
gg_sett | integer |
azione | character varying |
Indexes:
"keygruorari_tmp" PRIMARY KEY, btree (id)
"gruorari_tmp_alle_idx" btree (alle)
"gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
"gruorari_tmp_dalle_idx" btree (dalle)
"gruorari_tmp_gg_sett_idx" btree (gg_sett)
"gruorari_tmp_idgrucate_idx" btree (idgrucate)

i have a specific condition (i report example value):
"and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time
between gruorari_tmp.dalle and gruorari_tmp.alle ) )"

But in the query planner, at that point, Postgresql 9.6.1 seems not to
use any index (single on dalle / alle field and combindex index on
dalle+alle) but it use seqscan:

Seq Scan on gruorari_tmp (cost=0.00..5.90 rows=290 width=68) (actual
time=0.014..0.062 rows=290 loops=1)
-> Hash (cost=164.06..164.06 rows=1
width=29) (actual time=0.770..0.770 rows=1 loops=1)
And it seems to be the main cost for the whole query.
Which kind of index should i use for that condition/fields?

As before, giving just snippets of the information is going to prolong
or prevent arriving at an answer. So:

1) What is the complete query?

2) What is the complete EXPLAIN ANALYZE?

Thank you!

/F

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Adrian Klaver

adrian.klaver@aklaver.com

about 9 years ago

In reply to: Job (#1)

Re: R: Matching indexe for timestamp

On 01/09/2017 03:38 PM, Job wrote:

Please also reply to list. I do not have time at the moment to go
through this, someone else on the list might.

Hi Adrian,

You are right; here is the query and the planner.
I think indexes are not used at all!

/F

EXPLAIN ANALYZE select
webrecord.dominio
from webrecord
left join grucategorie on grucategorie.codcategoria=webrecord.categoria and grucategorie.codgruppo='f50147_01'
left join grulist on grulist.nome=webrecord.dominio and grulist.codgruppo='f50147_01' and grulist.stato in (1)
left join firewall_geo_reject on firewall_geo_reject.country=webrecord.country and firewall_geo_reject.codgruppo='f50147_01'
left join gruorari_tmp on gruorari_tmp.idgrucate=grucategorie.id
where dominio='PATTERN'
and ( grulist.stato=1 OR grucategorie.codcategoria is not null OR firewall_geo_reject.country is not null )
and ( gruorari_tmp.id is null or ( 1 = gg_sett and '17:23:00'::time between gruorari_tmp.dalle and gruorari_tmp.alle ) )
and NOT EXISTS (select 1 from grulist where stato=2 and codgruppo='f50147_01' and nome='PATTERN')
limit 1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=130.51..172.16 rows=1 width=14) (actual time=436.537..436.538 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Index Only Scan using aaa_idx on grulist grulist_1 (cost=0.29..80.31 rows=1 width=0) (actual time=0.062..0.062 rows=0 loops=1)
Index Cond: ((stato = '2'::numeric) AND (codgruppo = 'f50147_01'::text) AND (nome = 'PATTERN'::text))
Heap Fetches: 0
-> Result (cost=50.21..303115.67 rows=7277 width=14) (actual time=436.534..436.534 rows=1 loops=1)
One-Time Filter: (NOT $0)
-> Nested Loop Left Join (cost=50.21..303115.67 rows=7277 width=14) (actual time=436.463..436.463 rows=1 loops=1)
Join Filter: ((grulist.nome)::text = (webrecord.dominio)::text)
Filter: ((grulist.stato = '1'::numeric) OR (grucategorie.codcategoria IS NOT NULL) OR (firewall_geo_reject.country IS NOT NULL))
-> Nested Loop Left Join (cost=49.92..302908.01 rows=7277 width=25) (actual time=436.347..436.347 rows=1 loops=1)
Join Filter: ((firewall_geo_reject.country)::text = (webrecord.country)::text)
Rows Removed by Join Filter: 13
-> Nested Loop Left Join (cost=49.92..300318.08 rows=7277 width=46) (actual time=431.407..431.407 rows=1 loops=1)
Join Filter: ((grucategorie.codcategoria)::text = (webrecord.categoria)::text)
Rows Removed by Join Filter: 18
Filter: ((gruorari_tmp.id IS NULL) OR ((1 = gruorari_tmp.gg_sett) AND ('17:23:00'::time without time zone >= gruorari_tmp.dalle) AND ('17:23:00'::time without time zone <= gruorari_tmp.alle)))
-> Seq Scan on webrecord (cost=0.00..249584.12 rows=159614 width=70) (actual time=430.696..430.696 rows=1 loops=1)
Filter: ((dominio)::text = 'PATTERN'::text)
Rows Removed by Filter: 596858
-> Materialize (cost=49.92..455.58 rows=14 width=35) (actual time=0.663..0.689 rows=19 loops=1)
-> Hash Left Join (cost=49.92..455.51 rows=14 width=35) (actual time=0.639..0.656 rows=19 loops=1)
Hash Cond: (grucategorie.id = gruorari_tmp.idgrucate)
-> Bitmap Heap Scan on grucategorie (cost=40.40..445.70 rows=14 width=17) (actual time=0.142..0.143 rows=19 loops=1)
Recheck Cond: ((codgruppo)::text = 'f50147_01'::text)
Heap Blocks: exact=5
-> Bitmap Index Scan on grucategorie_codgruppo_idx (cost=0.00..40.39 rows=14 width=0) (actual time=0.084..0.084 rows=83 loops=1)
Index Cond: ((codgruppo)::text = 'f50147_01'::text)
-> Hash (cost=5.90..5.90 rows=290 width=36) (actual time=0.381..0.381 rows=290 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 29kB
-> Seq Scan on gruorari_tmp (cost=0.00..5.90 rows=290 width=36) (actual time=0.023..0.176 rows=290 loops=1)
-> Materialize (cost=0.00..297.73 rows=21 width=3) (actual time=0.151..4.928 rows=13 loops=1)
-> Seq Scan on firewall_geo_reject (cost=0.00..297.62 rows=21 width=3) (actual time=0.141..4.912 rows=13 loops=1)
Filter: ((codgruppo)::text = 'f50147_01'::text)
Rows Removed by Filter: 15717
-> Materialize (cost=0.29..80.31 rows=1 width=19) (actual time=0.109..0.109 rows=0 loops=1)
-> Index Scan using bbb_idx on grulist (cost=0.29..80.31 rows=1 width=19) (actual time=0.095..0.095 rows=0 loops=1)
Index Cond: (((codgruppo)::text = 'f50147_01'::text) AND (stato = '1'::numeric))
Filter: ((nome)::text = 'PATTERN'::text)
Rows Removed by Filter: 1
Planning time: 14.996 ms
Execution time: 436.840 ms

--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Import Notes

Reply to msg id not found: 88EF58F000EC4B4684700C2AA3A73D7A08054EACC2A3@W2008DC01.ColliniConsulting.lan

David G. Johnston

david.g.johnston@gmail.com

about 9 years ago

In reply to: Job (#1)

Re: Matching indexe for timestamp

On Mon, Jan 9, 2017 at 2:58 PM, Job <Job@colliniconsulting.it> wrote:

But in the query planner, at that point, Postgresql 9.6.1 seems not to use
any index (single on dalle / alle field and combindex index on dalle+alle)
but it use seqscan:

Seq Scan on gruorari_tmp (cost=0.00..5.90 rows=290 width=68) (actual
time=0.014..0.062 rows=290 loops=1)
-> Hash (cost=164.06..164.06 rows=1
width=29) (actual time=0.770..0.770 rows=1 loops=1)

Given a query with expected "rows=290" I am not surprised that it would
simply scan the entire relation. Especially since you have four columns in
your where clause and so any one index would be insufficient.

You sound as if you believe that any query that doesn't use an index is
flawed. That is not the case.

David J.

Vitaly Burovoy

vitaly.burovoy@gmail.com

about 9 years ago

In reply to: Job (#1)

Re: Matching indexe for timestamp

On 1/9/17, Job <Job@colliniconsulting.it> wrote:

Hello,

on this table:

Table "public.gruorari_tmp"
Column | Type |
Modifiers
-----------+------------------------+--------------------------------------------------------------------------
id | numeric(1000,1) | not null default
function_get_next_sequence('gruorari_tmp_id_seq'::text)
idgrucate | numeric(1000,1) |
dalle | time without time zone |
alle | time without time zone |
gg_sett | integer |
azione | character varying |
Indexes:
"keygruorari_tmp" PRIMARY KEY, btree (id)
"gruorari_tmp_alle_idx" btree (alle)
"gruorari_tmp_dalle_alle_idx" btree (dalle, alle)
"gruorari_tmp_dalle_idx" btree (dalle)
"gruorari_tmp_gg_sett_idx" btree (gg_sett)
"gruorari_tmp_idgrucate_idx" btree (idgrucate)

i have a specific condition (i report example value):
"and ( gruorari_tmp.id is null or ( 2 = gg_sett and '16:00:00'::time between
gruorari_tmp.dalle and gruorari_tmp.alle ) )"

But in the query planner, at that point, Postgresql 9.6.1 seems not to use
any index (single on dalle / alle field and combindex index on dalle+alle)

Of course. There is no reason to use any index because the condition
"gruorari_tmp.id is null" is not covered by any of them. To find such
rows you have to scan all table (because there can be with any
"alle".."dalle" values), that's why Postgres uses SeqScan.

but it use seqscan:

Seq Scan on gruorari_tmp (cost=0.00..5.90 rows=290 width=68) (actual
time=0.014..0.062 rows=290 loops=1)
-> Hash (cost=164.06..164.06 rows=1
width=29) (actual time=0.770..0.770 rows=1 loops=1)
And it seems to be the main cost for the whole query.

Which kind of index should i use for that condition/fields?

for _big_ tables Postgres can use "bitmap OR" node if there are two
indexes which can be used.
So create index for "gruorari_tmp.id is null" and gin/gist "(gg_sett,
timerange(gruorari_tmp.dalle, gruorari_tmp.alle))" (see below).

If you use a condition like "<value> between colA and colB", some sort
of a "timerange" is the best case. Unfortunately there is no such
type, but it is easy to create it by an example[1]https://www.postgresql.org/docs/9.6/static/rangetypes.html#RANGETYPES-DEFINING.
Then you can use btree_gin or btree_gist (depending on a base index
type) extension to use an ordinary type column(s) with range type
column(s).

P.S.: Postgres can not to use indexes even if they are right because
according to a statistics SeqScan will take similar access time.

[1]: https://www.postgresql.org/docs/9.6/static/rangetypes.html#RANGETYPES-DEFINING

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general