Help on Index only scan

Started by Ertan Küçükoğlualmost 9 years ago5 messagesgeneral

ertan.kucukoglu@1nar.com.tr

almost 9 years ago

Hello,

robox=# select count(autoinc) from updates;
count
-------
2003
(1 row)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
Sort (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
Sort Key: autoinc DESC
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on updates (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
Recheck Cond: (filename = 'Robox.exe'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on update_filename (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
Index Cond: (filename = 'Robox.exe'::text)
Planning time: 1.873 ms
Execution time: 0.076 ms
(10 rows)

I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Ertan Küçükoğlu (#1)

Re: Help on Index only scan

On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <
ertan.kucukoglu@1nar.com.tr> wrote:

Hello,

My table details:
robox=# \dS+ updates
Table "public.updates"
Column | Type | Modifiers
| Storage | Stats target | Description
---------------+---------+----------------------------------
----------------
---------+----------+--------------+-------------
autoinc | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain | |
filename | text |
| extended | |
dateofrelease | date |
| plain | |
fileversion | text |
| extended | |
afile | text |
| extended | |
filehash | text |
| extended | |
active | boolean |
| plain | |
Indexes:
"updates_pkey" PRIMARY KEY, btree (autoinc)
"update_filename" btree (filename)
"updates_autoinc" btree (autoinc DESC)
"updates_dateofrelease" btree (dateofrelease)
"updates_filename_dateofrelease" btree (filename, dateofrelease)

robox=# select count(autoinc) from updates;
count
-------
2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
autoinc | filename | fileversion
---------+----------------------------------+-------------
18 | Robox.exe | 1.0.1.218
19 | Robox.exe | 1.0.1.220
20 | Robox.exe | 1.0.1.220
21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for
it.
Below index names "update_filename" and "updates_autoinc" are added just
for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful
plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
QUERY PLAN
------------------------------------------------------------
----------------
--------------------------------------------------
Sort (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
Sort Key: autoinc DESC
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on updates (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
Recheck Cond: (filename = 'Robox.exe'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on update_filename (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
Index Cond: (filename = 'Robox.exe'::text)
Planning time: 1.873 ms
Execution time: 0.076 ms
(10 rows)

I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

*First, you do not need index "updates_autoinc", since autoinc is the
Primary Key, you are just duplicating the index.*

*As far as "Index only scan" , since the table only has 2003 rows, the
optimizer has determined it is faster just to*
*load all the rows into memory and then filter. If you really want to force
an index scan, then you would have to do*
*SET enable_seqscan = off; Before doing the query, however you are just
shooting yourself in the foot by doing that*

*as it will make the query slower.*

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Ertan Küçükoğlu

ertan.kucukoglu@1nar.com.tr

almost 9 years ago

In reply to: Melvin Davidson (#2)

Re: Help on Index only scan

On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:

On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <ertan.kucukoglu@1nar.com.tr> wrote:
Hello,

My table details:
robox=# \dS+ updates
Table "public.updates"
Column | Type | Modifiers
| Storage | Stats target | Description
---------------+---------+--------------------------------------------------
---------+----------+--------------+-------------
autoinc | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain | |
filename | text |
| extended | |
dateofrelease | date |
| plain | |
fileversion | text |
| extended | |
afile | text |
| extended | |
filehash | text |
| extended | |
active | boolean |
| plain | |
Indexes:
"updates_pkey" PRIMARY KEY, btree (autoinc)
"update_filename" btree (filename)
"updates_autoinc" btree (autoinc DESC)
"updates_dateofrelease" btree (dateofrelease)
"updates_filename_dateofrelease" btree (filename, dateofrelease)

robox=# select count(autoinc) from updates;
count
-------
2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
autoinc | filename | fileversion
---------+----------------------------------+-------------
18 | Robox.exe | 1.0.1.218
19 | Robox.exe | 1.0.1.220
20 | Robox.exe | 1.0.1.220
21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

I simply could not understand planner and cannot provide right index for it.
Below index names "update_filename" and "updates_autoinc" are added just for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
QUERY PLAN
----------------------------------------------------------------------------
--------------------------------------------------
Sort (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047 rows=3
loops=1)
Sort Key: autoinc DESC
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on updates (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
Recheck Cond: (filename = 'Robox.exe'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on update_filename (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
Index Cond: (filename = 'Robox.exe'::text)
Planning time: 1.873 ms
Execution time: 0.076 ms
(10 rows)

I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

First, you do not need index "updates_autoinc", since autoinc is the Primary Key, you are just duplicating the index.

Is that true even if that index is a descending one?

As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
load all the rows into memory and then filter. If you really want to force an index scan, then you would have to do
SET enable_seqscan = off; Before doing the query, however you are just shooting yourself in the foot by doing that
as it will make the query slower.

I will try to load up more dummy rows to overflow the work_mem and observe results.

Sorry, my question was misleading. I do not want to use "set enable_seqscan = off" I want to be sure that when necessary (record count increases) relevant index(es) will be used.

Obviously I still can't read query plan as I did not understand that operation is still in memory.

Just for complete information, this is 64bit PostgreSQL 9.6.4 on Windows 10, EDB binaries.

Thanks.
Ertan Küçükoğlu

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Ertan Küçükoğlu (#3)

Re: Help on Index only scan

On Sun, Aug 13, 2017 at 7:37 PM, Ertan Küçükoğlu <
ertan.kucukoglu@1nar.com.tr> wrote:

On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:

On Sun, Aug 13, 2017 at 5:59 PM, Ertan Küçükoğlu <
ertan.kucukoglu@1nar.com.tr> wrote:

Hello,

My table details:
robox=# \dS+ updates
Table "public.updates"
Column | Type | Modifiers
| Storage | Stats target | Description
---------------+---------+----------------------------------
----------------
---------+----------+--------------+-------------
autoinc | integer | not null default
nextval('updates_autoinc_seq'::regclass) | plain | |
filename | text |
| extended | |
dateofrelease | date |
| plain | |
fileversion | text |
| extended | |
afile | text |
| extended | |
filehash | text |
| extended | |
active | boolean |
| plain | |
Indexes:
"updates_pkey" PRIMARY KEY, btree (autoinc)
"update_filename" btree (filename)
"updates_autoinc" btree (autoinc DESC)
"updates_dateofrelease" btree (dateofrelease)
"updates_filename_dateofrelease" btree (filename, dateofrelease)

robox=# select count(autoinc) from updates;
count
-------
2003
(1 row)

robox=# select autoinc, filename, fileversion from updates limit 10;
autoinc | filename | fileversion
---------+----------------------------------+-------------
18 | Robox.exe | 1.0.1.218
19 | Robox.exe | 1.0.1.220
20 | Robox.exe | 1.0.1.220
21 | 8423bfc5a669864f9b66b6b15ce908b9 | 1.1.1.1
22 | 4fdabb0c7adbc5a89fbe679ce76ccef9 | 1.1.1.1
23 | f469d77bfa86c8917c7846c0f871137c | 1.1.1.1
24 | bc10af4c8789718a9ca6565ea14cb17d | 1.1.1.1
25 | d9f87ee46cdb41cd15c2f71ed599faf9 | 1.1.1.1
26 | 6f7428a5364aae1d5914a66cba3e6f3b | 1.1.1.1
27 | 66ec4cdb8d64ca1414f75c1fb9eaa518 | 1.1.1.1
(10 rows)

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe'
order
by autoinc desc;

I simply could not understand planner and cannot provide right index for
it.
Below index names "update_filename" and "updates_autoinc" are added just
for
the query that I would like to have a index only scan plan. I also failed
with following indexes
"autoinc desc, filename, fileversion"
"autoinc desc, filename"

First 3 rows in above select results are actual data. You will find that I
have inserted about 2000 rows of dummy data to have somewhat meaningful
plan
for the query.

Current planner result:
robox=# vacuum full;
VACUUM
robox=# explain analyze
robox-# select autoinc, fileversion
robox-# from updates
robox-# where filename = 'Robox.exe'
robox-# order by autoinc desc;
QUERY PLAN
------------------------------------------------------------
----------------
--------------------------------------------------
Sort (cost=12.79..12.79 rows=3 width=12) (actual time=0.047..0.047
rows=3
loops=1)
Sort Key: autoinc DESC
Sort Method: quicksort Memory: 25kB
-> Bitmap Heap Scan on updates (cost=4.30..12.76 rows=3 width=12)
(actual time=0.040..0.040 rows=3 loops=1)
Recheck Cond: (filename = 'Robox.exe'::text)
Heap Blocks: exact=1
-> Bitmap Index Scan on update_filename (cost=0.00..4.30 rows=3
width=0) (actual time=0.035..0.035 rows=3 loops=1)
Index Cond: (filename = 'Robox.exe'::text)
Planning time: 1.873 ms
Execution time: 0.076 ms
(10 rows)

I appreciate any help on having right index(es) as I simply failed myself.

Regards,
Ertan Küçükoğlu

*First, you do not need index "updates_autoinc", since autoinc is the

Primary Key, you are just duplicating the index.*

Is that true even if that index is a descending one?

*As far as "Index only scan" , since the table only has 2003 rows, the
optimizer has determined it is faster just to*
*load all the rows into memory and then filter. If you really want to
force an index scan, then you would have to do*
*SET enable_seqscan = off; Before doing the query, however you are just
shooting yourself in the foot by doing that*
*as it will make the query slower.*

I will try to load up more dummy rows to overflow the work_mem and observe
results.

Sorry, my question was misleading. I do not want to use "set
enable_seqscan = off" I want to be sure that when necessary (record count
increases) relevant index(es) will be used.

Obviously I still can't read query plan as I did not understand that
operation is still in memory.

Just for complete information, this is 64bit PostgreSQL 9.6.4 on Windows
10, EDB binaries.

Thanks.
Ertan Küçükoğlu

*>I will try to load up more dummy rows to overflow the work_mem and
observe results.*

*If you are going to do that, don't forget to run ANALYZE on your table
afterwards, as the optimizer users the statistics to determine the best
query plan.*

*I strongly recommend that you acquaint yourself with how the optimizer
works.*

*https://www.postgresql.org/docs/9.6/static/geqo.html
<https://www.postgresql.org/docs/9.6/static/geqo.html>*

*One final suggestion, it is not a good idea to create indexes on every
column. You should only create indexes for columns (or combinations*

*of columns) the will be used often in query WHERE clauses.*

*I also recommend you consider purchasing PostgreSQL High Performance*
*from either Amazon *

*https://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=Postgres+high+performance
<https://www.amazon.com/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=Postgres+high+performance>*

*or PACKT *

*https://www.packtpub.com/big-data-and-business-intelligence/postgresql-high-performance-cookbook
<https://www.packtpub.com/big-data-and-business-intelligence/postgresql-high-performance-cookbook>*

*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Tom Lane

tgl@sss.pgh.pa.us

almost 9 years ago

In reply to: Ertan Küçükoğlu (#3)

Re: Help on Index only scan

=?utf-8?Q?Ertan_K=C3=BC=C3=A7=C3=BCko=C4=9Flu?= <ertan.kucukoglu@1nar.com.tr> writes:

I want to have an index only scan for my below query:
select autoinc, fileversion from updates where filename = 'Robox.exe' order
by autoinc desc;

On 14 Aug 2017, at 01:15, Melvin Davidson <melvin6925@gmail.com> wrote:
As far as "Index only scan" , since the table only has 2003 rows, the optimizer has determined it is faster just to
load all the rows into memory and then filter.

Sorry, my question was misleading. I do not want to use "set enable_seqscan = off" I want to be sure that when necessary (record count increases) relevant index(es) will be used.

There's a considerable distance between "is the planner making appropriate
use of indexes" and "I insist on an index-only scan". The reason you're
not getting an index-only scan here is that that requires an index that
includes every column referenced in the query, which you don't have. At
minimum you'd need an index including all of autoinc, fileversion, and
filename to do this query with an IOS. If you want it to be particularly
efficient for this query then you'd need the index's column order to be
(filename, autoinc, fileversion) --- putting filename means the entries
satisfying WHERE will be clumped in the index, and putting autoinc second
means that a backwards scan on that portion of the index is enough to
produce the requested sort ordering without an explicit sort step.

Whether it's worth maintaining an index this specialized depends on how
much update traffic you have versus how often you want to do this
particular query. Often it's not worth the extra disk space and update
overhead to have such an index.

In any case, I wouldn't worry about it until you have an actual
performance problem. Trying to tell on toy data what the planner
will do with production-sized data is usually a losing game.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general