bloat indexes - opinion
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg
Index: ix_filter_by_tree
Times_used: 1018082183
Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real
size is 2TB
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
What is the real impact of a bloat index? If I reindex it, queries will be
faster?
Thanks
Patrick
On 02/21/2017 01:44 PM, Patrick B wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB
How do you know one number is right and the other is wrong?
Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is the real impact of a bloat index? If I reindex it, queries will
be faster?Thanks
Patrick
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBHow do you know one number is right and the other is wrong?
1. on that table (seg) i store binary data. It is impossible to have only
18GB of it.
2.
SELECT schema_name,
pg_size_pretty(sum(table_size)::bigint),
(sum(table_size) / pg_database_size(current_database())) * 100
FROM (
SELECT pg_catalog.pg_namespace.nspname as schema_name,
pg_relation_size(pg_catalog.pg_class.oid) as table_size
FROM pg_catalog.pg_class
JOIN pg_catalog.pg_namespace ON relnamespace =
pg_catalog.pg_namespace.oid
) t
GROUP BY schema_name
ORDER BY schema_name
pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg table.
Show quoted text
Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.
html#FUNCTIONS-ADMIN-DBOBJECTIndex_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is the real impact of a bloat index? If I reindex it, queries will
be faster?Thanks
Patrick--
Adrian Klaver
adrian.klaver@aklaver.com
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 01:44 PM, Patrick B wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBHow do you know one number is right and the other is wrong?
1. on that table (seg) i store binary data. It is impossible to have only
18GB of it.
2.SELECT schema_name,
pg_size_pretty(sum(table_size)::bigint),
(sum(table_size) / pg_database_size(current_database())) * 100
FROM (
SELECT pg_catalog.pg_namespace.nspname as schema_name,
pg_relation_size(pg_catalog.pg_class.oid) as table_size
FROM pg_catalog.pg_class
JOIN pg_catalog.pg_namespace ON relnamespace =
pg_catalog.pg_namespace.oid) t
GROUP BY schema_name
ORDER BY schema_name
pg_toast 2706 GB 82.62112838877240860000 <-- this belongs to the seg
table.Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.h
tml#FUNCTIONS-ADMIN-DBOBJECTIndex_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is the real impact of a bloat index? If I reindex it, queries will
be faster?Thanks
Patrick
I ran the query before and after the reindex, and it seems it did not help
on performance.
*The query I used:*
explain analyze select * from seg where full_path = '/userfile/123';
*Before reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87 rows=215
width=8) (actual time=0.047..0.047 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.059 ms
(3 rows)
*After reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83 rows=220
width=8) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.036 ms
(3 rows)
Note that the '*cost*' is pretty much the same.
*My question is:*
If I have a bloat index. Why do I need to reindex it if I got none
performance improvements?
Cheers
Patrick
On 02/21/2017 03:41 PM, Patrick B wrote:
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com
<mailto:patrickbakerbr@gmail.com>>:2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>>:On 02/21/2017 01:44 PM, Patrick B wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TBHow do you know one number is right and the other is wrong?
1. on that table (seg) i store binary data. It is impossible to have
only 18GB of it.
2.SELECT schema_name,
pg_size_pretty(sum(table_size)::bigint),
(sum(table_size) /
pg_database_size(current_database())) * 100FROM (
SELECT pg_catalog.pg_namespace.nspname as schema_name,
pg_relation_size(pg_catalog.pg_class.oid) as table_size
FROM pg_catalog.pg_class
JOIN pg_catalog.pg_namespace ON relnamespace =
pg_catalog.pg_namespace.oid) t
GROUP BY schema_name
ORDER BY schema_name
pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg
table.Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
<https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT>Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is the real impact of a bloat index? If I reindex it, queries will
be faster?Thanks
PatrickI ran the query before and after the reindex, and it seems it did not
help on performance.*The query I used:*
explain analyze select * from seg where full_path = '/userfile/123';
The table schema would be useful.
*Before reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87
rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.059 ms
(3 rows)*After reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83
rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.036 ms
(3 rows)
Not showing the complete explain analyze makes the above not all that enlightening.
Note that the '*/cost/*' is pretty much the same.
*My question is:*
If I have a bloat index. Why do I need to reindex it if I got none
performance improvements?
Because it is an indication that you may not have index bloat?
Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of a problem.
Might be worth taking a look at:
https://www.postgresql.org/docs/9.6/static/monitoring-stats.html#PG-STAT-ALL-TABLES-VIEW
Cheers
Patrick
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB Index_size: 17 GB Num_writes 16245023 Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)
What is this from? If you think the table size reported should include
toast, then change it to do that, or request the author of whatever-this-is
to make that change.
What indication is there that the index is bloated? If the
meat-and-potatoes of a table is held in toast, then wouldn't you expect the
size of the table and the size of the index to be about the same?
Cheers,
Jeff
2017-02-22 13:10 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com>:
On 02/21/2017 03:41 PM, Patrick B wrote:
2017-02-22 11:11 GMT+13:00 Patrick B <patrickbakerbr@gmail.com
<mailto:patrickbakerbr@gmail.com>>:2017-02-22 10:59 GMT+13:00 Adrian Klaver <adrian.klaver@aklaver.com
<mailto:adrian.klaver@aklaver.com>>:On 02/21/2017 01:44 PM, Patrick B wrote:
Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly onpg_toast table.
Its real size is 2TB
How do you know one number is right and the other is wrong?
1. on that table (seg) i store binary data. It is impossible to have
only 18GB of it.
2.SELECT schema_name,
pg_size_pretty(sum(table_size)::bigint),
(sum(table_size) /
pg_database_size(current_database())) * 100FROM (
SELECT pg_catalog.pg_namespace.nspname as schema_name,
pg_relation_size(pg_catalog.pg_class.oid) as
table_size
FROM pg_catalog.pg_class
JOIN pg_catalog.pg_namespace ON relnamespace =
pg_catalog.pg_namespace.oid) t
GROUP BY schema_name
ORDER BY schema_name
pg_toast2706 GB82.62112838877240860000 <-- this belongs to the seg
table.Have you looked at the functions here?:
https://www.postgresql.org/docs/9.6/static/functions-admin.html#FUNCTIONS-ADMIN-DBOBJECT
admin.html#FUNCTIONS-ADMIN-DBOBJECT>
Index_size: 17 GB
Num_writes 16245023
Index definition: CREATE INDEX ix_filter_by_tree ON segUSING btree
(full_path varchar_pattern_ops) WHERE (full_path IS NOT
NULL)
What is the real impact of a bloat index? If I reindex it,
queries will
be faster?
Thanks
PatrickI ran the query before and after the reindex, and it seems it did not
help on performance.*The query I used:*
explain analyze select * from seg where full_path = '/userfile/123';
The table schema would be useful.
Why? If i just wanna know how bloat indexes work?
*Before reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..144.87
rows=215 width=8) (actual time=0.047..0.047 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.059 ms
(3 rows)*After reindex:*
Index Scan using ix_filter_by_tree on seg (cost=0.00..141.83
rows=220 width=8) (actual time=0.021..0.021 rows=1 loops=1)
Index Cond: (full_path = '/userfile/123')
Total runtime: 0.036 ms
(3 rows)Not showing the complete explain analyze makes the above not all that
enlightening.
I am showing the whole explain analyze mate.
Note that the '*/cost/*' is pretty much the same.
*My question is:*
If I have a bloat index. Why do I need to reindex it if I got none
performance improvements?Because it is an indication that you may not have index bloat?
Queries for bloat indexes show that I do have bloat indexes. Also, it is
really simple to look.
\d tablename
The table is 18GB big and the index is 17GB big.. this clearly shows me
bloated index.
Not sure a runtime of 0.036 to 0.036 ms over a 2TB table is symptomatic of
a problem.Might be worth taking a look at:
https://www.postgresql.org/docs/9.6/static/monitoring-
stats.html#PG-STAT-ALL-TABLES-VIEW<adrian.klaver@aklaver.com>
Patrick.
2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com>
wrote:Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB Index_size: 17 GB Num_writes 16245023 Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is this from? If you think the table size reported should include
toast, then change it to do that, or request the author of whatever-this-is
to make that change.What indication is there that the index is bloated? If the
meat-and-potatoes of a table is held in toast, then wouldn't you expect the
size of the table and the size of the index to be about the same?Cheers,
Jeff
I am running queries to see bloat indexes [1]https://wiki.postgresql.org/wiki/Index_Maintenance. Also i understand an index
can't have same size as table.
If you have any other table that can prove the index is indeed bloat,
please let me know and i will be happy to post results here.
[1]: https://wiki.postgresql.org/wiki/Index_Maintenance
Patrick
2017-02-25 17:53 GMT+13:00 Patrick B <patrickbakerbr@gmail.com>:
2017-02-23 11:46 GMT+13:00 Jeff Janes <jeff.janes@gmail.com>:
On Tue, Feb 21, 2017 at 1:44 PM, Patrick B <patrickbakerbr@gmail.com>
wrote:Hi guys,
I've got a lot of bloat indexes on my 4TB database.
Let's take this example:
Table: seg Index: ix_filter_by_tree Times_used: 1018082183 Table_size: 18 GB -- wrong. The table is mostly on pg_toast table. Its real size is 2TB Index_size: 17 GB Num_writes 16245023 Index definition: CREATE INDEX ix_filter_by_tree ON seg USING btree (full_path varchar_pattern_ops) WHERE (full_path IS NOT NULL)What is this from? If you think the table size reported should include
toast, then change it to do that, or request the author of whatever-this-is
to make that change.What indication is there that the index is bloated? If the
meat-and-potatoes of a table is held in toast, then wouldn't you expect the
size of the table and the size of the index to be about the same?Cheers,
Jeff
I am running queries to see bloat indexes [1]. Also i understand an index
can't have same size as table.
If you have any other table that can prove the index is indeed bloat,
please let me know and i will be happy to post results here.[1] https://wiki.postgresql.org/wiki/Index_Maintenance
Patrick
FYI - using this query to see the index size:
SELECT idstat.schemaname AS schema,
idstat.relname AS table_name,
indexrelname AS index_name,
idstat.idx_scan AS times_used,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
'.' || quote_ident(idstat.relname))) AS table_size,
pg_size_pretty(pg_relation_size(quote_ident(idstat.schemaname) ||
'.' || quote_ident(indexrelname))) AS index_size,
n_tup_upd + n_tup_ins + n_tup_del as num_writes,
indexdef AS definition
FROM pg_stat_user_indexes AS idstat
JOIN pg_indexes ON indexrelname = indexname
JOIN pg_stat_user_tables AS tabstat ON idstat.relname = tabstat.relname
WHERE indexrelname = 'index_name';