11.7. Indexes on Expressions
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:
Regarding the paragraph (_emphasis_ added):
Index expressions are relatively expensive to maintain, because the
derived expression(s) must be computed for _each row upon insertion and
whenever it is updated_.
Does the "it" in the above refer to the entire row or just the columns used
in the expression? Does updating any column of a row trigger an update to
all indexes with expressions?
Thanks in advance for your time and reply.
Cheers,
Chris Lowder
On Thu, Dec 16, 2021 at 04:57:37PM +0000, PG Doc comments form wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:Regarding the paragraph (_emphasis_ added):
Index expressions are relatively expensive to maintain, because the
derived expression(s) must be computed for _each row upon insertion and
whenever it is updated_.Does the "it" in the above refer to the entire row or just the columns used
So, here 'it' refers the "row" being updated. I am attaching a doc
patch which clarifies this.
in the expression? Does updating any column of a row trigger an update to
all indexes with expressions?
Well, that depends. If _no_ columns involved in any indexes are
changed, and the row is placed in the same page as the previous row (a
HOT update), then the indexes do not need to be updated. If any indexed
column changes, including those involved in expression indexes, or the
updated row isn't placed in the same page as the previous row, all new
index entries will need to be created. We don't really have the concept
of updating some indexes and not others --- we either update them all,
or update none of them --- this is because Postgres uses an MVCC system
of versioning.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
Attachments:
expression.difftext/x-diff; charset=us-asciiDownload+2-2
Thank you for that (speedy) explanation and patch Bruce! That all makes
perfect sense.
Cheers,
- Chris
Show quoted text
On December 16, 2021, Bruce Momjian <bruce@momjian.us> wrote:
On Thu, Dec 16, 2021 at 04:57:37PM +0000, PG Doc comments form wrote:
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:
Regarding the paragraph (_emphasis_ added):
Index expressions are relatively expensive to maintain, because
the
derived expression(s) must be computed for _each row upon insertion
and
whenever it is updated_.
Does the "it" in the above refer to the entire row or just thecolumns used
So, here 'it' refers the "row" being updated. I am attaching a doc
patch which clarifies this.in the expression? Does updating any column of a row trigger an
update to
all indexes with expressions?
Well, that depends. If _no_ columns involved in any indexes are
changed, and the row is placed in the same page as the previous row (a
HOT update), then the indexes do not need to be updated. If any
indexed
column changes, including those involved in expression indexes, or the
updated row isn't placed in the same page as the previous row, all new
index entries will need to be created. We don't really have the
concept
of updating some indexes and not others --- we either update them all,
or update none of them --- this is because Postgres uses an MVCC
system
of versioning.--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.comIf only the physical world exists, free will is an illusion.
On Fri, Dec 17, 2021 at 09:41:32AM +0000, Chris Lowder wrote:
Thank you for that (speedy) explanation and patch Bruce! That all makes perfect
sense.
There are actually two questions being addressed here:
1. In what cases are new index entries added, and for which indexes?
2. In what cases are index expressions evaluated?
For 1, I explained that for updates we either add new rows for all
indexes, or none of them. We create new index rows for non-HOT updates,
which happens if the updated row does not change indexed columns (or
columns involved in index expressions), and if the new row fits on the
same 8k heap page.
What I recently learned, at least for PG 10-14, is that the index
expression is only called for non-HOT updates, not HOT ones. Postgres
does not evaluate the index expression to determine if the indexed
expression has changed, but rather relies on column value comparisons.
I wrote the attached script which creates an expression index function
with a pg_sleep(1) call to determine if the function is called. If you
run the attached SQL script in psql, you will see that times of ~1
second happens only when there is a change in the colummn referenced by
the index expression or when the new row is stored on a new heap page
(ctid page number changes). The expression index function is not called
if the indexed column value does not change and remains on the same heap
page.
I am attaching an updated patch which mentions non-HOT updates for
expression indexes.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:
I am attaching an updated patch which mentions non-HOT updates for
expression indexes.
Thanks. I'm a bit concerned that before this patch the only place we
mentioned HOTin the docs is in the appendix acronym listing which points to
the readme file in the source code.
David J.
On Wed, Dec 22, 2021 at 02:40:12PM -0700, David G. Johnston wrote:
On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:
I am attaching an updated patch which mentions non-HOT updates for
expression indexes.Thanks. I'm a bit concerned that before this patch the only place we mentioned
HOTin the docs is in the appendix acronym listing which points to the readme
file in the source code.
Uh, I had a similar concern and found these cases:
monitoring.sgml: Number of rows updated (includes HOT updated rows)
monitoring.sgml: Number of rows HOT updated (i.e., with no separate index
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Dec 22, 2021 at 02:40:12PM -0700, David G. Johnston wrote:
On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:
I am attaching an updated patch which mentions non-HOT updates for
expression indexes.Thanks. I'm a bit concerned that before this patch the only place we
mentioned
HOTin the docs is in the appendix acronym listing which points to the
readme
file in the source code.
Uh, I had a similar concern and found these cases:
monitoring.sgml: Number of rows updated (includes HOT
updated rows)
monitoring.sgml: Number of rows HOT updated (i.e., with no
separate indexNot sure what else we can do --- it would be odd to explain it in the
expression index docs.
Found another one:
https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."
Make the acronym HOT in all these places take the reader to the acronym
page, and add a brief description there so they don't need to consume the
readme file?
David J.
On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:
On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.Found another one:
https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."Make the acronym HOT in all these places take the reader to the acronym page,
and add a brief description there so they don't need to consume the readme
file?
Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, Dec 22, 2021 at 05:09:53PM -0500, Bruce Momjian wrote:
On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:
On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.Found another one:
https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."Make the acronym HOT in all these places take the reader to the acronym page,
and add a brief description there so they don't need to consume the readme
file?Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.
I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
If only the physical world exists, free will is an illusion.
On Wed, Dec 22, 2021 at 3:09 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:
On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.Found another one:
https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."Make the acronym HOT in all these places take the reader to the acronym
page,
and add a brief description there so they don't need to consume the
readme
file?
Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.
I don't disagree, but it also doesn't seem so wrong that it is
unacceptable; and it is minimally invasive. I don't really see a great
existing place to slot it in.
Now, it seems like it would best fit as a sibling to the other index
internals that we cover - which are all listed on the main table of
contents. i.e., In the current docs my best choice we be to add a new
Chapter 64 for this. I would have wished for all of those index chapters
to be in grouped together under "64. Index Internals" and this would just
go in there. The Database Physical Storage would be Chapter 65.
David J.
On Wed, Dec 22, 2021 at 3:13 PM Bruce Momjian <bruce@momjian.us> wrote:
I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.
Along those lines adding "51.7 Special Considerations for Updates" would be
minimally invasive. Having this under "Overview of PostgreSQL Internals"
feels ok.
David J.
On Wed, Dec 22, 2021 at 2:13 PM Bruce Momjian <bruce@momjian.us> wrote:
I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.
I *strongly* agree that that's the single most important piece of
information, by far. Though we could and should have several
paragraphs about it.
We never get around to explaining HOT at all, which is just bizarre,
considering its importance. Many very popular user-level talks have
mostly just talked about this subject (e.g., a talk from Grant
McAlister from a couple of years back comes to mind).
--
Peter Geoghegan