11.7. Indexes on Expressions

Started by PG Bug reporting formover 4 years ago12 messagesdocs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:

Regarding the paragraph (_emphasis_ added):

Index expressions are relatively expensive to maintain, because the

derived expression(s) must be computed for _each row upon insertion and
whenever it is updated_.

Does the "it" in the above refer to the entire row or just the columns used
in the expression? Does updating any column of a row trigger an update to
all indexes with expressions?

Thanks in advance for your time and reply.

Cheers,
Chris Lowder

#2Bruce Momjian
bruce@momjian.us
In reply to: PG Bug reporting form (#1)
Re: 11.7. Indexes on Expressions

On Thu, Dec 16, 2021 at 04:57:37PM +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:

Regarding the paragraph (_emphasis_ added):

Index expressions are relatively expensive to maintain, because the

derived expression(s) must be computed for _each row upon insertion and
whenever it is updated_.

Does the "it" in the above refer to the entire row or just the columns used

So, here 'it' refers the "row" being updated. I am attaching a doc
patch which clarifies this.

in the expression? Does updating any column of a row trigger an update to
all indexes with expressions?

Well, that depends. If _no_ columns involved in any indexes are
changed, and the row is placed in the same page as the previous row (a
HOT update), then the indexes do not need to be updated. If any indexed
column changes, including those involved in expression indexes, or the
updated row isn't placed in the same page as the previous row, all new
index entries will need to be created. We don't really have the concept
of updating some indexes and not others --- we either update them all,
or update none of them --- this is because Postgres uses an MVCC system
of versioning.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

Attachments:

expression.difftext/x-diff; charset=us-asciiDownload+2-2
#3Chris Lowder
clowder@hey.com
In reply to: Bruce Momjian (#2)
Re: 11.7. Indexes on Expressions

Thank you for that (speedy) explanation and patch Bruce! That all makes
perfect sense. 

Cheers,
- Chris

Show quoted text

On December 16, 2021, Bruce Momjian <bruce@momjian.us> wrote:

On Thu, Dec 16, 2021 at 04:57:37PM +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:
 
Page: https://www.postgresql.org/docs/14/indexes-expressional.html
Description:
 
Regarding the paragraph (_emphasis_ added):
 

Index expressions are relatively expensive to maintain, because

the

derived expression(s) must be computed for _each row upon insertion

and

whenever it is updated_.
 
Does the "it" in the above refer to the entire row or just the

columns used

So, here 'it' refers the "row" being updated. I am attaching a doc
patch which clarifies this.

in the expression? Does updating any column of a row trigger an

update to

all indexes with expressions? 

Well, that depends. If _no_ columns involved in any indexes are
changed, and the row is placed in the same page as the previous row (a
HOT update), then the indexes do not need to be updated. If any
indexed
column changes, including those involved in expression indexes, or the
updated row isn't placed in the same page as the previous row, all new
index entries will need to be created. We don't really have the
concept
of updating some indexes and not others --- we either update them all,
or update none of them --- this is because Postgres uses an MVCC
system
of versioning.

-- 
 Bruce Momjian <bruce@momjian.us> https://momjian.us
 EDB https://enterprisedb.com

 If only the physical world exists, free will is an illusion.

#4Bruce Momjian
bruce@momjian.us
In reply to: Chris Lowder (#3)
Re: 11.7. Indexes on Expressions

On Fri, Dec 17, 2021 at 09:41:32AM +0000, Chris Lowder wrote:

Thank you for that (speedy) explanation and patch Bruce! That all makes perfect
sense. 

There are actually two questions being addressed here:

1. In what cases are new index entries added, and for which indexes?
2. In what cases are index expressions evaluated?

For 1, I explained that for updates we either add new rows for all
indexes, or none of them. We create new index rows for non-HOT updates,
which happens if the updated row does not change indexed columns (or
columns involved in index expressions), and if the new row fits on the
same 8k heap page.

What I recently learned, at least for PG 10-14, is that the index
expression is only called for non-HOT updates, not HOT ones. Postgres
does not evaluate the index expression to determine if the indexed
expression has changed, but rather relies on column value comparisons.

I wrote the attached script which creates an expression index function
with a pg_sleep(1) call to determine if the function is called. If you
run the attached SQL script in psql, you will see that times of ~1
second happens only when there is a change in the colummn referenced by
the index expression or when the new row is stored on a new heap page
(ctid page number changes). The expression index function is not called
if the indexed column value does not change and remains on the same heap
page.

I am attaching an updated patch which mentions non-HOT updates for
expression indexes.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

Attachments:

expr.sqlapplication/x-sqlDownload
expression.difftext/x-diff; charset=us-asciiDownload+2-2
#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#4)
Re: 11.7. Indexes on Expressions

On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:

I am attaching an updated patch which mentions non-HOT updates for
expression indexes.

Thanks. I'm a bit concerned that before this patch the only place we
mentioned HOTin the docs is in the appendix acronym listing which points to
the readme file in the source code.

David J.

#6Bruce Momjian
bruce@momjian.us
In reply to: David G. Johnston (#5)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 02:40:12PM -0700, David G. Johnston wrote:

On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:

I am attaching an updated patch which mentions non-HOT updates for
expression indexes.

Thanks.  I'm a bit concerned that before this patch the only place we mentioned
HOTin the docs is in the appendix acronym listing which points to the readme
file in the source code.

Uh, I had a similar concern and found these cases:

monitoring.sgml: Number of rows updated (includes HOT updated rows)
monitoring.sgml: Number of rows HOT updated (i.e., with no separate index

Not sure what else we can do --- it would be odd to explain it in the
expression index docs.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#7David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#6)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Dec 22, 2021 at 02:40:12PM -0700, David G. Johnston wrote:

On Fri, Dec 17, 2021 at 2:43 PM Bruce Momjian <bruce@momjian.us> wrote:

I am attaching an updated patch which mentions non-HOT updates for
expression indexes.

Thanks. I'm a bit concerned that before this patch the only place we

mentioned

HOTin the docs is in the appendix acronym listing which points to the

readme

file in the source code.

Uh, I had a similar concern and found these cases:

monitoring.sgml: Number of rows updated (includes HOT
updated rows)
monitoring.sgml: Number of rows HOT updated (i.e., with no
separate index

Not sure what else we can do --- it would be odd to explain it in the
expression index docs.

Found another one:

https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."

Make the acronym HOT in all these places take the reader to the acronym
page, and add a brief description there so they don't need to consume the
readme file?

David J.

#8Bruce Momjian
bruce@momjian.us
In reply to: David G. Johnston (#7)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:

On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.

Found another one:

https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."

Make the acronym HOT in all these places take the reader to the acronym page,
and add a brief description there so they don't need to consume the readme
file?

Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#9Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#8)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 05:09:53PM -0500, Bruce Momjian wrote:

On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:

On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.

Found another one:

https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."

Make the acronym HOT in all these places take the reader to the acronym page,
and add a brief description there so they don't need to consume the readme
file?

Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.

I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.

#10David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#8)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 3:09 PM Bruce Momjian <bruce@momjian.us> wrote:

On Wed, Dec 22, 2021 at 03:06:12PM -0700, David G. Johnston wrote:

On Wed, Dec 22, 2021 at 2:45 PM Bruce Momjian <bruce@momjian.us> wrote:
Not sure what else we can do --- it would be odd to explain it in the
expression index docs.

Found another one:

https://www.postgresql.org/docs/current/btree-implementation.html
"...(where most individual updates cannot apply the HOT optimization."

Make the acronym HOT in all these places take the reader to the acronym

page,

and add a brief description there so they don't need to consume the

readme

file?

Wow, I don't think the acronym page is the right place for this --- I
think we need a new section for this somewhere.

I don't disagree, but it also doesn't seem so wrong that it is
unacceptable; and it is minimally invasive. I don't really see a great
existing place to slot it in.

Now, it seems like it would best fit as a sibling to the other index
internals that we cover - which are all listed on the main table of
contents. i.e., In the current docs my best choice we be to add a new
Chapter 64 for this. I would have wished for all of those index chapters
to be in grouped together under "64. Index Internals" and this would just
go in there. The Database Physical Storage would be Chapter 65.

David J.

#11David G. Johnston
david.g.johnston@gmail.com
In reply to: Bruce Momjian (#9)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 3:13 PM Bruce Momjian <bruce@momjian.us> wrote:

I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.

Along those lines adding "51.7 Special Considerations for Updates" would be
minimally invasive. Having this under "Overview of PostgreSQL Internals"
feels ok.

David J.

In reply to: Bruce Momjian (#9)
Re: 11.7. Indexes on Expressions

On Wed, Dec 22, 2021 at 2:13 PM Bruce Momjian <bruce@momjian.us> wrote:

I think our _big_ missing information is a definition of HOT updates and
that updating a column involved in an index disables HOT updates.

I *strongly* agree that that's the single most important piece of
information, by far. Though we could and should have several
paragraphs about it.

We never get around to explaining HOT at all, which is just bizarre,
considering its importance. Many very popular user-level talks have
mostly just talked about this subject (e.g., a talk from Grant
McAlister from a couple of years back comes to mind).

--
Peter Geoghegan