Mat Views and Conflicts

Started by Zahir Lalaniabout 2 years ago8 messagesgeneral
Jump to latest
#1Zahir Lalani
ZahirLalani@oliver.agency

Hi All

My understanding and hope was that Mat Views cache their data and that is how they are so fast. But we are experience "cancelling statement due to conflict with recovery" errors on MV's with large data sets and I thought that this could only happen if we ran the underlying query, not the Mat View?

Have I got it wrong??

Z

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Zahir Lalani (#1)
Re: Mat Views and Conflicts

On Mon, Feb 19, 2024, 10:53 Zahir Lalani <ZahirLalani@oliver.agency> wrote:

Hi All

My understanding and hope was that Mat Views cache their data and that is
how they are so fast. But we are experience “cancelling statement due to
conflict with recovery” errors on MV’s with large data sets and I thought
that this could only happen if we ran the underlying query, not the Mat
View?

Have I got it wrong??

A materialized view is, physically, just like any other table. It is a
cache but only in the sense that tables cache real life data.

David J.

#3Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Zahir Lalani (#1)
Re: Mat Views and Conflicts

On 2/19/24 09:52, Zahir Lalani wrote:

Hi All

My understanding and hope was that Mat Views cache their data and that
is how they are so fast. But we are experience “cancelling statement due
to conflict with recovery” errors on MV’s with large data sets and I
thought that this could only happen if we ran the underlying query, not
the Mat View?

1) When you say '... ran the underlying query ...' are you referring to:

REFRESH MATERIALIZED VIEW
https://www.postgresql.org/docs/current/sql-refreshmaterializedview.html

2) What is the complete error message from the Postgres log?

Have I got it wrong??

Z

--
Adrian Klaver
adrian.klaver@aklaver.com

#4Zahir Lalani
ZahirLalani@oliver.agency
In reply to: Adrian Klaver (#3)
RE: Mat Views and Conflicts

-----Original Message-----
From: Adrian Klaver <adrian.klaver@aklaver.com>
Sent: Monday, February 19, 2024 7:36 PM
To: Zahir Lalani <ZahirLalani@oliver.agency>; pgsql-
general@lists.postgresql.org
Subject: Re: Mat Views and Conflicts

On 2/19/24 09:52, Zahir Lalani wrote:

Hi All

My understanding and hope was that Mat Views cache their data and that
is how they are so fast. But we are experience “cancelling statement
due to conflict with recovery” errors on MV’s with large data sets and
I thought that this could only happen if we ran the underlying query,
not the Mat View?

1) When you say '... ran the underlying query ...' are you referring to:

REFRESH MATERIALIZED VIEW
https://www.postgresql.org/docs/current/sql-refreshmaterializedview.html

2) What is the complete error message from the Postgres log?

Have I got it wrong??

Z

--
Adrian Klaver
adrian.klaver@aklaver.com

To clarify:

We are not running the REFRESH. We are only running the materialised view. We used to run the non-view query directly and always suffered from the conflict with recovery error due to data changes on the replica server during the query run. We increased the max_standby_streaming_delay quite a lot, but these are very heavy queries which take a long time. So that is when we decided to convert to an MV. We get the conflict a lot less, but still enough to be annoying. This suggests that not all the MV data is cached and it still queries the source tables in some way?

Z

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Zahir Lalani (#4)
Re: Mat Views and Conflicts

On Tuesday, February 20, 2024, Zahir Lalani <ZahirLalani@oliver.agency>
wrote:

This suggests that not all the MV data is cached and it still queries the
source tables in some way?

No. That isn’t how MV work. If you include an MV relation in your query
from clause there is no reference or knowledge as to the underlying query
that built the MV physical relation.

David J.

#6Zahir Lalani
ZahirLalani@oliver.agency
In reply to: David G. Johnston (#5)
RE: Mat Views and Conflicts

-From: David G. Johnston <david.g.johnston@gmail.com>
-Sent: Tuesday, February 20, 2024 1:20 PM
-To: Zahir Lalani <ZahirLalani@oliver.agency>
-Cc: Adrian Klaver <adrian.klaver@aklaver.com>; pgsql-general@lists.postgresql.org
-Subject: Re: Mat Views and Conflicts
-
-
-On Tuesday, February 20, 2024, Zahir Lalani <mailto:ZahirLalani@oliver.agency> wrote:
-
-This suggests that not all the MV data is cached and it still queries the source tables in some way?
-
-No.  That isn’t how MV work.  If you include an MV relation in your query from clause there is no reference or knowledge as to the underlying query that built the MV physical relation.
-
-David J.

Thx David

Cool so that is what my understanding was. But that comes back to my main question - the MV is cached data from the last time refresh was run. If I query the MV and the underlying realtime data changes (i.e. from the source tables not the MV) - should I be getting a replication conflict issue? I would have thought not as the MV data should not have been impacted

Z

#7Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Zahir Lalani (#4)
Re: Mat Views and Conflicts

On 2/20/24 04:20, Zahir Lalani wrote:

1) When you say '... ran the underlying query ...' are you referring to:

REFRESH MATERIALIZED VIEW
https://www.postgresql.org/docs/current/sql-refreshmaterializedview.html

2) What is the complete error message from the Postgres log?

Have I got it wrong??

Z

--
Adrian Klaver
adrian.klaver@aklaver.com

To clarify:

We are not running the REFRESH. We are only running the materialised view. We used to run the non-view query directly and always suffered from the conflict with recovery error due to data changes on the replica server during the query run. We increased the max_standby_streaming_delay quite a lot, but these are very heavy queries which take a long time. So that is when we decided to convert to an MV. We get the conflict a lot less, but still enough to be annoying. This suggests that not all the MV data is cached and it still queries the source tables in some way?

2) What is the complete error message from the Postgres log?

Z

--
Adrian Klaver
adrian.klaver@aklaver.com

#8Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Zahir Lalani (#6)
Re: Mat Views and Conflicts

On 2/20/24 05:25, Zahir Lalani wrote:

-From: David G. Johnston <david.g.johnston@gmail.com>
-Sent: Tuesday, February 20, 2024 1:20 PM
-To: Zahir Lalani <ZahirLalani@oliver.agency>
-Cc: Adrian Klaver <adrian.klaver@aklaver.com>; pgsql-general@lists.postgresql.org
-Subject: Re: Mat Views and Conflicts
-
-
-On Tuesday, February 20, 2024, Zahir Lalani <mailto:ZahirLalani@oliver.agency> wrote:
-
-This suggests that not all the MV data is cached and it still queries the source tables in some way?
-
-No.  That isn’t how MV work.  If you include an MV relation in your query from clause there is no reference or knowledge as to the underlying query that built the MV physical relation.
-
-David J.

Thx David

Cool so that is what my understanding was. But that comes back to my main question - the MV is cached data from the last time refresh was run. If I query the MV and the underlying realtime data changes (i.e. from the source tables not the MV) - should I be getting a replication conflict issue? I would have thought not as the MV data should not have been impacted

Then either something is running REFRESH MATERIALIZED VIEW or some other
process is accessing the underlying tables.

Z

--
Adrian Klaver
adrian.klaver@aklaver.com