refresh materialized view concurrently
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.
I didn't need to touch very much outside of matview-specific files
for this. My biggest concern is that I needed two small functions
which did *exactly* what some static functions in ri_triggers.c
were doing and couldn't see where the best place to share them from
was. For the moment I just duplicated them, but my hope would be
that they could be put in a suitable location and called from both
places, rather than duplicating the 30-some lines of code. The
function signatures are:
void quoteOneName(char *buffer, const char *name)
void quoteRelationName(char *buffer, Relation rel)
Comments in the patch describe the technique used for the
transactional refresh, but I'm not sure how easy it is to
understand the technique from the comments. Here is a
demonstration of the basic technique, using a table to mock the
materialized view so it can be run directly.
-------------------------------------------------------------------
--
-- Setup
--
drop table if exists n, nt, nd cascade;
drop table if exists nm;
create table n (id int not null primary key, val text);
insert into n values
(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four'), (5, 'five'),
(6, null), (7, null), (8, null), (9, null);
-- We use a table to mock this materialized view definition:
-- create materialized view nm as select * from n;
create table nm as select * from n;
insert into n values (10, 'ten'), (11, null);
update n set val = 'zwei' where id = 2;
update n set val = null where id = 3;
update n set id = 44, val = 'forty-four' where id = 4;
update n set val = 'seven' where id = 7;
delete from n where id = 5;
delete from n where id = 8;
vacuum analyze;
--
-- Sample of internal processing for REFRESH MV CONCURRENTLY.
--
begin;
create temp table nt as select * from n;
analyze nt;
create temp table nd as
SELECT x.ctid as tid, y
FROM nm x
FULL JOIN n y ON (y.id OPERATOR(pg_catalog.=) x.id)
WHERE (y.*) IS DISTINCT FROM (x.*)
ORDER BY tid;
analyze nd;
delete from nm where ctid in
(select tid from nd
where tid is not null and y is not distinct from null);
update nm x set id = (d.y).id, val = (d.y).val from nd d
where d.tid is not null and x.ctid = d.tid;
insert into nm select (y).* from nd where tid is null;
commit;
--
-- Check that results match.
--
select * from n order by id;
select * from nm order by id;
-------------------------------------------------------------------
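For readers who prefer to see the merge step outside SQL, here is a toy Python model of the same diff-and-merge logic (my own illustration, not code from the patch): each side is a dict keyed by the unique-index column, standing in for the FULL JOIN on id in the demo above.

```python
def diff_merge(matview, snapshot):
    """Toy model of the CONCURRENTLY merge step (illustration only,
    not the patch's C code). Both arguments are dicts keyed by the
    unique-index column, mirroring the FULL JOIN ON id in the demo."""
    # Rows present only in the matview correspond to diff rows whose
    # new side is NULL: delete them.
    deletes = [k for k in matview if k not in snapshot]
    # Rows present on both sides but changed: update in place.
    updates = {k: v for k, v in snapshot.items()
               if k in matview and matview[k] != v}
    # Rows present only in the new snapshot: insert them.
    inserts = {k: v for k, v in snapshot.items() if k not in matview}
    for k in deletes:
        del matview[k]
    matview.update(updates)
    matview.update(inserts)
    return matview

mv = {1: 'one', 2: 'two', 3: 'three'}
snap = {1: 'one', 2: 'zwei', 44: 'forty-four'}
diff_merge(mv, snap)   # deletes 3, updates 2, inserts 44
assert mv == snap
```

The three statements in the transaction above (DELETE, UPDATE, INSERT) correspond one-to-one to the three comprehensions here.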
I also tried a million-row materialized view with the patch to see
what the performance was like on a large table with just a few
changes. I was surprised that a small change-set like this was
actually faster than replacing the heap, at least on my machine.
Obviously, when a larger number of rows are affected the
transactional CONCURRENTLY option will be slower, and this is not
intended in any way as a performance-enhancing feature; that was
just a happy surprise in testing.
-------------------------------------------------------------------
-- drop from previous test
drop table if exists testv cascade;
-- create and populate permanent table
create table testv (id int primary key, val text);
insert into testv
select n, cash_words((floor(random() * 100000000) / 100)::text::money)
from (select generate_series(1, 2000000, 2)) s(n);
update testv
set val = NULL
where id = 547345;
create materialized view matv as select * from testv;
create unique index matv_id on matv (id);
vacuum analyze matv;
delete from testv where id = 16405;
insert into testv
values (393466, cash_words((floor(random() * 100000000) / 100)::text::money));
update testv
set val = cash_words((floor(random() * 100000000) / 100)::text::money)
where id = 1947141;
refresh materialized view concurrently matv;
-------------------------------------------------------------------
People may be surprised to see this using SPI even more than
ri_triggers.c does. I think this is the safest and most
maintainable approach, although I welcome alternative suggestions.
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment: refresh-concurrently-v1.patch (text/x-diff, +569 -90)
On 14 June 2013 17:05, Kevin Grittner <kgrittn@ymail.com> wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
Is there a reason to keep the non-concurrent behaviour? Anybody that
wants to explicitly lock should just run a LOCK statement. Can't we
treat behaviour when fully locked as an optimisation, so we can just
do the right thing without extra thought and keywords?
It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.
Incremental maintenance will be very straightforward using the logical
changeset extraction code Andres is working on. Having two parallel
mechanisms for changeset extraction in one release seems like a waste
of time. Especially when one is known to be better than the other
already.
Given that we also want to do concurrent CLUSTER and ALTER TABLE ...
SET TABLESPACE using changeset extraction I think it's time that
discussion happened on hackers.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Simon Riggs <simon@2ndQuadrant.com> wrote:
On 14 June 2013 17:05, Kevin Grittner <kgrittn@ymail.com> wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
for 9.4 CF1. The goal of this patch is to allow a refresh
without interfering with concurrent reads, using transactional
semantics.
Is there a reason to keep the non-concurrent behaviour?
Yes. For initial population, truncation, and replacement of
contents when more than a small percentage of rows are affected.
Anybody that wants to explicitly lock should just run a LOCK
statement. Can't we treat behaviour when fully locked as an
optimisation, so we can just do the right thing without extra
thought and keywords?
Are you suggesting that we use heap replacement or DML depending on
what heavyweight locks are held when the statement is executed?
It is my hope to get this committed during this CF to allow me
to focus on incremental maintenance for the rest of the release
cycle.
Incremental maintenance will be very straightforward using the
logical changeset extraction code Andres is working on.
At most, changeset extraction will help with obtaining the initial
delta for the base relations, which is less than 5% of what needs
doing for incremental maintenance of materialized views. If it
looks like a good fit, of course I'll use it.
Having two parallel mechanisms for changeset extraction in one
release seems like a waste of time.
I haven't looked in depth at what technique to use for capturing
the base relation deltas. The changeset extraction technique is
something to consider, for sure. I have a lot of work left to see
whether it works for this. In particular, to handle all requested
timings, it would need to have low enough latency to provide a
delta during the completion of each DML statement, to support
requests for "eager" maintenance of a materialized view -- where
the transaction which just changed the base relation would see the
effect if they queried the matview. That may not be something
to try to tackle in this release, but there are people who want it,
and I would prefer to pick a technique which didn't have a latency
high enough to make that impractical. That's not to say that I
know that to be a problem for using the changeset extraction
technique for this -- just that I haven't gotten to the point of
evaluating that.
Especially when one is known to be better than the other already.
What is the hypothetical technique you're arguing is inferior? For
my own part, I haven't gotten beyond the phase of knowing that to
meet all requests for the feature, it would need to be available at
about the same point that AFTER EACH STATEMENT triggers fire, but
that it should not involve any user-written triggers. Have you
implemented something similar to what you think I might be
considering? Do you have benchmark results? Can you share
details?
Given that we also want to do concurrent CLUSTER and ALTER TABLE
... SET TABLESPACE using changeset extraction I think it's time
that discussion happened on hackers.
No objections to that here; but please don't hijack this thread for
that discussion.
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 17 June 2013 00:43, Kevin Grittner <kgrittn@ymail.com> wrote:
Especially when one is known to be better than the other already.
What is the hypothetical technique you're arguing is inferior? For
my own part, I haven't gotten beyond the phase of knowing that to
meet all requests for the feature, it would need to be available at
about the same point that AFTER EACH STATEMENT triggers fire, but
that it should not involve any user-written triggers. Have you
implemented something similar to what you think I might be
considering? Do you have benchmark results? Can you share
details?
Recording the changeset required by replication is known to be more
efficient using WAL based extraction than using triggers. WAL writes
are effectively free and using WAL concentrates the reads to avoid
random I/O in large databases. That would be the most suitable
approach for continuously updated matviews, or frequent updates.
Extraction using multiple snapshots is also possible, using a
technique similar to the "concurrently" mechanism. That would require
re-scanning the whole table which might be overkill depending upon the
number of changes. That would work for reasonably infrequent updates.
Given that we also want to do concurrent CLUSTER and ALTER TABLE
... SET TABLESPACE using changeset extraction I think it's time
that discussion happened on hackers.
No objections to that here; but please don't hijack this thread for
that discussion.
There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what that
mechanism might be in each case, so we don't develop 2 or even 3
different ones while everybody ignores each other. As you said, we
should be helping each other and working together, and I agree.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 14.06.2013 19:05, Kevin Grittner wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.
I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.
I didn't need to touch very much outside of matview-specific files
for this. My biggest concern is that I needed two small functions
which did *exactly* what some static functions in ri_triggers.c
were doing and couldn't see where the best place to share them from
was. For the moment I just duplicated them, but my hope would be
that they could be put in a suitable location and called from both
places, rather than duplicating the 30-some lines of code. The
function signatures are:
void quoteOneName(char *buffer, const char *name)
void quoteRelationName(char *buffer, Relation rel)
I'd just use quote_identifier and quote_qualified_identifier instead.
I didn't understand this error message:
+ if (!foundUniqueIndex)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("concurrent refresh requires a unique index on just
columns for all rows of the materialized view")));
+
What does that mean?
- Heikki
Simon Riggs <simon@2ndQuadrant.com> wrote:
There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what
that mechanism might be in each case
Changeset extraction has nothing to do with this patch, and cannot
possibly be useful for it. Please keep discussion which is
completely unrelated to this patch off this thread.
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 17 June 2013 12:13, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 14.06.2013 19:05, Kevin Grittner wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.
I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.
There are generally 4 kinds of mat view
1. Transactionally updated
2. Incremental update, eventually consistent
3. Incremental update, regular refresh
4. Full refresh
At the moment we only have type 4 and it holds a full lock while it
runs. We definitely need a CONCURRENTLY option and this is it.
Implementing the other types won't invalidate what we currently have,
so this makes sense to me.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 17 June 2013 13:15, Kevin Grittner <kgrittn@ymail.com> wrote:
Simon Riggs <simon@2ndQuadrant.com> wrote:
There are multiple features all requiring efficient change set
extraction. It seems extremely relevant to begin discussing what
that mechanism might be in each case
Changeset extraction has nothing to do with this patch, and cannot
possibly be useful for it. Please keep discussion which is
completely unrelated to this patch off this thread.
Kevin,
You mentioned "incremental maintenance" in your original post and I
have been discussing it. Had you not mentioned it, I doubt I would
have thought of it.
But since you did mention it, and it's clearly an important issue, it
seems relevant to have discussed it here and now.
I'm happy to wait for you to start the thread discussing it directly though.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
On 14.06.2013 19:05, Kevin Grittner wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY
for 9.4 CF1. The goal of this patch is to allow a refresh
without interfering with concurrent reads, using transactional
semantics.
It is my hope to get this committed during this CF to allow me
to focus on incremental maintenance for the rest of the release
cycle.
I must say this seems a bit pointless on its own.
I completely disagree. When I read what people were posting about
the materialized view creation that went into 9.3, there were many
comments by people that they can't use it until the materialized
views can be refreshed without blocking readers. There is a clear
need for this. It doesn't do much to advance incremental
maintenance, but it is a much smaller patch which will make
matviews usable by a lot of people who can't use the initial
feature set.
I didn't understand this error message:
+ if (!foundUniqueIndex)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("concurrent refresh requires a unique index on just
columns for all rows of the materialized view")));
+
What does that mean?
It means that the REFRESH MATERIALIZED VIEW CONCURRENTLY command
cannot be used on a materialized view unless it has at least one
UNIQUE index which is not partial (i.e., there is no WHERE clause)
and is not indexing on an expression (i.e., the index is entirely
on bare column names). Set logic to do the "diff" is hard to get
right if the tables are not proper sets (i.e., they contain
duplicate rows). I can see at least three ways it *could* be done,
but all of them are much more complex and significantly slower.
With a UNIQUE index on some set of columns in all rows the correct
guarantees exist to use fast set logic. It isn't that it's needed
for access; it is needed to provide a guarantee that there is no
row without NULLs that exactly duplicates another row.
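The duplicate-row problem can be made concrete with a small multiset illustration (my own example, not from the patch): with, say, three identical copies in the old contents and four in the new, a set-based diff sees nothing to do, while only multiset arithmetic reveals the change.

```python
from collections import Counter

# Hypothetical duplicate row appearing three times in the "old"
# matview contents and four times in the freshly generated data.
old = Counter({('a', 1): 3})
new = Counter({('a', 1): 4})

# A set-based diff (what the FULL JOIN technique computes) sees no
# difference between the two relations at all:
assert set(old) == set(new)

# Yet one more copy must be inserted; only multiset arithmetic,
# which the fast set logic cannot use, finds it:
assert new - old == Counter({('a', 1): 1})
```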
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Simon Riggs <simon@2ndQuadrant.com> wrote:
Kevin Grittner <kgrittn@ymail.com> wrote:
Changeset extraction has nothing to do with this patch, and
cannot possibly be useful for it. Please keep discussion which
is completely unrelated to this patch off this thread.
You mentioned "incremental maintenance" in your original post and
I have been discussing it. Had you not mentioned it, I doubt I
would have thought of it.
But since you did mention it, and it's clearly an important issue,
it seems relevant to have discussed it here and now.
What I said was that I wanted to get this out of the way before I
started working on incremental maintenance.
I'm happy to wait for you to start the thread discussing it
directly though.
Cool.
-Kevin
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:
+ errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));
Maybe my English is failing me here, but I don’t understand the “just” part.
Nicolas
--
A. Because it breaks the logical sequence of discussion.
Q. Why is top posting bad?
Nicolas Barbier <nicolas.barbier@gmail.com> wrote:
2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:
+ errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));Maybe my english is failing me here, but I don’t understand the “just” part.
It means that the index must not use any expressions in the list of
what it's indexing on -- only column names. Suggestions for better
wording would be welcome.
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Jun 17, 2013 at 11:21 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Nicolas Barbier <nicolas.barbier@gmail.com> wrote:
2013/6/17 Heikki Linnakangas <hlinnakangas@vmware.com>:
+ errmsg("concurrent refresh requires a
unique index on just columns for all rows of the materialized view")));Maybe my english is failing me here, but I don’t understand the “just” part.
It means that the index must not use any expressions in the list of
what it's indexing on -- only column names. Suggestions for better
wording would be welcome.
Random idea:
ERROR: materialized view \"%s\" does not have a unique key
Perhaps augmented with:
HINT: Create a UNIQUE btree index with no WHERE clause on one or more
columns of the materialized view.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 06/17/2013 04:13 AM, Heikki Linnakangas wrote:
On 14.06.2013 19:05, Kevin Grittner wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
It is my hope to get this committed during this CF to allow me to
focus on incremental maintenance for the rest of the release cycle.
I must say this seems a bit pointless on its own. But if it's a stepping
stone to incremental maintenance, I have no objections.
Actually, CONCURRENTLY was cited as the #1 deficiency for the matview
feature for 9.3. With it, the use-case for Postgres matviews is
broadened considerably. So it's very valuable on its own, even if (for
example) INCREMENTAL didn't get done for 9.4.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
I spent a few hours to review the patch.
As far as I can tell, the overall approach is as follows.
- create a new temp heap as non-concurrent does, but with ExclusiveLock on
the matview, so that readers wouldn't be blocked
- with this temp table and the matview, query FULL JOIN and extract
difference between original matview and temp heap (via SPI)
- this operation requires unique index for performance reason (or
correctness reason too)
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI, to do the merge
- close these temp heaps
If I don't miss something, the requirement for the CONCURRENTLY option is
to allow simple SELECT reader to read the matview concurrently while the
view is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible just
to acquire ExclusiveLock on the matview while populating the data and swap
the relfile by taking a small AccessExclusiveLock. This lock escalation is
no deadlock hazard, I suppose, because a concurrent operation would block
the other at the point ExclusiveLock is acquired, and ExclusiveLock
conflicts with AccessExclusiveLock. Then you don't need the complicated SPI
logic or unique key index dependency.
Assuming I'm asking something wrong and going for the current approach,
here's what I've found in the patch.
- I'm not sure if unique key index requirement is reasonable or not,
because users only add CONCURRENTLY option and not merging or incremental
update.
- This could overflow the diffname buffer.
+ quoteRelationName(tempname, tempRel);
+ strcpy(diffname, tempname);
+ strcpy(diffname + strlen(diffname) - 1, "_2\"");
- As others pointed out, quoteOneName can be replaced with quote_identifier
- This looks harmless, but I wonder if you need to change relkind.
*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! OldHeap->rd_rel->relkind,
! OldHeap->rd_rel->relpersistence,
false,
RelationIsMapped(OldHeap),
true,
--- 680,687 ----
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! RELKIND_RELATION,
! relpersistence,
false,
RelationIsMapped(OldHeap),
true,
Since OldHeap->rd_rel->relkind has been working with 'm', too, not only 'r'?
- I found two additional parameters on make_new_heap ugly, but couldn't
come up with better solution. Maybe we can pass Relation of old heap to
the function instead of Oid..
That's about it from me.
Thanks,
--
Hitoshi Harada
On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
I spent a few hours to review the patch.
Oh, BTW, though it is not part of this patch, but I came across this.
! /*
! * We're not using materialized views in the system catalogs.
! */
Assert(!IsSystemRelation(matviewRel));
Of course we don't have builtin matview on system catalog, but it is
possible to create such one by allow_system_table_mods=true, so Assert
doesn't look like good to me.
Thanks,
--
Hitoshi Harada
On 2013-06-21 02:43:23 -0700, Hitoshi Harada wrote:
On Fri, Jun 21, 2013 at 2:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
On Fri, Jun 14, 2013 at 9:05 AM, Kevin Grittner <kgrittn@ymail.com> wrote:
Attached is a patch for REFRESH MATERIALIZED VIEW CONCURRENTLY for
9.4 CF1. The goal of this patch is to allow a refresh without
interfering with concurrent reads, using transactional semantics.
I spent a few hours to review the patch.
Oh, BTW, though it is not part of this patch, but I came across this.
! /*
! * We're not using materialized views in the system catalogs.
! */
Assert(!IsSystemRelation(matviewRel));
Of course we don't have builtin matview on system catalog, but it is
possible to create such one by allow_system_table_mods=true, so Assert
doesn't look like good to me.
Anybody using allow_system_table_mods gets to keep the pieces. There are
so many ways to break just about everything using it that I don't
think worrying much makes sense.
If you want you could replace that by an elog(), but...
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
If I don't miss something, the requirement for the CONCURRENTLY option is to
allow simple SELECT reader to read the matview concurrently while the view
is populating the new data, and INSERT/UPDATE/DELETE and SELECT FOR
UPDATE/SHARE are still blocked. So, I wonder why it is not possible just to
acquire ExclusiveLock on the matview while populating the data and swap the
relfile by taking small AccessExclusiveLock. This lock escalation is no
dead lock hazard, I suppose, because concurrent operation would block the
other at the point ExclusiveLock is acquired, and ExclusiveLock conflicts
AccessExclusiveLock. Then you don't need the complicated SPI logic or
unique key index dependency.
This is no good. One, all lock upgrades are deadlock hazards. In
this case, that plays out as follows: suppose that the session running
REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
else. Some other process takes an AccessShareLock on the materialized
view and then tries to take a conflicting lock on the other object.
Kaboom, deadlock. Granted, the chances of that happening in practice
are small, but it IS the reason why we typically try to avoid having
long-running operations perform lock upgrades. Users get really
annoyed when their DDL runs for an hour and then rolls back.
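The scenario above is the classic wait-for cycle that a deadlock detector looks for; a toy sketch (my own illustration with hypothetical session names, not PostgreSQL code):

```python
# Session A runs the REFRESH: it holds a lock on some other object and
# now waits to upgrade its lock on the matview. Session B holds
# AccessShareLock on the matview and waits for the other object.
holds = {'A': {'other'}, 'B': {'matview'}}
waits = {'A': 'matview', 'B': 'other'}

def deadlocked(start):
    """Walk the wait-for graph from `start`; revisiting a session
    means a cycle, i.e. a deadlock."""
    seen, s = set(), start
    while s not in seen:
        seen.add(s)
        wanted = waits.get(s)
        owner = next((p for p, h in holds.items() if wanted in h), None)
        if owner is None:          # nobody holds what s wants
            return False
        s = owner
    return True

assert deadlocked('A')             # A -> B -> A: kaboom, deadlock
```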
Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple. Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures - especially
if some of the object's catalog scans are seen and others are missed.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hitoshi Harada <umi.tanuki@gmail.com> wrote:
I spent a few hours to review the patch.
Thanks!
As far as I can tell, the overall approach is as follows.
- create a new temp heap as non-concurrent does, but with
ExclusiveLock on the matview, so that reader wouldn't be blocked
Non-concurrent creates the heap in the matview's tablespace and
namespace, so the "temp" part is different in concurrent
generation. This difference is why concurrent can be faster when
few rows change.
Also, before the next step there is an ANALYZE of the temp table,
so the planner can make good choices in the next step.
- with this temp table and the matview, query FULL JOIN and
extract difference between original matview and temp heap (via SPI)
Right; into another temp table.
- this operation requires unique index for performance reason (or
correctness reason too)
It is primarily for correctness in the face of duplicate rows which
have no nulls. Do you think the reasons need to be better
documented with comments?
- run ANALYZE on this diff table
- run UPDATE, INSERT and DELETE via SPI, to do the merge
- close these temp heaps
Right, then drop them.
Assuming I'm asking something wrong and going for the current
approach, here's what I've found in the patch.
- I'm not sure if unique key index requirement is reasonable or
not, because users only add CONCURRENTLY option and not merging
or incremental update.
The patch would need to be about an order of magnitude more complex
without that requirement due to the problems handling duplicate
rows. This patch uses set logic, and set logic falls down badly in
the face of duplicate rows. Consider how you would try to make
this technique work if the "old" data has 3 versions of a row and
the "new" data in the temp table has 4 versions of that same row.
You can play around with that by modifying the examples of the
logic using regular tables I included with the first version of the
patch. A UNIQUE index is the only way to prevent duplicate rows.
The fact that there can be duplicates with NULL in one or more
of the columns which are part of a unique index is not a fatal
problem; it just means that those cases must be handled through
DELETE and re-INSERT of the rows containing NULL in any column
which is part of a unique index key.
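The reason those rows need DELETE plus re-INSERT is that SQL's "=" never matches a NULL key, so the join cannot pair them as an UPDATE; a tiny model of SQL's three-valued comparison (my illustration, not patch code):

```python
def sql_eq(a, b):
    """SQL-style equality: comparing anything to NULL yields NULL
    (modeled here as Python None), never TRUE."""
    if a is None or b is None:
        return None
    return a == b

# Two otherwise-identical rows whose unique-index column is NULL:
assert sql_eq(None, None) is None      # the join finds no match,
assert sql_eq(None, 'x') is None       # so no UPDATE pairing exists;
assert sql_eq('x', 'x') is True        # only non-NULL keys pair up.
# Hence the refresh deletes the old NULL-keyed row and re-inserts
# the new one instead of treating it as a matched UPDATE.
```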
- As others pointed out, quoteOneName can be replaced with
quote_identifier
OK, I did it that way. The quote_identifier and
quote_qualified_identifier functions seem rather clumsy for the
case where you already know you have an identifier (because you are
dealing with an open Relation). I think we should have different
functions for that case, but that should probably not be material
for this patch, so changed to use the existing tools.
- This looks harmless, but I wonder if you need to change relkind.
*** 665,672 **** make_new_heap(Oid OIDOldHeap, Oid NewTableSpace)
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! OldHeap->rd_rel->relkind,
! OldHeap->rd_rel->relpersistence,
false,
RelationIsMapped(OldHeap),
true,
--- 680,687 ----
OldHeap->rd_rel->relowner,
OldHeapDesc,
NIL,
! RELKIND_RELATION,
! relpersistence,
false,
RelationIsMapped(OldHeap),
true,
Since OldHeap->rd_rel->relkind has been working with 'm', too,
not only 'r'?
No, the relation created by this is not going to be around when
we're done; we're either going to move its heap onto the existing
matview or drop the temp table. Either way, it's OK for it to be a
table until we do. I don't see the benefit of complicating the
code to do otherwise.
- I found the two additional parameters on make_new_heap ugly, but
couldn't come up with a better solution. Maybe we could pass the
Relation of the old heap to the function instead of its Oid.
This was the cleanest way I could see. Opening the relation
elsewhere and passing it in would not only be uglier IMO, it would
do nothing to tell the function that it should create a temporary
table.
I also modified the confusing error message to something close to
the suggestion from Robert.
What to do about the Assert that the matview is not a system
relation seems like material for a separate patch. After review,
I'm inclined to remove the test altogether, so that extensions can
create matviews in pg_catalog.
New version attached.
--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
refresh-concurrently-v2.patch (text/x-diff, +551 -90)
On Tue, Jun 25, 2013 at 9:07 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Jun 21, 2013 at 5:20 AM, Hitoshi Harada <umi.tanuki@gmail.com> wrote:
If I don't miss something, the requirement for the CONCURRENTLY
option is to allow simple SELECT readers to read the matview
concurrently while the view is being populated with the new data,
while INSERT/UPDATE/DELETE and SELECT FOR UPDATE/SHARE are still
blocked. So, I wonder why it is not possible just to acquire
ExclusiveLock on the matview while populating the data and swap the
relfile by taking a brief AccessExclusiveLock. This lock escalation
is no deadlock hazard, I suppose, because one concurrent operation
would block the other at the point the ExclusiveLock is acquired, and
ExclusiveLock conflicts with AccessExclusiveLock. Then you don't need
the complicated SPI logic or the unique key index dependency.

This is no good. One, all lock upgrades are deadlock hazards. In
this case, that plays out as follows: suppose that the session running
REFRESH MATERIALIZED VIEW CONCURRENTLY also holds a lock on something
else. Some other process takes an AccessShareLock on the materialized
view and then tries to take a conflicting lock on the other object.
Kaboom, deadlock. Granted, the chances of that happening in practice
are small, but it IS the reason why we typically try to avoid having
long-running operations perform lock upgrades. Users get really
annoyed when their DDL runs for an hour and then rolls back.
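The upgrade hazard described here is a cycle in a wait-for graph. The
following is a toy illustration of mine, not PostgreSQL's actual
deadlock detector:

```python
def has_deadlock(waits_for, start):
    """Follow wait-for edges from `start`; revisiting a node means a
    cycle, i.e. a deadlock."""
    seen, node = set(), start
    while node in waits_for:
        if node in seen:
            return True
        seen.add(node)
        node = waits_for[node]
    return False

# Session A (the refresh) holds ExclusiveLock on the matview plus a lock
# on some other object; session B holds AccessShareLock on the matview
# and now wants the other object; A then tries to upgrade the matview
# lock to AccessExclusiveLock.
waits_for = {'A': 'B',   # A's upgrade is blocked by B's AccessShareLock
             'B': 'A'}   # B waits for the other object's lock held by A
assert has_deadlock(waits_for, 'A')

# Without the upgrade attempt, A never waits on B: no cycle.
assert not has_deadlock({'B': 'A'}, 'B')
```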
OK, that's not safe. What I was thinking was something similar to
compare-and-swap, where the whole operation is atomic under an
AccessExclusiveLock. What if we release the ExclusiveLock once the new
matview has been created and re-acquire AccessExclusiveLock before
trying to swap? Note a matview is a little different from an index,
which I know people are talking about in the REINDEX CONCURRENTLY
thread, in that the content of a matview does not change incrementally
(at least at this point); it only changes fully, in the swapping
operation done by the same REFRESH MATERIALIZED VIEW command. The only
race condition is between releasing the ExclusiveLock and re-acquiring
the AccessExclusiveLock: someone else can go ahead with the same
operation and could create another one. If that happens, let's abort,
because I guess that's the way our transaction system works anyway; in
the case of a unique key index insertion, for example, if I find
another guy is inserting the same value into the index, I wait for the
other guy to finish his work, and if his transaction commits I give
up, otherwise I go ahead. Maybe it's annoying if an hour-long
operation finally gets aborted, but my purpose is actually achieved by
the other guy. If the primary goal of this feature is to let readers
read the matview concurrently, it should be OK?
Hmm, but in such cases the first guy always wins and the second guy,
who may come an hour later, loses, so we cannot get the result from
the latest command... I still wonder whether there should be some way.
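The compare-and-swap idea above can be sketched as follows. This is a
hypothetical Python model of mine, with a version counter standing in
for the catalog state and a plain lock standing in for
AccessExclusiveLock; it is not the actual patch:

```python
import threading

class Matview:
    """Toy stand-in for a matview's catalog entry."""
    def __init__(self):
        self.version = 0                 # bumped on every relfile swap
        self.lock = threading.Lock()     # "AccessExclusiveLock"

def begin_refresh(mv):
    """Remember the state we saw, then release the lock to build."""
    with mv.lock:
        return mv.version
    # ... the long build of the new heap happens here, lock released ...

def try_swap(mv, seen_version):
    """Re-acquire the lock; swap only if nobody refreshed meanwhile."""
    with mv.lock:
        if mv.version != seen_version:
            return False                 # someone else won; abort ours
        mv.version += 1                  # swap in the new heap
        return True

mv = Matview()
t1 = begin_refresh(mv)
t2 = begin_refresh(mv)       # a second session starts concurrently
assert try_swap(mv, t1)      # first session swaps successfully
assert not try_swap(mv, t2)  # second session sees the conflict, aborts
```

The last assertion is exactly the "first guy always wins" behavior
discussed above: the later session's hour of work is thrown away even
though its result would have been fresher.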
Two, until we get MVCC catalog scans, it's not safe to update any
system catalog tuple without an AccessExclusiveLock on some locktag
that will prevent concurrent catalog scans for that tuple. Under
SnapshotNow semantics, concurrent readers can fail to see that the
object is present at all, leading to mysterious failures, especially
if some of the object's catalog entries are seen and others are missed.

So what I'm saying above is: take AccessExclusiveLock while swapping
the relfile in the catalog. This doesn't violate your statement, I
suppose. I'm actually still skeptical about MVCC catalog scans,
because even if you can make the catalog lookup MVCC, the relfile on
the filesystem is not MVCC. If session 1 changes the relfilenode in
pg_class, commits the transaction, and deletes the old relfile from
the filesystem, another concurrent session 2 that took a snapshot just
before session 1 made that change keeps running and tries to open the
relation, grabbing the old relfilenode and opening it from the
filesystem -- ERROR: relfile not found. So everyone actually needs to
see up-to-date information that synchronizes with what the filesystem
says, and that's SnapshotNow. In my experimental thought above about
the compare-and-swap way, in the compare phase one needs to see the
most recent valid information, otherwise he never notices that someone
did something new. Since I haven't read the whole thread, maybe this
has already been discussed, but it would help if you could clarify
this concern.
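The relfilenode race being described can be mocked with two
dictionaries, one for the catalog and one for the filesystem. This is
my toy illustration of the stated concern, not a claim about how
PostgreSQL actually behaves:

```python
catalog = {'nm': 'relfile_100'}              # pg_class-like mapping
filesystem = {'relfile_100': b'old heap'}    # files on disk

# Session 2 takes an MVCC-style snapshot of the catalog.
snapshot = dict(catalog)

# Session 1 swaps the relfilenode, commits, and unlinks the old file.
catalog['nm'] = 'relfile_200'
filesystem['relfile_200'] = b'new heap'
del filesystem['relfile_100']

# Session 2 resolves 'nm' through its stale snapshot and tries to open
# the file: the catalog lookup is internally consistent, but the
# filesystem is not MVCC, so the file it points at is gone.
relfile = snapshot['nm']
assert relfile == 'relfile_100'
assert relfile not in filesystem             # "ERROR: relfile not found"
```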
Thanks,
--
Hitoshi Harada