Remaining beta blockers

Started by Tom Laneover 12 years ago73 messages

tgl@sss.pgh.pa.us

over 12 years ago

The schedule says we're going to wrap 9.3beta1 on Monday, but it doesn't
feel to me like we are anywhere near ready to ship a credible beta.
Of the items on the 9.3 open-items page,
https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items
there are at least three that seem like absolute drop-dead stop-ship issues:

1. The matviews mess. Changing that will force initdb, more than
likely, so we need it resolved before beta1.

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

3. The ProcessUtility restructuring problem. Surely we're not going to
ship a beta with persistent buildfarm failures, which even show up
sometimes on non-CLOBBER_CACHE_ALWAYS animals, eg today at
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2013-04-27%2009%3A27%3A00

Can we get these resolved by Monday, or must we postpone beta?

As far as #1 goes, I think we have little choice at this point but to
remove the unlogged-matviews feature for 9.3. Various alternatives were
kicked around in the "matview scannability rehash" thread but they were
only marginally less klugy, and nobody's stepped up with a patch anyway.
I will undertake to remove unlogged matviews and replace isscannable-
as-a-file-size-property with isscannable-as-a-reloption (unless anyone
feels it would be better as a separate pg_class column?).

I haven't been paying too close attention to the checksum threads
so I'm not sure where we are on #2.

As for #3, there's a draft patch, who's going to take responsibility
for that?

Anything else that's "must fix"?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Noah Misch

noah@leadboat.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

On Sat, Apr 27, 2013 at 10:59:32AM -0400, Tom Lane wrote:

The schedule says we're going to wrap 9.3beta1 on Monday, but it doesn't
feel to me like we are anywhere near ready to ship a credible beta.
Of the items on the 9.3 open-items page,
https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items
there are at least three that seem like absolute drop-dead stop-ship issues:

1. The matviews mess. Changing that will force initdb, more than
likely, so we need it resolved before beta1.

In similar discussions last year, we concluded that forcing initdb after beta
is fine so long as pg_upgrade can handle the change. Any of the proposals for
changing materialized view scannability are easy for pg_upgrade to handle.

As far as #1 goes, I think we have little choice at this point but to
remove the unlogged-matviews feature for 9.3. Various alternatives were
kicked around in the "matview scannability rehash" thread but they were
only marginally less klugy, and nobody's stepped up with a patch anyway.
I will undertake to remove unlogged matviews and replace isscannable-
as-a-file-size-property with isscannable-as-a-reloption (unless anyone
feels it would be better as a separate pg_class column?).

This perspective is all wrong. I hate to be blunt, but that thread ended with
your technical objections to the committed implementation breaking apart and
sinking. There was no consensus to change it on policy/UI grounds, either.

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

Since pg_upgrade isn't in a position to migrate beta clusters to a new
checksum algorithm, I share the desire to settle this sooner rather than
later. However, if finalizing it before beta singularly entails slipping beta
by more than a week or two, I think we should cut the beta without doing so.
Then mention in its release notes that "initdb --data-checksums" beta clusters
may require dump/reload to upgrade to a release or later beta.

Anything else that's "must fix"?

Not to my knowledge.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Noah Misch (#2)

Re: Remaining beta blockers

Noah Misch <noah@leadboat.com> writes:

On Sat, Apr 27, 2013 at 10:59:32AM -0400, Tom Lane wrote:

As far as #1 goes, I think we have little choice at this point but to
remove the unlogged-matviews feature for 9.3.

This perspective is all wrong. I hate to be blunt, but that thread ended with
your technical objections to the committed implementation breaking apart and
sinking. There was no consensus to change it on policy/UI grounds, either.

[ shrug... ] You and Kevin essentially repeated your claims that the
current implementation is OK; nobody else weighed in. I see absolutely
no reason to change my opinion on this. Keeping relisscannable state in
the form of is-the-file-of-nonzero-size is something we WILL regret, and
Pollyanna-ish refusal to believe that is not an adequate reason for
painting ourselves into a corner, especially not for a second-order
feature like unlogged matviews.

The way forward to unlogged matviews that behave the way Kevin wants
is to improve crash recovery so that we can update catalog state after
completing recovery, which is something there are other reasons to want
anyway. But it's far too late to do that in this cycle. In the
meantime I remain convinced that we're better off dropping the feature
until we have an implementation that's not going to bite us in the rear
in the future.

I also note that there are acknowledged-even-by-you bugs in the current
implementation, which no patch has emerged for, so *something's* got to
be done. I'm done waiting for something to happen, and am going to go
fix it in the way I think best.

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

Since pg_upgrade isn't in a position to migrate beta clusters to a new
checksum algorithm, I share the desire to settle this sooner rather than
later. However, if finalizing it before beta singularly entails slipping beta
by more than a week or two, I think we should cut the beta without doing so.
Then mention in its release notes that "initdb --data-checksums" beta clusters
may require dump/reload to upgrade to a release or later beta.

Meh. If we want to reserve the right to do that, we'd better do
something about putting a checksum algorithm ID in pg_control after all.
Otherwise we'll be faced with not detecting the algorithm change, or
else changing pg_control format post-beta1, which would break beta
clusters for everybody not just those who'd experimented with checksums.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

On Sat, Apr 27, 2013 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The schedule says we're going to wrap 9.3beta1 on Monday, but it doesn't
feel to me like we are anywhere near ready to ship a credible beta.
Of the items on the 9.3 open-items page,
https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items
there are at least three that seem like absolute drop-dead stop-ship issues:

I completely agree. I think it's considerably premature to wrap a
beta at this point. We haven't resolved the issue of what to do about
accidental restores into pg_catalog either; nobody replied to my last
email on that thread.

1. The matviews mess. Changing that will force initdb, more than
likely, so we need it resolved before beta1.

I would like to rip out the whole notion of whether a materialized
view is scannable and am happy to do that on Monday if you're willing
to sit still for it. I think that's better than failing to support
unlogged relations, and I'm confident that the decision to put the
scannability flag in pg_class rather than the backing file is dead
wrong. At the same time, I *also* agree that using the file size as a
flag is untenable.

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

I think it's pretty darn clear that we should change the algorithm,
and I think we've got a patch to do that. So we should be able to
resolve this relatively quickly. But +1 for adding a checksum
algorithm ID to pg_control anyway.

3. The ProcessUtility restructuring problem. Surely we're not going to
ship a beta with persistent buildfarm failures, which even show up
sometimes on non-CLOBBER_CACHE_ALWAYS animals, eg today at
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nightjar&dt=2013-04-27%2009%3A27%3A00

As for #3, there's a draft patch, who's going to take responsibility
for that?

I have been assuming that Alvaro was responsible for fixing this since
he (AIUI) was the one who committed what broke it. If that's not the
case, I should probably jump in, since I committed some earlier event
trigger patches. Or Dimitri, as the original author of said patches.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#4)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Apr 27, 2013 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Of the items on the 9.3 open-items page,
https://wiki.postgresql.org/wiki/PostgreSQL_9.3_Open_Items
there are at least three that seem like absolute drop-dead stop-ship issues:

I completely agree. I think it's considerably premature to wrap a
beta at this point. We haven't resolved the issue of what to do about
accidental restores into pg_catalog either; nobody replied to my last
email on that thread.

As far as that item goes, I agree it's must-fix, but I'm not sure it's
must-fix-before-beta.

1. The matviews mess. Changing that will force initdb, more than
likely, so we need it resolved before beta1.

I would like to rip out the whole notion of whether a materialized
view is scannable and am happy to do that on Monday if you're willing
to sit still for it.

That would actually be my druthers too; while I see that we're going to
want such a concept eventually, I'm not convinced that the current
feature is a reasonable (and upward-compatible) subset of what we'll
want later. However, when I proposed doing that earlier, Kevin
complained pretty strenuously. I'm willing to yield on the point,
as long as the implementation doesn't make use of storage-file size
to represent scannability.

I think that's better than failing to support
unlogged relations, and I'm confident that the decision to put the
scannability flag in pg_class rather than the backing file is dead
wrong. At the same time, I *also* agree that using the file size as a
flag is untenable.

Um, wait, it's *not* in pg_class now, and what I was about to do was
go put it there. Is there a typo in the above para, or are you saying
you don't like either approach? If the latter, what concept have you
got for an eventual implementation?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#5)

Re: Remaining beta blockers

On Sat, Apr 27, 2013 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

1. The matviews mess. Changing that will force initdb, more than
likely, so we need it resolved before beta1.

I would like to rip out the whole notion of whether a materialized
view is scannable and am happy to do that on Monday if you're willing
to sit still for it.

That would actually be my druthers too; while I see that we're going to
want such a concept eventually, I'm not convinced that the current
feature is a reasonable (and upward-compatible) subset of what we'll
want later. However, when I proposed doing that earlier, Kevin
complained pretty strenuously. I'm willing to yield on the point,
as long as the implementation doesn't make use of storage-file size
to represent scannability.

I think that's better than failing to support
unlogged relations, and I'm confident that the decision to put the
scannability flag in pg_class rather than the backing file is dead
wrong. At the same time, I *also* agree that using the file size as a
flag is untenable.

Um, wait, it's *not* in pg_class now, and what I was about to do was
go put it there. Is there a typo in the above para, or are you saying
you don't like either approach? If the latter, what concept have you
got for an eventual implementation?

If we're going to have it at all, I'd like to make it a flag in the
page header on page 0, or maybe have a dedicated metapage that stores
that detail, and perhaps other things.

But really, I'd rather not have it at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#6)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Apr 27, 2013 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Um, wait, it's *not* in pg_class now, and what I was about to do was
go put it there. Is there a typo in the above para, or are you saying
you don't like either approach? If the latter, what concept have you
got for an eventual implementation?

If we're going to have it at all, I'd like to make it a flag in the
page header on page 0, or maybe have a dedicated metapage that stores
that detail, and perhaps other things.

I cannot say that I find that idea attractive; the biggest problem with
it being that updating such a state flag will be nontransactional,
unless we go to a lot of effort to support rollbacks. ISTM that the
scannability property is a perfectly normal relation property and as
such *ought* to be in the pg_class row, or at worst some other catalog
entry. Why do you think differently?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Simon Riggs

simon@2ndQuadrant.com

over 12 years ago

In reply to: Robert Haas (#4)

Re: Remaining beta blockers

On 27 April 2013 20:23, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 27, 2013 at 10:59 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

I think it's pretty darn clear that we should change the algorithm,
and I think we've got a patch to do that. So we should be able to
resolve this relatively quickly.

I'll be working on this today.

But +1 for adding a checksum
algorithm ID to pg_control anyway.

Yes, seems best.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Simon Riggs

simon@2ndQuadrant.com

over 12 years ago

In reply to: Tom Lane (#3)

Re: Remaining beta blockers

On 27 April 2013 19:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

On Sat, Apr 27, 2013 at 10:59:32AM -0400, Tom Lane wrote:

As far as #1 goes, I think we have little choice at this point but to
remove the unlogged-matviews feature for 9.3.

This perspective is all wrong. I hate to be blunt, but that thread ended with
your technical objections to the committed implementation breaking apart and
sinking. There was no consensus to change it on policy/UI grounds, either.

[ shrug... ] You and Kevin essentially repeated your claims that the
current implementation is OK; nobody else weighed in.

On other patches, one committer objecting to something is seen as
enough of a blocker to require change. That should work in every
direction.

In any case, we shouldn't all need to line up and vote on stuff, its
so timewasting. Of course, we need to sometimes, but only when the
case is put clearly enough that it can be done sensibly, otherwise we
just end up voting ad hominem.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#7)

Re: Remaining beta blockers

On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Apr 27, 2013 at 3:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Um, wait, it's *not* in pg_class now, and what I was about to do was
go put it there. Is there a typo in the above para, or are you saying
you don't like either approach? If the latter, what concept have you
got for an eventual implementation?

If we're going to have it at all, I'd like to make it a flag in the
page header on page 0, or maybe have a dedicated metapage that stores
that detail, and perhaps other things.

I cannot say that I find that idea attractive; the biggest problem with
it being that updating such a state flag will be nontransactional,
unless we go to a lot of effort to support rollbacks. ISTM that the
scannability property is a perfectly normal relation property and as
such *ought* to be in the pg_class row, or at worst some other catalog
entry. Why do you think differently?

Mostly because of the issue with unlogged tables, I suppose. If
you've got a reasonable idea how to do catalog updates on restart,
though, I could probably be convinced to yield to that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#10)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I cannot say that I find that idea attractive; the biggest problem with
it being that updating such a state flag will be nontransactional,
unless we go to a lot of effort to support rollbacks. ISTM that the
scannability property is a perfectly normal relation property and as
such *ought* to be in the pg_class row, or at worst some other catalog
entry. Why do you think differently?

Mostly because of the issue with unlogged tables, I suppose. If
you've got a reasonable idea how to do catalog updates on restart,
though, I could probably be convinced to yield to that.

Well, it's fairly clear *how* to do it: add some more processing that
occurs after we've completed crash replay. We already have some of
that, eg completion of partial splits in btrees, so it's not that much
of a stretch; it's just a lot of code that's not been written yet.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Simon Riggs (#9)

Re: Remaining beta blockers

Simon Riggs <simon@2ndQuadrant.com> writes:

On other patches, one committer objecting to something is seen as
enough of a blocker to require change. That should work in every
direction.

The bottom line here is that we have substantial disagreement on how
unlogged matviews should be implemented, and there's no longer enough
time for coming to a resolution that will satisfy everybody. I think
that means we have to pull the feature from 9.3. If it had not yet
been committed it would certainly not be getting in now over multiple
objections.

Given Robert's concerns, it may be that the same should be said for
scannability tracking. I think it's definitely the case that if we
don't have unlogged matviews then the need for system-level tracking
of scannability is greatly decreased. Kevin's already said that he
plans to work on a much more flexible notion of scannability for 9.4,
and I remain concerned that something we do in haste now might not
prove to be a good upward-compatible basis for that redesign.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Simon Riggs

simon@2ndQuadrant.com

over 12 years ago

In reply to: Tom Lane (#12)

Re: Remaining beta blockers

On 28 April 2013 16:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On other patches, one committer objecting to something is seen as
enough of a blocker to require change. That should work in every
direction.

The bottom line here is that we have substantial disagreement on how
unlogged matviews should be implemented, and there's no longer enough
time for coming to a resolution that will satisfy everybody. I think
that means we have to pull the feature from 9.3. If it had not yet
been committed it would certainly not be getting in now over multiple
objections.

I've not said much good about Mat Views, that is true, but that was
aimed at not running with it as a headline feature without
qualification. I don't take that as far as thinking the feature should
be pulled completely; there is some good worth having in most things.
Is this issue worth pulling the whole feature on?

Given Robert's concerns, it may be that the same should be said for
scannability tracking. I think it's definitely the case that if we
don't have unlogged matviews then the need for system-level tracking
of scannability is greatly decreased. Kevin's already said that he
plans to work on a much more flexible notion of scannability for 9.4,
and I remain concerned that something we do in haste now might not
prove to be a good upward-compatible basis for that redesign.

Given that unlogged tables are somewhat volatile, unlogged matviews
wouldn't be missed much AFAICS. We can add that thought as a later
optimisation.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Simon Riggs (#13)

Re: Remaining beta blockers

Simon Riggs <simon@2ndQuadrant.com> writes:

On 28 April 2013 16:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The bottom line here is that we have substantial disagreement on how
unlogged matviews should be implemented, and there's no longer enough
time for coming to a resolution that will satisfy everybody. I think
that means we have to pull the feature from 9.3. If it had not yet
been committed it would certainly not be getting in now over multiple
objections.

I've not said much good about Mat Views, that is true, but that was
aimed at not running with it as a headline feature without
qualification. I don't take that as far as thinking the feature should
be pulled completely; there is some good worth having in most things.
Is this issue worth pulling the whole feature on?

I think you misread that. I'm only proposing that we remove *unlogged*
matviews, and perhaps scannability tracking for matviews.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Simon Riggs

simon@2ndQuadrant.com

over 12 years ago

In reply to: Tom Lane (#14)

Re: Remaining beta blockers

On 28 April 2013 21:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

On 28 April 2013 16:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The bottom line here is that we have substantial disagreement on how
unlogged matviews should be implemented, and there's no longer enough
time for coming to a resolution that will satisfy everybody. I think
that means we have to pull the feature from 9.3. If it had not yet
been committed it would certainly not be getting in now over multiple
objections.

I've not said much good about Mat Views, that is true, but that was
aimed at not running with it as a headline feature without
qualification. I don't take that as far as thinking the feature should
be pulled completely; there is some good worth having in most things.
Is this issue worth pulling the whole feature on?

I think you misread that. I'm only proposing that we remove *unlogged*
matviews, and perhaps scannability tracking for matviews.

Happily so.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#11)

Re: Remaining beta blockers

On Sun, Apr 28, 2013 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Sat, Apr 27, 2013 at 3:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I cannot say that I find that idea attractive; the biggest problem with
it being that updating such a state flag will be nontransactional,
unless we go to a lot of effort to support rollbacks. ISTM that the
scannability property is a perfectly normal relation property and as
such *ought* to be in the pg_class row, or at worst some other catalog
entry. Why do you think differently?

Mostly because of the issue with unlogged tables, I suppose. If
you've got a reasonable idea how to do catalog updates on restart,
though, I could probably be convinced to yield to that.

Well, it's fairly clear *how* to do it: add some more processing that
occurs after we've completed crash replay. We already have some of
that, eg completion of partial splits in btrees, so it's not that much
of a stretch; it's just a lot of code that's not been written yet.

As far as I can see, that would require starting a separate backend
process for every database, and keeping track of which of those
completed their post-recovery work, and disallowing connections to any
given database until the post-recovery work for that database had been
completed. That seems to add quite a few failure modes that we don't
have today, which is why I haven't been much interested in going that
route.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#16)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> writes:

On Sun, Apr 28, 2013 at 11:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, it's fairly clear *how* to do it: add some more processing that
occurs after we've completed crash replay. We already have some of
that, eg completion of partial splits in btrees, so it's not that much
of a stretch; it's just a lot of code that's not been written yet.

As far as I can see, that would require starting a separate backend
process for every database, and keeping track of which of those
completed their post-recovery work, and disallowing connections to any
given database until the post-recovery work for that database had been
completed. That seems to add quite a few failure modes that we don't
have today, which is why I haven't been much interested in going that
route.

Well, I didn't say it would be easy or quick ;-). But you're presuming
quite a number of design elements that don't seem essential to me.

What I'd be inclined to think about as prerequisite work is fixing
things so that a process could reattach to a new database, after
flushing all its caches. That's a feature that's been requested plenty,
and could have applications in autovacuum or other places besides this,
and would certainly get lots of testing outside of crash recovery.

Having done that, we could imagine that the startup process itself
cycles through all the databases and does the fixup work, rather than
complicating the postmaster logic as suggested above. Potentially this
could replace the filesystem-scan-driven fixup logic that exists in it
now, if it turns out to be faster to search for unlogged tables via a
catalog scan rather than directory scans.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Tom Lane (#3)

Re: Remaining beta blockers

Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ shrug... ] You and Kevin essentially repeated your claims that
the current implementation is OK; nobody else weighed in.

Many people weighed in on the need to differentiate between an
empty matview and one which has not been populated. Many have also
weighed in on the benefits of unlogged matviews.

I see absolutely no reason to change my opinion on this. Keeping
relisscannable state in the form of is-the-file-of-nonzero-size
is something we WILL regret

... or change in a subsequent major release. I see no reason why
using this technique now make it harder to do something else later.
Could you elaborate on the technical challenges you see to doing
so?

Pollyanna-ish refusal to believe that is not an adequate reason
for painting ourselves into a corner, especially not for a
second-order feature like unlogged matviews.

I would guess that about half the use-cases for materialized views
will stay with tables in spite of the added hassle, if they have to
degrade performance by adding logging where they currently have
none.

The way forward to unlogged matviews that behave the way Kevin
wants is to improve crash recovery so that we can update catalog
state after completing recovery, which is something there are
other reasons to want anyway.

That would certainly be for the best.

But it's far too late to do that in this cycle.

Yes it is.

In the meantime I remain convinced that we're better off dropping
the feature until we have an implementation that's not going to
bite us in the rear in the future.

I haven't caught up with you on how it will do that.

I also note that there are acknowledged-even-by-you bugs in the
current implementation, which no patch has emerged for, so
*something's* got to be done. I'm done waiting for something to
happen, and am going to go fix it in the way I think best.

Are you referring to the case where if you refresh a matview with
over 8GB and it winds up empty, that vacuum can make it look
invalid until the next REFRESH? This is one of many ways that
could be fixed.

diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 8a1ffcf..b950b16 100644
--- a/src/backend/commands/vacuumlazy.c
+++ b/src/backend/commands/vacuumlazy.c
@@ -230,7 +230,13 @@ lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
      *
      * Don't even think about it unless we have a shot at releasing a goodly
      * number of pages.  Otherwise, the time taken isn't worth it.
+     *
+     * Leave a populated materialized view with at least one page.
      */
+    if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
+        vacrelstats->nonempty_pages == 0)
+        vacrelstats->nonempty_pages = 1;
+
     possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
     if (possibly_freeable > 0 &&
         (possibly_freeable >= REL_TRUNCATE_MINIMUM ||

The hysteria and FUD about using the currently-supported technique
for unlogged tables to implement unlogged matviews are very
discouraging. And the notion that we would release a matview
feature which allowed false results (in the form of treating a
never-populated matview as a legal empty matview) is truly scary to
me. If we can't tell the difference between those two things, I
don't think we should be releasing the feature.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Kevin Grittner (#18)

Re: Remaining beta blockers

* Kevin Grittner (kgrittn@ymail.com) wrote:

Many people weighed in on the need to differentiate between an
empty matview and one which has not been populated. Many have also
weighed in on the benefits of unlogged matviews.

Sure, I think I did that- we should be able to distinguish between those
two cases and unlogged matviews are great.

I see absolutely no reason to change my opinion on this. Keeping
relisscannable state in the form of is-the-file-of-nonzero-size
is something we WILL regret

... or change in a subsequent major release. I see no reason why
using this technique now make it harder to do something else later.
Could you elaborate on the technical challenges you see to doing
so?

I've not followed this all that closely, to be honest, but I tend to
side with Tom on this, making PG depend on the file size for an
attribute of a relation is a bad idea. For one thing, what happens when
an admin figures out that they can 'hack' the system to do what they
want by truncating that file? Or we end up wanting to have that file be
non-zero and considered 'empty' later, but we don't want pg_upgrade
running around touching all of the existing files out there?

I would guess that about half the use-cases for materialized views
will stay with tables in spite of the added hassle, if they have to
degrade performance by adding logging where they currently have
none.

Life's tough. There's quite a few things that I wish had made it into
9.3 which didn't. One might also point out that a lot of individuals
may be, understandably, cautious about this new feature and not use it
in the first major rev it's released in anyway (I'm talking about
matviews entirely here..). Basically, I don't believe we should be
acting like we've promised unlogged-matviews will be in 9.3 just because
it's been committed.

The way forward to unlogged matviews that behave the way Kevin
wants is to improve crash recovery so that we can update catalog
state after completing recovery, which is something there are
other reasons to want anyway.

That would certainly be for the best.

Then let's forgo having this specific feature in 9.3 and implement it
correctly for 9.4.

The hysteria and FUD about using the currently-supported technique
for unlogged tables to implement unlogged matviews are very
discouraging.

Perhaps I'm all wet here, but it's not obvious to me that what's done
for unlogged tables really equates to what's being done here.

And the notion that we would release a matview
feature which allowed false results (in the form of treating a
never-populated matview as a legal empty matview) is truly scary to
me. If we can't tell the difference between those two things, I
don't think we should be releasing the feature.

I agree that it's important to distinguish the two. I'm not sure that
I've heard anyone explicitly say otherwise..

Thanks,

Stephen

#20

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Stephen Frost (#19)

Re: Remaining beta blockers

Stephen Frost <sfrost@snowman.net> wrote:

what happens when an admin figures out that they can 'hack' the
system to do what they want by truncating that file?

If they modified the heap files that way while the server was
running, the results would be somewhat unpredictable. If they did
it while the server was stopped, starting the server and attempting
to access the matview would generate:

ERROR: materialized view "matview_name" has not been populated
HINT: Use the REFRESH MATERIALIZED VIEW command.

Or we end up wanting to have that file be non-zero and considered
'empty' later, but we don't want pg_upgrade running around
touching all of the existing files out there?

I didn't follow this one; could you restate it, please?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21

Josh Berkus

josh@agliodbs.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

what happens when an admin figures out that they can 'hack' the
system to do what they want by truncating that file?

That can't possibly be a serious objection. DBAs who mess with DB files
are on their own, and should expect unpredictable behavior.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Reply to msg id not found: WMefbc3ca1b81f3f169d307c3f78f65b158559b2e5472f448c0b521b1f0142248c0eb1185124c155b3c50d773bc8851fb6@asav-1.01.com

#22

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Kevin Grittner (#18)

Re: Remaining beta blockers

On Mon, Apr 29, 2013 at 3:34 PM, Kevin Grittner <kgrittn@ymail.com> wrote:

The hysteria and FUD about using the currently-supported technique
for unlogged tables to implement unlogged matviews are very
discouraging. And the notion that we would release a matview
feature which allowed false results (in the form of treating a
never-populated matview as a legal empty matview) is truly scary to
me.

I think that labeling other people's concerns as hysteria and FUD is
not something which will advance the debate. I might be
misunderstanding the nature of Tom's concern, but I believe the
concern is not so much that whatever bugs exist now can't be fixed,
but that the use of this mechanism might leave us vulnerable to a
steady stream of future bugs or make it hard to do things which, in
the future, we may want to do.

To take one not-particularly-hypothetical example of that, I was
recently discussing with someone, probably Noah, the idea of adding a
command to pre-extend a relation to a certain number of blocks (and
prevent vacuum from truncating those extra blocks away again). If
you're counting on the number of blocks in the relation to mean
something other than the number of blocks in the relation, there's a
potential conflict there. Now maybe the fact that you've defined 0
to mean non-scannable and 1+ to mean scannable means there's no real
problem with that particular idea... but suppose you'd defined it the
other way around. I don't think it's purely unreasonable to think
that this could conflict with future developments in other areas.

The basic problem here is that the way unlogged tables work is kind of
a crock. I didn't fully realize that at the time I implemented it,
but it's boxed us in a little in terms of trying to extend that
feature; for example, we still don't have a way to convert logged
relations to unlogged relations, or visca versa, and we're not going
to grow a way without some kind of reworking of that mechanism. This
is another aspect of that problem. To solve these problems cleanly,
we need either a relation metapage, or some more state in pg_class.
Short of that, this isn't a scalable solution. Even if you think it's
workable to coax one more bit of state out of the file length, what
will you do when you need a second bit (or say another 32 bits)?

If we can't tell the difference between those two things, I
don't think we should be releasing the feature.

So, speaking of hysteria...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Kevin Grittner (#20)

Re: Remaining beta blockers

* Kevin Grittner (kgrittn@ymail.com) wrote:

If they modified the heap files that way while the server was
running, the results would be somewhat unpredictable. If they did
it while the server was stopped, starting the server and attempting
to access the matview would generate:

Right, the point being that they could (ab)use it as a flag to trigger
something to happen. I'd also be worried about failure cases where
files appear to be zero-length.

Or we end up wanting to have that file be non-zero and considered
'empty' later, but we don't want pg_upgrade running around
touching all of the existing files out there?

I didn't follow this one; could you restate it, please?

Down the road we decide that we shouldn't have any zero-length files
(perhaps due to checksums..?), yet we have to special case around these
mat views and figure out a way to deal with them during pg_upgrade.

Thanks,

Stephen

#24

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Stephen Frost (#23)

Re: Remaining beta blockers

On Mon, Apr 29, 2013 at 9:44 PM, Stephen Frost <sfrost@snowman.net> wrote:

* Kevin Grittner (kgrittn@ymail.com) wrote:

If they modified the heap files that way while the server was
running, the results would be somewhat unpredictable. If they did
it while the server was stopped, starting the server and attempting
to access the matview would generate:

Right, the point being that they could (ab)use it as a flag to trigger
something to happen. I'd also be worried about failure cases where
files appear to be zero-length.

If you assume that people are going to modify files while the backend
is running, nothing we do anywhere is safe.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#25

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Robert Haas (#24)

Re: Remaining beta blockers

* Robert Haas (robertmhaas@gmail.com) wrote:

If you assume that people are going to modify files while the backend
is running, nothing we do anywhere is safe.

I agree that it's a bad idea and that people who do such things deserve
what they get, but that doesn't mean it won't happen when people realize
they can do it.

And, really, it's all fun and games until someone writes a tool to do
it...

Thanks,

Stephen

#26

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Robert Haas (#22)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> wrote:

Kevin Grittner <kgrittn@ymail.com> wrote:

The hysteria and FUD about using the currently-supported technique
for unlogged tables to implement unlogged matviews are very
discouraging. And the notion that we would release a matview
feature which allowed false results (in the form of treating a
never-populated matview as a legal empty matview) is truly scary to
me.

I think that labeling other people's concerns as hysteria and FUD is
not something which will advance the debate.

The point of that language is that "charged" language which has
nothing to do with reality is already being thrown around; by all
means let's get back to constructive discussion. If there is some
particular problem someone sees, I don't think it has been
expressed yet, which makes it impossible to address, unless you
count the assertion that *if* we had chosen a zero-length heap to
represent a table with valid data we *would* have a problem?

Let's look at the "corner" this supposedly paints us into. If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump. Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway. I'm not seeing cause for panic here.

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

If we can't tell the difference between those two things, I
don't think we should be releasing the feature.

So, speaking of hysteria...

Yeah, I know you don't get it, but as a DBA I would never have
allowed a feature which could quietly generate false results to be
used -- which is exactly what you have without a way to
differentiate these cases. If it comes down to a choice between a
performance tweak like unlogged matviews and an issue of
correctness of results, I will pick correctness. Now, as I've
already said, this tweak is significant (I expect it will make
matviews useful in roughly twice as many cases), but it is just a
performance tweak.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Kevin Grittner (#26)

Re: Remaining beta blockers

Kevin,

* Kevin Grittner (kgrittn@ymail.com) wrote:

If there is some
particular problem someone sees, I don't think it has been
expressed yet, which makes it impossible to address, unless you
count the assertion that *if* we had chosen a zero-length heap to
represent a table with valid data we *would* have a problem?

The problem which people see is that this approach isn't a good idea and
will lead down a road to ugly hacks to make things work correctly. It's
not a question about code correctness- it's a question about the design.
Those are two very different things but both need to be considered.

I'm not seeing cause for panic here.

Speaking to overcharged words, I don't see any 'panic' here but rather a
strong disagreement between individuals over this design. Actually, I'm
not even sure there is disagreement about there being a better design
for this- it sounds like everyone agrees that there is. The question
boils down to "do we ship it with an inferior design, or hold off and do
it right the first time".

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

Various examples have been articulated and you've worked through
specific high-level designs for how to deal with those, which is great,
but design isn't about just those things which you can forsee and
predict, it's about ensuring flexibility is there for those things that
you don't predict.

If we can't tell the difference between those two things, I
don't think we should be releasing the feature.

So, speaking of hysteria...

Yeah, I know you don't get it, but as a DBA I would never have
allowed a feature which could quietly generate false results to be
used -- which is exactly what you have without a way to
differentiate these cases.

It also happens to be what you can end up having with unlogged tables
already.. We ran into exactly that issue and decided that we'd be
better off not using them (for now anyway) even for things which really
should be just transient data sets.

If it comes down to a choice between a
performance tweak like unlogged matviews and an issue of
correctness of results, I will pick correctness. Now, as I've
already said, this tweak is significant (I expect it will make
matviews useful in roughly twice as many cases), but it is just a
performance tweak.

And, I think anyway, I agree w/ you (Kevin) here- we should really have
a way to tell the difference between no-data-in-query-result and
never-populated. I'd like more options than just a big ERROR result
happening when it's never been populated, but that's a different
discussion.

Thanks,

Stephen

#28

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Kevin Grittner (#26)

Re: Remaining beta blockers

Could we please stop the ad-hominem stuff from all sides? We want to
solve the issue not to make it bigger.

On 2013-04-30 04:29:26 -0700, Kevin Grittner wrote:

Let's look at the "corner" this supposedly paints us into. If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump. Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway. I'm not seeing cause for panic here.

I don't think panic is appropriate either, but I think there are some
valid concerns around this.

1) vacuum can truncate the table to an empty relation already if there is
no data returned by the view's query which then leads to the wrong
scannability state.

S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < -1;
S2: S2: SELECT * FROM matview_empty; -- works
S1: VACUUM matview_empty;
S2: SELECT * FROM matview_empty; -- still works
S3: SELECT * FROM matview_empty; -- errors out

So we need to make vacuum skip cleaning out the last page. Once we
get incrementally updated matviews there are more situations to get
into this than just a query not returning anything.
I remember this being discussed somewhere already, but couldn't find
it quickly enough.

Imo this is quite an indicator that the idea of using the filesize is
just plain wrong. Adding logic to vacuum not to truncate data because
its a flag for matview scannability is quite the layer violation and
a sign for a bad design decision. Such a hack has already been added
to copy_heap_data(), while not as bad, shows that it is hard to find
all the places where we need to add it.

2) Since we don't have a metapage to represent scannability in 9.3 we
cannot easily use one in 9.4+ without pg_upgrade emptying all
matviews unless we only rely on the catalogs which we currently
cannot. This will either slow down development or make users
unhappy. Alternatively we can add yet another fork, but that has its
price (quite a bit more open files during normal operation, slower
CREATE DATABASE).

This is actually an argument for not releasing matviews without such
an external state. Going from disk-based state to catalog is easy,
the other way round: not so much.

3) Using the filesize as a flag will make other stuff like preallocating
on-disk data in bigger chunks and related things more complicated.

4) doing the check for scannability in the executor imo is a rather bad
idea. Note that we e.g. don't see an error about a matview which
won't be scanned because of constraint exclusion or one-time
filters.

S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false WITH NO DATA;
S1: SELECT * FROM matview_unit_false;

You can get there in far more reasonable cases than WHERE false.

5) I have to agree with Kevin that the scannability is an important thing
to track though.

a) we cannot remove support for WITH NO DATA because of pg_dump order
and unlogged relations. So even without unlogged relations the
problem exists although its easier to represent.
b) Just giving wrong responses is bad [tm]. Outdated data is something
completely different (it has existed in that state in the (near)
past) than giving an empty response (might never have been a visible
state, and more likely not so in any reasonably near
past). Consider an application behind a pooler suddently getting
an empty response from a SELECT * FROM unlogged_matview; It won't
notice anything without a unscannable state since it probably
won't even notice the database restart.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now* which sucks, but ...

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#29

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Kevin Grittner (#26)

Re: Remaining beta blockers

On Tue, Apr 30, 2013 at 7:29 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

Let's look at the "corner" this supposedly paints us into. If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump. Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway. I'm not seeing cause for panic here.

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

Yeah, I know you don't get it, but as a DBA I would never have
allowed a feature which could quietly generate false results to be
used -- which is exactly what you have without a way to
differentiate these cases. If it comes down to a choice between a
performance tweak like unlogged matviews and an issue of
correctness of results, I will pick correctness. Now, as I've
already said, this tweak is significant (I expect it will make
matviews useful in roughly twice as many cases), but it is just a
performance tweak.

Sure, I wouldn't allow that either. My point is that I feel that
could be engineered around in user space. If you have a materialized
view which could legitimately be empty (there are many for which that
won't be an issue), then you can either arrange the view definition so
that it isn't (e.g. by including a dummy record that clients can look
for), or you can include a sentinel unlogged table that will contain a
row if and only if materialized views have been refreshed since the
last crash. In the examples I can think of,
indefinitely-stale-but-valid-at-some-point wouldn't be very good
either, so I would anticipate needing to do some engineering around
relative freshness anyway - e.g. keeping a side table that lists the
last-refreshed-time for each matview. Or, maybe I'd wouldn't care
about tracking elapsed time per se, but instead want to track
freshness relative to updates - e.g. set things up so that anyone who
performs an action that would invalidate a row in the materialized
view would also update a row someplace that would allow me to identify
the row as stale. In either case, handling the case where the view is
altogether invalid doesn't seem to add a whole lot of additional
complexity.

Now, I can imagine cases where it does. For example, suppose you have
a cron job (which you trust to work) to refresh your materialized
views every night. Well, that means that you'll never be more than 24
hours stale - but if any of those materialized views are unlogged,
that also means that you could have no data at all for up to 24 hours
following a crash. Not great, because now you need some logic to
handle just that one case that wouldn't be necessary if the DB did it
for you. But I just think it's a judgement call how serious one
thinks that scenario is, vs. any other scenario where a boolean isn't
adequate either.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Andres Freund (#28)

Re: Remaining beta blockers

Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-30 04:29:26 -0700, Kevin Grittner wrote:

Let's look at the "corner" this supposedly paints us into. If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump. Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway. I'm not seeing cause for panic here.

I don't think panic is appropriate either, but I think there are some
valid concerns around this.

1) vacuum can truncate the table to an empty relation already if there is
   no data returned by the view's query which then leads to the wrong
   scannability state.

   S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < -1;
   S2: S2: SELECT * FROM matview_empty; -- works
   S1: VACUUM matview_empty;
   S2: SELECT * FROM matview_empty; -- still works
   S3: SELECT * FROM matview_empty; -- errors out

   So we need to make vacuum skip cleaning out the last page. Once we
   get incrementally updated matviews there are more situations to get
   into this than just a query not returning anything.
   I remember this being discussed somewhere already, but couldn't find
   it quickly enough.

Yeah, I posted a short patch earlier on this thread that fixes
that. Nobody has commented on it, and so far I have not committed
anything related to this without posting details and giving ample
opportunity for anyone to comment. If nobody objects, I can push
that, and this issue is gone.

   Imo this is quite an indicator that the idea of using the filesize is
   just plain wrong. Adding logic to vacuum not to truncate data because
   its a flag for matview scannability is quite the layer violation and
   a sign for a bad design decision. Such a hack has already been added
   to copy_heap_data(), while not as bad, shows that it is hard to find
   all the places where we need to add it.

I agree, but if you review the thread I started with a flag in
pg_class, I tried using the "special" area of the first page of the
heap, and finally wound up with the current technique based on
several people reviewing the patch and offering opinions and
reaching an apparent consensus that this was the least evil way to
do it given current infrastructure for unlogged tables. This
really is a result of trying to preserver *correct* behavior while
catering to those emphasizing how important the performance of
unlogged matviews is.

2) Since we don't have a metapage to represent scannability in 9.3 we
   cannot easily use one in 9.4+ without pg_upgrade emptying all
   matviews unless we only rely on the catalogs which we currently
   cannot.

I am not following this argument at all. Doesn't pg_upgrade use
pg_dump to create the tables and matviews WITH NO DATA and take
later action for ones which are populated in the source? It would
not be hard at all to move to a new release which used a different
technique for tracking populated tables by changing what pg_dump
does for populated tables with the switch pg_upgrade uses.

   This will either slow down development or make users
   unhappy. Alternatively we can add yet another fork, but that has its
   price (quite a bit more open files during normal operation, slower
   CREATE DATABASE).

   This is actually an argument for not releasing matviews without such
   an external state. Going from disk-based state to catalog is easy,
   the other way round: not so much.

I am not seeing this at all. Given my comment above about how this
works for pg_upgrade and pg_dump, can you explain how this is a
problem?

3) Using the filesize as a flag will make other stuff like preallocating
on-disk data in bigger chunks and related things more complicated.

Why? You don't need to preallocate for a non-populated matview,
since its heap will be replaced on REFRESH. You would not *want*
to preallocate a lot of space for something guaranteed to remain at
zero length until deleted.

4) doing the check for scannability in the executor imo is a rather bad
   idea. Note that we e.g. don't see an error about a matview which
   won't be scanned because of constraint exclusion or one-time
   filters.

   S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false
WITH NO DATA;
   S1: SELECT * FROM matview_unit_false;

You can get there in far more reasonable cases than WHERE false.

I am a little concerned about that, but the case you show doesn't demonstrate a problem:

test=# CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false WITH NO DATA;
SELECT 0
test=# SELECT * FROM matview_unit_false;
ERROR: materialized view "matview_unit_false" has not been populated
HINT: Use the REFRESH MATERIALIZED VIEW command.

The view was defined WITH NO DATA and is behaving correctly for
that case. When populated (with zero rows) it behaves correctly:

test=# REFRESH materialized view matview_unit_false ;
REFRESH MATERIALIZED VIEW
test=# SELECT * FROM matview_unit_false;
bool
------
(0 rows)

I would be interested to see whether anyone can come up with an
actual mis-behavior related to this either before or after the
recent refactoring.

5) I have to agree with Kevin that the scannability is an important thing
   to track though.

   a) we cannot remove support for WITH NO DATA because of pg_dump order
       and unlogged relations. So even without unlogged relations the
       problem exists although its easier to represent.
   b) Just giving wrong responses is bad [tm]. Outdated data is something
       completely different (it has existed in that state in the (near)
       past) than giving an empty response (might never have been a visible
       state, and more likely not so in any reasonably near
       past). Consider an application behind a pooler suddently getting
       an empty response from a SELECT * FROM unlogged_matview; It won't
       notice anything without a unscannable state since it probably
       won't even notice the database restart.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now* which sucks, but ...

That seems far riskier to me than using the current
lame-but-functional approach now and improving it in a subsequent
release.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#31

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Robert Haas (#29)

Re: Remaining beta blockers

Robert Haas <robertmhaas@gmail.com> wrote:

Kevin Grittner <kgrittn@ymail.com> wrote:

Let's look at the "corner" this supposedly paints us into. If a
later major release creates a better mechanism, current pg_dump and
load will already use it, based on the way matviews are created
empty and REFRESHed by pg_dump. Worst case, we need to modify the
behavior of pg_dump running with the switch used by pg_upgrade to
use a new ALTER MATERIALIZED VIEW SET (populated); (or whatever
syntax is chosen) -- a command we would probably want at that point
anyway. I'm not seeing cause for panic here.

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so. What do you see as a problem?

Sure, I wouldn't allow that either. My point is that I feel that
could be engineered around in user space. If you have a materialized
view which could legitimately be empty (there are many for which that
won't be an issue), then you can either arrange the view definition so
that it isn't (e.g. by including a dummy record that clients can look
for), or you can include a sentinel unlogged table that will contain a
row if and only if materialized views have been refreshed since the
last crash. In the examples I can think of,
indefinitely-stale-but-valid-at-some-point wouldn't be very good
either, so I would anticipate needing to do some engineering around
relative freshness anyway - e.g. keeping a side table that lists the
last-refreshed-time for each matview. Or, maybe I'd wouldn't care
about tracking elapsed time per se, but instead want to track
freshness relative to updates - e.g. set things up so that anyone who
performs an action that would invalidate a row in the materialized
view would also update a row someplace that would allow me to identify
the row as stale. In either case, handling the case where the view is
altogether invalid doesn't seem to add a whole lot of additional
complexity.

Now, I can imagine cases where it does. For example, suppose you have
a cron job (which you trust to work) to refresh your materialized
views every night. Well, that means that you'll never be more than 24
hours stale - but if any of those materialized views are unlogged,
that also means that you could have no data at all for up to 24 hours
following a crash. Not great, because now you need some logic to
handle just that one case that wouldn't be necessary if the DB did it
for you. But I just think it's a judgement call how serious one
thinks that scenario is, vs. any other scenario where a boolean isn't
adequate either.

"Staleness" is a completely different issue, in my view, from
quietly returning results that are not, and never were, accurate.
Sure we need to implement more refined "scannability" tests than
whether valid data from *some* point in time is present. But that
should always be *part* of the scannability testing, and without it
I don't feel we have a feature of a quality suitable for delivery.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#32

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Kevin Grittner (#30)

Re: Remaining beta blockers

On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:

Andres Freund <andres@2ndquadrant.com> wrote:

1) vacuum can truncate the table to an empty relation already if there is
   no data returned by the view's query which then leads to the wrong
   scannability state.

   S1: CREATE MATERIALIZED VIEW matview_empty AS SELECT false WHERE random() < -1;
   S2: S2: SELECT * FROM matview_empty; -- works
   S1: VACUUM matview_empty;
   S2: SELECT * FROM matview_empty; -- still works
   S3: SELECT * FROM matview_empty; -- errors out

   So we need to make vacuum skip cleaning out the last page. Once we
   get incrementally updated matviews there are more situations to get
   into this than just a query not returning anything.
   I remember this being discussed somewhere already, but couldn't find
   it quickly enough.

Yeah, I posted a short patch earlier on this thread that fixes
that. Nobody has commented on it, and so far I have not committed
anything related to this without posting details and giving ample
opportunity for anyone to comment. If nobody objects, I can push
that, and this issue is gone.

Well, this bug is gone, but the multiple layer violations aren't.

   Imo this is quite an indicator that the idea of using the filesize is
   just plain wrong. Adding logic to vacuum not to truncate data because
   its a flag for matview scannability is quite the layer violation and
   a sign for a bad design decision. Such a hack has already been added
   to copy_heap_data(), while not as bad, shows that it is hard to find
   all the places where we need to add it.

I agree, but if you review the thread I started with a flag in
pg_class, I tried using the "special" area of the first page of the
heap, and finally wound up with the current technique based on
several people reviewing the patch and offering opinions and
reaching an apparent consensus that this was the least evil way to
do it given current infrastructure for unlogged tables. This
really is a result of trying to preserver *correct* behavior while
catering to those emphasizing how important the performance of
unlogged matviews is.

Imo it now uses the worst of both worlds...

2) Since we don't have a metapage to represent scannability in 9.3 we
   cannot easily use one in 9.4+ without pg_upgrade emptying all
   matviews unless we only rely on the catalogs which we currently
   cannot.

I am not following this argument at all. Doesn't pg_upgrade use
pg_dump to create the tables and matviews WITH NO DATA and take
later action for ones which are populated in the source? It would
not be hard at all to move to a new release which used a different
technique for tracking populated tables by changing what pg_dump
does for populated tables with the switch pg_upgrade uses.

What I am thinking about is a 100GB materialized view which has been
filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
metapage yet and we want to add one we would need to rewrite the whole
100GB which seems like a rather bad idea. Or how are you proposing to
add the metapage? You cannot add it to the end or somesuch.

I am not seeing this at all. Given my comment above about how this
works for pg_upgrade and pg_dump, can you explain how this is a
problem?

Well, that only works if there is a cheap way to add the new notation to
existing matview heaps which I don't see.

3) Using the filesize as a flag will make other stuff like preallocating
on-disk data in bigger chunks and related things more complicated.

Why? You don't need to preallocate for a non-populated matview,
since its heap will be replaced on REFRESH. You would not *want*
to preallocate a lot of space for something guaranteed to remain at
zero length until deleted.

But we would likely also want to optimize reducing the filesize in the
same chunks because you otherwise would end up with even worse
fragmentation. And I am not talking about an unscannable relation but
about one which is currently empty (e.g. SELECT ... WHERE false).

4) doing the check for scannability in the executor imo is a rather bad
   idea. Note that we e.g. don't see an error about a matview which
   won't be scanned because of constraint exclusion or one-time
   filters.

   S1: CREATE MATERIALIZED VIEW matview_unit_false AS SELECT false WHERE false
WITH NO DATA;
   S1: SELECT * FROM matview_unit_false;

You can get there in far more reasonable cases than WHERE false.

I am a little concerned about that, but the case you show doesn't demonstrate a problem:

The view was defined WITH NO DATA and is behaving correctly for
that case. When populated (with zero rows) it behaves correctly:

Ah. Tom already fixed the problem in
5194024d72f33fb209e10f9ab0ada7cc67df45b7. I was working in a branch
based on 3ccae48f44d993351e1f881761bd6c556ebd6638 and it failed there.

I would be interested to see whether anyone can come up with an
actual mis-behavior related to this either before or after the
recent refactoring.

Seems ok after Tom's refactoring.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now* which sucks, but ...

That seems far riskier to me than using the current
lame-but-functional approach now and improving it in a subsequent
release.

Why? We have bitten by the lack of such metapages several times now and
I don't think we have bitten by their presence in relation types that
have them by now.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#33

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Andres Freund (#32)

Re: Remaining beta blockers

Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:

Andres Freund <andres@2ndquadrant.com> wrote:

2) Since we don't have a metapage to represent scannability in 9.3
    we cannot easily use one in 9.4+ without pg_upgrade emptying all
    matviews unless we only rely on the catalogs which we currently
    cannot.

I am not following this argument at all. Doesn't pg_upgrade use
pg_dump to create the tables and matviews WITH NO DATA and take
later action for ones which are populated in the source? It would
not be hard at all to move to a new release which used a different
technique for tracking populated tables by changing what pg_dump
does for populated tables with the switch pg_upgrade uses.

What I am thinking about is a 100GB materialized view which has been
filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
metapage yet and we want to add one we would need to rewrite the whole
100GB which seems like a rather bad idea. Or how are you proposing to
add the metapage? You cannot add it to the end or somesuch.

Oh, you are suggesting prepending a metapage to existing matviews
(and tables?)? So to check whether a view has been populated you
not only look at the directory but open the file and read a page?
Now I follow why you think this would be an issue. I'm not sure I
think that is the best solution, though. In what way would it be
better than adding info to pg_class or some other system table?
Why would this be important for unlogged matviews but not unlogged
tables?

I am not seeing this at all. Given my comment above about how this
works for pg_upgrade and pg_dump, can you explain how this is a
problem?

Well, that only works if there is a cheap way to add the new notation to
existing matview heaps which I don't see.

We could perhaps reserve some space in the special area of the
first page, if we can agree on a generous enough amount of space.

3) Using the filesize as a flag will make other stuff like
preallocating on-disk data in bigger chunks and related
things more complicated.

Why? You don't need to preallocate for a non-populated matview,
since its heap will be replaced on REFRESH. You would not *want*
to preallocate a lot of space for something guaranteed to remain at
zero length until deleted.

But we would likely also want to optimize reducing the filesize in the
same chunks because you otherwise would end up with even worse
fragmentation. And I am not talking about an unscannable relation but
about one which is currently empty (e.g. SELECT ... WHERE false).

Yes, if we get to both incremental updates *and* preallocations
before we develop a better technique for this, a special case would
be needed for the case where a matview was incrementally updated to
zero rows.

Not sure what the consequence of this is. The most reasonable solution
seems to be to introduce a metapage somewhere *now* which sucks, but

...

That seems far riskier to me than using the current
lame-but-functional approach now and improving it in a subsequent
release.

Why? We have bitten by the lack of such metapages several times now and
I don't think we have bitten by their presence in relation types that
have them by now.

Like I said, months ago I had a version which used the special area
for the first page of a matview heap, but was convinced to change
that. I could probably be convinced to change back. :-) I don't
know whether you see a problem with using the special like that for
the metadata you envision.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#34

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Kevin Grittner (#31)

Re: Remaining beta blockers

On Tue, Apr 30, 2013 at 10:40 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so. What do you see as a problem?

pg_upgrade only handles changes in catalog state, not on-disk
representation. If the on-disk representation of an non-scannable
view might change in a future release, it's a pg_upgrade hazard.

Sure, I wouldn't allow that either. My point is that I feel that
could be engineered around in user space. If you have a materialized
view which could legitimately be empty (there are many for which that
won't be an issue), then you can either arrange the view definition so
that it isn't (e.g. by including a dummy record that clients can look
for), or you can include a sentinel unlogged table that will contain a
row if and only if materialized views have been refreshed since the
last crash. In the examples I can think of,
indefinitely-stale-but-valid-at-some-point wouldn't be very good
either, so I would anticipate needing to do some engineering around
relative freshness anyway - e.g. keeping a side table that lists the
last-refreshed-time for each matview. Or, maybe I'd wouldn't care
about tracking elapsed time per se, but instead want to track
freshness relative to updates - e.g. set things up so that anyone who
performs an action that would invalidate a row in the materialized
view would also update a row someplace that would allow me to identify
the row as stale. In either case, handling the case where the view is
altogether invalid doesn't seem to add a whole lot of additional
complexity.

Now, I can imagine cases where it does. For example, suppose you have
a cron job (which you trust to work) to refresh your materialized
views every night. Well, that means that you'll never be more than 24
hours stale - but if any of those materialized views are unlogged,
that also means that you could have no data at all for up to 24 hours
following a crash. Not great, because now you need some logic to
handle just that one case that wouldn't be necessary if the DB did it
for you. But I just think it's a judgement call how serious one
thinks that scenario is, vs. any other scenario where a boolean isn't
adequate either.

"Staleness" is a completely different issue, in my view, from
quietly returning results that are not, and never were, accurate.
Sure we need to implement more refined "scannability" tests than
whether valid data from *some* point in time is present. But that
should always be *part* of the scannability testing, and without it
I don't feel we have a feature of a quality suitable for delivery.

I don't really see these as being all that separate. The user is
going to want to know whether the data is usable or not. The answer
to that question depends on the user's business rules, and those could
be anything. The current system implements the following business
rule: "The data can be used if the matview has ever been populated."
But there are lots of other possibilities:

- The data can be used if the matview is fully up-to-date.
- The data can be used if the matview is not out of date by more than
a certain amount of time.
- The data can be used if the matview is out of date with respect to
one of its base tables, but not if it is out of date with respect to
another of its base tables. For example, maybe you're OK with using
an order-summary view if new orders have arrived (but not if the view
is more than a day out of date); but not if any customers have been
renamed.

Not only can the business rules vary, but they can vary between
queries. Query A might need an exact answer, hence can only use the
matview when its up to date. Query B against the very same matview
might only need data that's up to date within some time threshold, and
query C might well want to just use whatever data exists. I'm not at
all clear that it's even feasible to solve all of these problems
inside the database; my suspicion is that at least some of these
things are going to end up being things that have to be handled at
least partially in user-space, while others will be handled in the DB
through mechanisms not yet designed.

I am even willing to go so far as to say that I am unconvinced that
everyone will want a just-truncated materialized view to be
automatically set to non-scannable. Suppose you have an unlogged view
that summarizes a real-time data stream - all SMTP traps received in
the last minute summarized by the node from which they were received.
When the server reboots unexpectedly, the raw data is lost along with
the summary. But for the scannability flag, we'd read the summary as
empty, which would be right. Adding the scannability flag forces the
user to add application logic to treat the error as equivalent to an
empty view.

The real concern I have about using an empty page to differentiate
whether or not the view is scannable is that I think it's a modularity
violation. But on the value of the underlying feature, my concern is
that there are an essentially infinite number of possible business
rules here that someone might want to implement, and I don't think
it's the job of a materialized view, either now or in the future, to
lock the user into any particular set of policy decisions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#35

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Kevin Grittner (#33)

Re: Remaining beta blockers

On 2013-04-30 08:35:32 -0700, Kevin Grittner wrote:

Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:

Andres Freund <andres@2ndquadrant.com> wrote:

2) Since we don't have a metapage to represent scannability in 9.3
    we cannot easily use one in 9.4+ without pg_upgrade emptying all
    matviews unless we only rely on the catalogs which we currently
    cannot.

I am not following this argument at all. Doesn't pg_upgrade use
pg_dump to create the tables and matviews WITH NO DATA and take
later action for ones which are populated in the source? It would
not be hard at all to move to a new release which used a different
technique for tracking populated tables by changing what pg_dump
does for populated tables with the switch pg_upgrade uses.

What I am thinking about is a 100GB materialized view which has been
filled in 9.3 and should now be pg_upgraded into 9.4. If we don't have a
metapage yet and we want to add one we would need to rewrite the whole
100GB which seems like a rather bad idea. Or how are you proposing to
add the metapage? You cannot add it to the end or somesuch.

Oh, you are suggesting prepending a metapage to existing matviews
(and tables?)? So to check whether a view has been populated you
not only look at the directory but open the file and read a page?
Now I follow why you think this would be an issue. I'm not sure I
think that is the best solution, though. In what way would it be
better than adding info to pg_class or some other system table?

You can write/read it without having a database opened. Like in the
startup process.

Then you can abstract away the knowledge about that into
PageIsMetapage(relation, blockno) or such which then can be used by
vacuum, heapam et al in an extensible fashion.

This is far from a fully working solution though - you still have issues
with BEGIN;REFRESH ...; ROLLBACK; if you do it naively. Afair that was
what made Tom protest against this.

Why would this be important for unlogged matviews but not unlogged
tables?

Unlogged tables basically have some very raw version of this - the _init
fork. But yes, a more generic version would have been nice.

I am not seeing this at all. Given my comment above about how this
works for pg_upgrade and pg_dump, can you explain how this is a
problem?

Well, that only works if there is a cheap way to add the new notation to
existing matview heaps which I don't see.

We could perhaps reserve some space in the special area of the
first page, if we can agree on a generous enough amount of space.

If we do it we should just use a whole page, no point in being to
cautious there imo.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#36

Josh Berkus

josh@agliodbs.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

Robert,

- The data can be used if the matview is fully up-to-date.
- The data can be used if the matview is not out of date by more than
a certain amount of time.
- The data can be used if the matview is out of date with respect to
one of its base tables, but not if it is out of date with respect to
another of its base tables. For example, maybe you're OK with using
an order-summary view if new orders have arrived (but not if the view
is more than a day out of date); but not if any customers have been
renamed.

We discussed this around the beginning of CF4. For some reason (which I
didn't quite understand at the time), the consensus was that creating a
"matview last updated" timestamp was not implementable for 9.3 and would
need to wait for 9.4.

Again, all of this points to additional columns, or at least relopts, in
pg_class. Why is that off the table?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Reply to msg id not found: WMd4f5ed96732743b16899b6d511c03b7ba9de703fe3dfea3ca3321420adf3528c75b30e78af0e5f8c09d66b8688b957c0@asav-2.01.com

#37

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Josh Berkus (#36)

Re: Remaining beta blockers

Josh Berkus <josh@agliodbs.com> wrote:

We discussed this around the beginning of CF4. For some reason
(which I didn't quite understand at the time), the consensus was
that creating a "matview last updated" timestamp was not
implementable for 9.3 and would need to wait for 9.4.

The reason was that the start of CF4 was deemed too late in the
development cycle to be trying to design what that should look
like. No sooner had you suggested that one column than someone
suggested two others which might also be useful, and it seemed to
me that it would be hard to get it right before we had a better
sense of what incremental maintenance based on the view declaration
would look like. So, as I recall it, the consensus developed
around the simplest possible test, which from the user perspective
was hidden behind a function, should be used for this release.

Again, all of this points to additional columns, or at least
relopts, in pg_class. Why is that off the table?

That was deemed to be incompatible with unlogged matviews, which
some didn't want to give up in this initial release.

Basically, what this patch aims at is more or less what some other
databases had in their initial releases of materialized views 10 to
20 years ago. Other products have built on those foundations with
each major release. I was hoping we could do the same. We are not
going to reach parity on this with any other major database product
in one release, or probably even two or three.

I didn't dig back through all the threads to confirm these
recollections, so take them with a grain of salt. The one thing I
am sure of is that they are a rehash of discussions which go back
to before the start of CF3. Only with more tone and urgency.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#38

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Andres Freund (#32)

Re: Remaining beta blockers

Andres Freund <andres@2ndquadrant.com> wrote:

Ah. Tom already fixed the problem in
5194024d72f33fb209e10f9ab0ada7cc67df45b7. I was working in a
branch based on 3ccae48f44d993351e1f881761bd6c556ebd6638 and it
failed there.

Since previous regression tests missed this bug, I've added the
test you posted, to make sure it doesn't come back unnoticed.

Thanks for the test case!

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#39

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Andres Freund (#32)

1 attachment(s)

Re: Remaining beta blockers

Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-30 07:33:05 -0700, Kevin Grittner wrote:

Andres Freund <andres@2ndquadrant.com> wrote:

1) vacuum can truncate the table to an empty relation already
if there is no data returned by the view's query which then
leads to the wrong scannability state.

   So we need to make vacuum skip cleaning out the last page.
   Once we get incrementally updated matviews there are more
   situations to get into this than just a query not returning
   anything.

Yeah, I posted a short patch earlier on this thread that fixes
that. Nobody has commented on it, and so far I have not
committed anything related to this without posting details and
giving ample opportunity for anyone to comment. If nobody
objects, I can push that, and this issue is gone.

Well, this bug is gone, but the multiple layer violations aren't.

Attached is a patch with regression test if we don't obviate the
need for it by tracking the populated status in a different way or
allowing unpopulated matviews to be used in queries. I'll hold off
on pushing it until we decide.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

matview-vacuum-trunc.patchtext/x-patch; name=matview-vacuum-trunc.patchDownload

*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***************
*** 230,236 **** lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
--- 230,242 ----
  	 *
  	 * Don't even think about it unless we have a shot at releasing a goodly
  	 * number of pages.  Otherwise, the time taken isn't worth it.
+ 	 *
+ 	 * Leave a populated materialized view with at least one page.
  	 */
+ 	if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
+ 		vacrelstats->nonempty_pages == 0)
+ 		vacrelstats->nonempty_pages = 1;
+ 
  	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
  	if (possibly_freeable > 0 &&
  		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 430,432 **** SELECT * FROM matview_unit_false;
--- 430,452 ----
  (0 rows)
  
  DROP MATERIALIZED VIEW matview_unit_false;
+ -- test that vacuum does not make empty matview look unpopulated
+ CREATE TABLE hoge (i int);
+ INSERT INTO hoge VALUES (generate_series(1,100000));
+ CREATE MATERIALIZED VIEW hogeview AS SELECT * FROM hoge WHERE i % 2 = 0;
+ CREATE INDEX hogeviewidx ON hogeview (i);
+ DELETE FROM hoge;
+ REFRESH MATERIALIZED VIEW hogeview;
+ SELECT * FROM hogeview WHERE i < 10;
+  i 
+ ---
+ (0 rows)
+ 
+ VACUUM ANALYZE;
+ SELECT * FROM hogeview WHERE i < 10;
+  i 
+ ---
+ (0 rows)
+ 
+ DROP TABLE hoge CASCADE;
+ NOTICE:  drop cascades to materialized view hogeview
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 136,138 **** SELECT * FROM matview_unit_false;
--- 136,150 ----
  REFRESH MATERIALIZED VIEW matview_unit_false;
  SELECT * FROM matview_unit_false;
  DROP MATERIALIZED VIEW matview_unit_false;
+ 
+ -- test that vacuum does not make empty matview look unpopulated
+ CREATE TABLE hoge (i int);
+ INSERT INTO hoge VALUES (generate_series(1,100000));
+ CREATE MATERIALIZED VIEW hogeview AS SELECT * FROM hoge WHERE i % 2 = 0;
+ CREATE INDEX hogeviewidx ON hogeview (i);
+ DELETE FROM hoge;
+ REFRESH MATERIALIZED VIEW hogeview;
+ SELECT * FROM hogeview WHERE i < 10;
+ VACUUM ANALYZE;
+ SELECT * FROM hogeview WHERE i < 10;
+ DROP TABLE hoge CASCADE;

#40

Josh Berkus

josh@agliodbs.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

Kevin,

The reason was that the start of CF4 was deemed too late in the
development cycle to be trying to design what that should look
like. No sooner had you suggested that one column than someone
suggested two others which might also be useful, and it seemed to

Yeah, I'm just pointing out that we *already had* this discussion, so
there isn't any point in having it again.

That was deemed to be incompatible with unlogged matviews, which
some didn't want to give up in this initial release.

Huh? Unlogged tables don't go in pg_class?

Basically, what this patch aims at is more or less what some other
databases had in their initial releases of materialized views 10 to
20 years ago. Other products have built on those foundations with
each major release. I was hoping we could do the same. We are not
going to reach parity on this with any other major database product
in one release, or probably even two or three.

Yep.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Import Notes

Reply to msg id not found: WM1a4e1bceef0be56c85e0fa43089515b88ee8acba1a4c0ac9cd975e61c3fa1805b67bc56ae671a4e194ffe34d5cc82588@asav-3.01.com

#41

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Josh Berkus (#40)

Re: Remaining beta blockers

Josh Berkus <josh@agliodbs.com> wrote:

That was deemed to be incompatible with unlogged matviews, which
some didn't want to give up in this initial release.

Huh? Unlogged tables don't go in pg_class?

Sorry if I wasn't clear. I try to do incremental development --
get one part working and then go on to the next. I had stored the
"populated" state in pg_class, although under what, in retrospect,
was a bad name -- I had a bool called relisvalid. It should have
been relispopulated or a bit in a flag byte. Anyway, as I
proceeded to implement the unlogged feature I discovered that the
heap is truncated at recovery time based on the init fork, before
the server is at the point where we can access the system tables.
So then I tried to keep the pg_class state but populate it when an
unlogged matview was opened, based on using the "special" space in
the first page of the init fork, updating pg_class if that was
found to indicate a truncated heap. I was roundly criticized for
"keeping this state in two places" -- both the heap and pg_class,
and urged to keep it only in the heap, because only keeping it in
pg_class would not allow the proper recovery for an unlogged
matview. So I responded to that feedback with the current
implementation.

Clearly we would need to change how we do recovery of unlogged
relations to both allow unlogged matviews and keep the populated
state in pg_class. I don't think we can really make the "two
places" technique work, where the recovery state of the unlogged
matview is used to trigger a pg_class change. For one thing, it
was really messy -- far more so than current code. For another
thing, the matview would show as good until it was first opened,
which was nasty.

Kinda verbose, but I hope that makes it more clear.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#42

Greg Stark

stark@mit.edu

over 12 years ago

In reply to: Kevin Grittner (#41)

Re: Remaining beta blockers

On Tue, Apr 30, 2013 at 10:19 PM, Kevin Grittner <kgrittn@ymail.com> wrote:

Clearly we would need to change how we do recovery of unlogged
relations to both allow unlogged matviews and keep the populated
state in pg_class. I don't think we can really make the "two
places" technique work, where the recovery state of the unlogged
matview is used to trigger a pg_class change. For one thing, it
was really messy -- far more so than current code. For another
thing, the matview would show as good until it was first opened,
which was nasty.

Can I ask what is the design goal of unlogged relations? Are they just
an optimization so you can load lots of data without generating piles
of wal log? Or are they intended to generate zero wal traffic so they
can be populated on a standby for example, or outside a transaction
entirely?

Because if it's the former then I don't see any problem generating wal
traffic to update the pg_class entry. If they can satisfy the latter
that gets a bit trickier.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#43

Simon Riggs

simon@2ndQuadrant.com

over 12 years ago

In reply to: Tom Lane (#1)

Re: Remaining beta blockers

On 27 April 2013 15:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

2. The checksum algorithm business. Again, we don't get to tinker with
that anymore once we're in beta.

Checksum changes to output value and control file are now complete and
we are ready to go to beta with it.

Robert has an outstanding comment on vismap handling, but thats not to
anything that will cause a re-initdb. There are no further tinkerings
planned, only bug fixes if and when they occur.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#44

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Greg Stark (#42)

Re: Remaining beta blockers

On Tue, Apr 30, 2013 at 9:23 PM, Greg Stark <stark@mit.edu> wrote:

Can I ask what is the design goal of unlogged relations? Are they just
an optimization so you can load lots of data without generating piles
of wal log? Or are they intended to generate zero wal traffic so they
can be populated on a standby for example, or outside a transaction
entirely?

Unlogged relations don't generate any WAL traffic on inserts, updates,
deletes, or tuple locks. So they're not just for bulk-loading;
indeed, for bulk-loading, you can often avoid WAL without an unlogged
relation, if you can arrange to create or truncate the table in the
same transaction that does the load. They cannot be populated on a
standby or outside a transaction because they still require XIDs. If
someone can solve the problem of allowing multiple XID spaces, though,
they might be usable on a standby, which would be pretty cool.

Because if it's the former then I don't see any problem generating wal
traffic to update the pg_class entry. If they can satisfy the latter
that gets a bit trickier.

The problem is that you need to update the pg_class entry in response
to the crash. The startup process has no ability to go hunt down the
pg_class record in the relevant database and update it - it only
understands pages, not tuples.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#45

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Robert Haas (#34)

Re: Remaining beta blockers

On Tue, Apr 30, 2013 at 12:02:59PM -0400, Robert Haas wrote:

On Tue, Apr 30, 2013 at 10:40 AM, Kevin Grittner <kgrittn@ymail.com> wrote:

What is a real problem or risk with using this mechanism until we
engineer something better? What problems with converting to a
later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so. What do you see as a problem?

pg_upgrade only handles changes in catalog state, not on-disk
representation. If the on-disk representation of an non-scannable
view might change in a future release, it's a pg_upgrade hazard.

Yes, pg_upgrade is never going to write to data pages as that would be
slow and prevent the ability to roll back to the previous cluster on
error.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#46

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Bruce Momjian (#45)

Re: Remaining beta blockers

Bruce Momjian <bruce@momjian.us> wrote:

Robert Haas wrote:

Kevin Grittner <kgrittn@ymail.com> wrote:

What is a real problem or risk with using this mechanism
until we engineer something better? What problems with
converting to a later major release does anyone see?

Well, it's a pg_upgrade hazard, if nothing else, isn't it?

I don't think so. What do you see as a problem?

pg_upgrade only handles changes in catalog state, not on-disk
representation. If the on-disk representation of an
non-scannable view might change in a future release, it's a
pg_upgrade hazard.

Yes, pg_upgrade is never going to write to data pages as that
would be slow and prevent the ability to roll back to the
previous cluster on error.

The only person who has suggested anything which would require that
is Andres, who suggests adding a metadata page to the front of the
heap to store information on whether the matview is populated. I
think it is the direct opposite of what Tom is suggesting, and has
too many issues to be considered at this time.

Nobody has proposed how the technique currently used creates a
pg_upgrade hazard now or in some future release where we provide a
way for recovery to put the information into the catalog. I have
gone into more detail on this earlier on this thread.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#47

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Kevin Grittner (#46)

Re: Remaining beta blockers

On Thu, May 2, 2013 at 02:20:15PM -0700, Kevin Grittner wrote:

Yes, pg_upgrade is never going to write to data pages as that
would be slow and prevent the ability to roll back to the
previous cluster on error.

The only person who has suggested anything which would require that
is Andres, who suggests adding a metadata page to the front of the
heap to store information on whether the matview is populated.ï¿½ I
think it is the direct opposite of what Tom is suggesting, and has
too many issues to be considered at this time.

Nobody has proposed how the technique currently used creates a
pg_upgrade hazard now or in some future release where we provide a
way for recovery to put the information into the catalog.ï¿½ I have
gone into more detail on this earlier on this thread.

I was more thinking of the idea of having some status on the first page
that might need to change in a future release.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#48

Greg Stark

stark@mit.edu

over 12 years ago

In reply to: Bruce Momjian (#47)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 1:12 AM, Bruce Momjian <bruce@momjian.us> wrote:

I was more thinking of the idea of having some status on the first page
that might need to change in a future release.

Incidentally, another option might be to have a <relfilenode>.meta
fork that has information like this. It doesn't fundamentally change
anything but it does mean that code that doesn't need to know about it
doesn't need to know to skip the first page. It also means we could
maybe expand it more easily. There have been previous wishlist items
to have some meta information so external tools can more easily parse
the data without needing access to the full catalog for example.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#49

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Greg Stark (#48)

Re: Remaining beta blockers

On 2013-05-03 16:11:13 +0100, Greg Stark wrote:

On Fri, May 3, 2013 at 1:12 AM, Bruce Momjian <bruce@momjian.us> wrote:

I was more thinking of the idea of having some status on the first page
that might need to change in a future release.

Incidentally, another option might be to have a <relfilenode>.meta
fork that has information like this. It doesn't fundamentally change
anything but it does mean that code that doesn't need to know about it
doesn't need to know to skip the first page. It also means we could
maybe expand it more easily. There have been previous wishlist items
to have some meta information so external tools can more easily parse
the data without needing access to the full catalog for example.

The problem with an extra metadata fork is that it essentially would
double the files in a cluster and it would also noticeably increase the
amount of open files we need.
There have been quite some complaints about CREATE DATABASE speed, I
am not sure we want to make it even slower :(

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#50

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Andres Freund (#49)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 05:24:54PM +0200, Andres Freund wrote:

On 2013-05-03 16:11:13 +0100, Greg Stark wrote:

On Fri, May 3, 2013 at 1:12 AM, Bruce Momjian <bruce@momjian.us> wrote:

I was more thinking of the idea of having some status on the first page
that might need to change in a future release.

Incidentally, another option might be to have a <relfilenode>.meta
fork that has information like this. It doesn't fundamentally change
anything but it does mean that code that doesn't need to know about it
doesn't need to know to skip the first page. It also means we could
maybe expand it more easily. There have been previous wishlist items
to have some meta information so external tools can more easily parse
the data without needing access to the full catalog for example.

The problem with an extra metadata fork is that it essentially would
double the files in a cluster and it would also noticeably increase the
amount of open files we need.
There have been quite some complaints about CREATE DATABASE speed, I
am not sure we want to make it even slower :(

Agreed. We start to get into file system performance issues at that
point. I have often wondered if we need to create hash the files into
subdirectories for databases with many tables. Has anyone profiled
this?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#51

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Andres Freund (#49)

Re: Remaining beta blockers

* Andres Freund (andres@2ndquadrant.com) wrote:

The problem with an extra metadata fork is that it essentially would
double the files in a cluster and it would also noticeably increase the
amount of open files we need.

Why would we need it for every relation? We have other forks (fsm, vm),
but they exist when needed.

I'm more concerned about moving information which really should be in
the system catalogs out into magic files on disk..

Thanks,

Stephen

#52

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Stephen Frost (#51)

Re: Remaining beta blockers

Stephen Frost <sfrost@snowman.net> writes:

I'm more concerned about moving information which really should be in
the system catalogs out into magic files on disk..

Right. The whole thing is just a kluge, which I'm convinced we'll
regret sooner or later --- probably sooner. I would much rather drop
unlogged matviews for now and put scannability status into pg_class
where it belongs.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#53

Alvaro Herrera

alvherre@2ndquadrant.com

over 12 years ago

In reply to: Bruce Momjian (#50)

Re: Remaining beta blockers

Bruce Momjian escribió:

On Fri, May 3, 2013 at 05:24:54PM +0200, Andres Freund wrote:

The problem with an extra metadata fork is that it essentially would
double the files in a cluster and it would also noticeably increase the
amount of open files we need.
There have been quite some complaints about CREATE DATABASE speed, I
am not sure we want to make it even slower :(

Agreed. We start to get into file system performance issues at that
point. I have often wondered if we need to create hash the files into
subdirectories for databases with many tables. Has anyone profiled
this?

Modern filesystems use trees to store file nowadays, not linear arrays,
and so that old scenario (long time to find one specific directory entry
within a directory containing lots of files) is no longer that serious a
problem.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#54

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Alvaro Herrera (#53)

Re: Remaining beta blockers

On 2013-05-03 12:15:27 -0400, Alvaro Herrera wrote:

Bruce Momjian escribió:

On Fri, May 3, 2013 at 05:24:54PM +0200, Andres Freund wrote:

The problem with an extra metadata fork is that it essentially would
double the files in a cluster and it would also noticeably increase the
amount of open files we need.
There have been quite some complaints about CREATE DATABASE speed, I
am not sure we want to make it even slower :(

Agreed. We start to get into file system performance issues at that
point. I have often wondered if we need to create hash the files into
subdirectories for databases with many tables. Has anyone profiled
this?

Modern filesystems use trees to store file nowadays, not linear arrays,
and so that old scenario (long time to find one specific directory entry
within a directory containing lots of files) is no longer that serious a
problem.

Absolutely. Also, we normally shouldn't open/close files constantly
since we cache open files so open(2) performance shouldn't be *that*
critical.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#55

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Tom Lane (#52)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 12:10:14PM -0400, Tom Lane wrote:

Stephen Frost <sfrost@snowman.net> writes:

I'm more concerned about moving information which really should be in
the system catalogs out into magic files on disk..

Right. The whole thing is just a kluge, which I'm convinced we'll
regret sooner or later --- probably sooner. I would much rather drop
unlogged matviews for now and put scannability status into pg_class
where it belongs.

Right. I think the big question is whether we can add crash recovery
system catalog writes in the time before beta, and whether we would need
that even if we remove unlogged table support.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#56

Andres Freund

andres@2ndquadrant.com

over 12 years ago

In reply to: Tom Lane (#52)

Re: Remaining beta blockers

On 2013-05-03 12:10:14 -0400, Tom Lane wrote:

Stephen Frost <sfrost@snowman.net> writes:

I'm more concerned about moving information which really should be in
the system catalogs out into magic files on disk..

Right. The whole thing is just a kluge, which I'm convinced we'll
regret sooner or later --- probably sooner.

I tentatively agree as well. The only argument for introducing some
additional location for such information is that it would be the start
of an infrastructure for information we would need for incrementally
adding checksums, page upgrades and such.

I would much rather drop
unlogged matviews for now and put scannability status into pg_class
where it belongs.

I have no problem with that. And getting to the point were we can switch
databases in a backend in at least a restricted environment is a good
thing.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#57

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Andres Freund (#56)

Re: Remaining beta blockers

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-05-03 12:10:14 -0400, Tom Lane wrote:

Right. The whole thing is just a kluge, which I'm convinced we'll
regret sooner or later --- probably sooner.

I tentatively agree as well. The only argument for introducing some
additional location for such information is that it would be the start
of an infrastructure for information we would need for incrementally
adding checksums, page upgrades and such.

It's possible that a metadata fork would be a good design for such
stuff, but I'd want to see a pretty completely worked-out design before
committing to the idea. In any case we're way too late in the 9.3 cycle
to be considering something like that right now.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#58

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Tom Lane (#57)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 12:45:36PM -0400, Tom Lane wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-05-03 12:10:14 -0400, Tom Lane wrote:

Right. The whole thing is just a kluge, which I'm convinced we'll
regret sooner or later --- probably sooner.

I tentatively agree as well. The only argument for introducing some
additional location for such information is that it would be the start
of an infrastructure for information we would need for incrementally
adding checksums, page upgrades and such.

It's possible that a metadata fork would be a good design for such
stuff, but I'd want to see a pretty completely worked-out design before
committing to the idea. In any case we're way too late in the 9.3 cycle
to be considering something like that right now.

Yes, I think the big question is how much information do we want per
relation that we don't need in the system tables.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#59

Greg Stark

stark@mit.edu

over 12 years ago

In reply to: Bruce Momjian (#58)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 5:49 PM, Bruce Momjian <bruce@momjian.us> wrote:

Yes, I think the big question is how much information do we want per
relation that we don't need in the system tables.

It's not that we don't need it in the system tables. It's that there's
some state that we *can't* have in the system tables because we need
it to be accessible without access to the catalog or we need to be
able to change it on a standby.

But note that this all sounds very similar to the global temp table
discussion a while ago. I think we're gong to need some infrastructure
for table state that isn't transactional and it will make sense to
solve that with something general that we can then depend on for lots
of things. If I had to guess it would look more like a cached copy of
the pg_class row or the whole relcache entry rather than an entirely
independent structure.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#60

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Greg Stark (#59)

Re: Remaining beta blockers

On Sat, May 4, 2013 at 03:04:33AM +0100, Greg Stark wrote:

On Fri, May 3, 2013 at 5:49 PM, Bruce Momjian <bruce@momjian.us> wrote:

Yes, I think the big question is how much information do we want per
relation that we don't need in the system tables.

It's not that we don't need it in the system tables. It's that there's
some state that we *can't* have in the system tables because we need
it to be accessible without access to the catalog or we need to be
able to change it on a standby.

But note that this all sounds very similar to the global temp table
discussion a while ago. I think we're gong to need some infrastructure
for table state that isn't transactional and it will make sense to
solve that with something general that we can then depend on for lots
of things. If I had to guess it would look more like a cached copy of
the pg_class row or the whole relcache entry rather than an entirely
independent structure.

Well, the big question is whether this state _eventually_ will need to
be accessible from SQL, so we might as well add code to allow crash
recovery to write to system tables.

Things like how fresh the materialized view is certainly should be
accessible via SQL and transactional.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#61

Bruce Momjian

bruce@momjian.us

over 12 years ago

In reply to: Bruce Momjian (#60)

Re: Remaining beta blockers

On Fri, May 3, 2013 at 10:19:42PM -0400, Bruce Momjian wrote:

On Sat, May 4, 2013 at 03:04:33AM +0100, Greg Stark wrote:

On Fri, May 3, 2013 at 5:49 PM, Bruce Momjian <bruce@momjian.us> wrote:

Yes, I think the big question is how much information do we want per
relation that we don't need in the system tables.

It's not that we don't need it in the system tables. It's that there's
some state that we *can't* have in the system tables because we need
it to be accessible without access to the catalog or we need to be
able to change it on a standby.

But note that this all sounds very similar to the global temp table
discussion a while ago. I think we're gong to need some infrastructure
for table state that isn't transactional and it will make sense to
solve that with something general that we can then depend on for lots
of things. If I had to guess it would look more like a cached copy of
the pg_class row or the whole relcache entry rather than an entirely
independent structure.

Well, the big question is whether this state _eventually_ will need to
be accessible from SQL, so we might as well add code to allow crash
recovery to write to system tables.

Things like how fresh the materialized view is certainly should be
accessible via SQL and transactional.

OK, how are we for bata packaging on Monday? I don't see how we can do
that until we decide on how to handle unlogged materialized views.

Can someone summarize what we have considered for a meta-page as the
first page in the past? Have they always been things we couldn't
recording in the catalogs, or things we didn't want in the catalogs,
e.g. visibility/hint bits?

If we would eventually want the materialized information in the system
catalogs rather than on the first heap page, or recorded as the size of
the help file, I suggest we just remove anything that depends on it and
move to beta.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#62

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Bruce Momjian (#61)

2 attachment(s)

Re: Remaining beta blockers

Bruce Momjian <bruce@momjian.us> writes:

OK, how are we for bata packaging on Monday? I don't see how we can do
that until we decide on how to handle unlogged materialized views.

In the interests of moving the discussion along, attached are draft
patches to show what I think we should do in 9.3. The first patch
disables the unlogged-matviews feature at the user level (and perhaps
could be reverted someday), while the second one moves
scannability-state storage into a pg_class column.

I had previously proposed that we keep scannability state in reloptions,
but that turns out not to be terribly easy because the relcache doesn't
store the reloptions part of the pg_class row; so this patch does it
with a new boolean column instead.

regards, tom lane

Attachments:

disallow-unlogged.patchtext/x-patch; charset=us-asciiDownload

diff --git a/doc/src/sgml/ref/create_materialized_view.sgml b/doc/src/sgml/ref/create_materialized_view.sgml
index a7e4e210eeb0ef20bb7ea046f5c72a36cc4ad880..0ed764b353394137afd73c9a013e031e197ae79c 100644
*** a/doc/src/sgml/ref/create_materialized_view.sgml
--- b/doc/src/sgml/ref/create_materialized_view.sgml
*************** PostgreSQL documentation
*** 21,27 ****
  
   <refsynopsisdiv>
  <synopsis>
! CREATE [ UNLOGGED ] MATERIALIZED VIEW <replaceable>table_name</replaceable>
      [ (<replaceable>column_name</replaceable> [, ...] ) ]
      [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
      [ TABLESPACE <replaceable class="PARAMETER">tablespace_name</replaceable> ]
--- 21,27 ----
  
   <refsynopsisdiv>
  <synopsis>
! CREATE MATERIALIZED VIEW <replaceable>table_name</replaceable>
      [ (<replaceable>column_name</replaceable> [, ...] ) ]
      [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] ) ]
      [ TABLESPACE <replaceable class="PARAMETER">tablespace_name</replaceable> ]
*************** CREATE [ UNLOGGED ] MATERIALIZED VIEW <r
*** 55,70 ****
  
    <variablelist>
     <varlistentry>
-     <term><literal>UNLOGGED</></term>
-     <listitem>
-      <para>
-       If specified, the materialized view will be unlogged.
-       Refer to <xref linkend="sql-createtable"> for details.
-      </para>
-     </listitem>
-    </varlistentry>
- 
-    <varlistentry>
      <term><replaceable>table_name</replaceable></term>
      <listitem>
       <para>
--- 55,60 ----
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index fb28e471685c7462e8076556a78ef45b561f3d2e..53ff4ed9c1eed923219ee3f0bfc1eff6ccdcf202 100644
*** a/src/backend/parser/analyze.c
--- b/src/backend/parser/analyze.c
*************** transformCreateTableAsStmt(ParseState *p
*** 2167,2172 ****
--- 2167,2184 ----
  					 errmsg("materialized views may not be defined using bound parameters")));
  
  		/*
+ 		 * For now, we disallow unlogged materialized views, because it
+ 		 * seems like a bad idea for them to just go to empty after a crash.
+ 		 * (If we could make them go into unscannable state, that would be
+ 		 * better, but that requires catalog changes which crash recovery
+ 		 * can't presently do.)
+ 		 */
+ 		if (stmt->into->rel->relpersistence == RELPERSISTENCE_UNLOGGED)
+ 			ereport(ERROR,
+ 					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ 					 errmsg("materialized views cannot be UNLOGGED")));
+ 
+ 		/*
  		 * At runtime, we'll need a copy of the parsed-but-not-rewritten Query
  		 * for purposes of creating the view's ON SELECT rule.  We stash that
  		 * in the IntoClause because that's where intorel_startup() can
diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out
index e87775679ac9eb059af776e6e4c20e3bed7360da..06bb2551a83480361ddfce5ad02b76240d9eaf63 100644
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
*************** SELECT * FROM tvvm;
*** 279,337 ****
  (1 row)
  
  -- test diemv when the mv does not exist
! DROP MATERIALIZED VIEW IF EXISTS tum;
! NOTICE:  materialized view "tum" does not exist, skipping
! -- make sure that an unlogged materialized view works (in the absence of a crash)
! CREATE UNLOGGED MATERIALIZED VIEW tum AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
!  pg_relation_is_scannable 
! --------------------------
!  f
! (1 row)
! 
! SELECT * FROM tum;
! ERROR:  materialized view "tum" has not been populated
! HINT:  Use the REFRESH MATERIALIZED VIEW command.
! REFRESH MATERIALIZED VIEW tum;
! SELECT pg_relation_is_scannable('tum'::regclass);
!  pg_relation_is_scannable 
! --------------------------
!  t
! (1 row)
! 
! SELECT * FROM tum;
!  type | totamt 
! ------+--------
!  y    |     12
!  z    |     24
!  x    |      5
! (3 rows)
! 
! REFRESH MATERIALIZED VIEW tum WITH NO DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
!  pg_relation_is_scannable 
! --------------------------
!  f
! (1 row)
! 
! SELECT * FROM tum;
! ERROR:  materialized view "tum" has not been populated
! HINT:  Use the REFRESH MATERIALIZED VIEW command.
! REFRESH MATERIALIZED VIEW tum WITH DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
!  pg_relation_is_scannable 
! --------------------------
!  t
! (1 row)
! 
! SELECT * FROM tum;
!  type | totamt 
! ------+--------
!  y    |     12
!  z    |     24
!  x    |      5
! (3 rows)
! 
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
--- 279,286 ----
  (1 row)
  
  -- test diemv when the mv does not exist
! DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
! NOTICE:  materialized view "no_such_mv" does not exist, skipping
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
   type | mtot | vtot 
*************** SELECT type, m.totamt AS mtot, v.totamt 
*** 341,348 ****
   z    |   24 |   24
  (3 rows)
  
- -- test diemv when the mv does exist
- DROP MATERIALIZED VIEW IF EXISTS tum;
  -- make sure that dependencies are reported properly when they block the drop
  DROP TABLE t;
  ERROR:  cannot drop table t because other objects depend on it
--- 290,295 ----
diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql
index 9fbaafac6d0ae609e80f777754d6f71189cc4af0..09a7378133c86d66755dafd1629719a712badfdf 100644
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
*************** SELECT * FROM tvmm;
*** 87,114 ****
  SELECT * FROM tvvm;
  
  -- test diemv when the mv does not exist
! DROP MATERIALIZED VIEW IF EXISTS tum;
! 
! -- make sure that an unlogged materialized view works (in the absence of a crash)
! CREATE UNLOGGED MATERIALIZED VIEW tum AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
! SELECT * FROM tum;
! REFRESH MATERIALIZED VIEW tum;
! SELECT pg_relation_is_scannable('tum'::regclass);
! SELECT * FROM tum;
! REFRESH MATERIALIZED VIEW tum WITH NO DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
! SELECT * FROM tum;
! REFRESH MATERIALIZED VIEW tum WITH DATA;
! SELECT pg_relation_is_scannable('tum'::regclass);
! SELECT * FROM tum;
  
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
- -- test diemv when the mv does exist
- DROP MATERIALIZED VIEW IF EXISTS tum;
- 
  -- make sure that dependencies are reported properly when they block the drop
  DROP TABLE t;
  
--- 87,97 ----
  SELECT * FROM tvvm;
  
  -- test diemv when the mv does not exist
! DROP MATERIALIZED VIEW IF EXISTS no_such_mv;
  
  -- test join of mv and view
  SELECT type, m.totamt AS mtot, v.totamt AS vtot FROM tm m LEFT JOIN tv v USING (type) ORDER BY type;
  
  -- make sure that dependencies are reported properly when they block the drop
  DROP TABLE t;

move-scannability-state.patchtext/x-patch; charset=us-asciiDownload

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 6c0ef5bd195bac55e1b3b799d49e60ccda83738a..f6df8a8cedddddbdfefca46a5f47eb2868450c13 100644
*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 1864,1869 ****
--- 1864,1877 ----
       </row>
  
       <row>
+       <entry><structfield>relisscannable</structfield></entry>
+       <entry><type>bool</type></entry>
+       <entry></entry>
+       <entry>True if okay to read contents of relation (this is true for all
+       relations other than some materialized views)</entry>
+      </row>
+ 
+      <row>
        <entry><structfield>relfrozenxid</structfield></entry>
        <entry><type>xid</type></entry>
        <entry></entry>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3ae2f23390f05a1c4e1797804813cbc6fa1d84d6..af00527fde3c99d4164f124667c241f91eea34f0 100644
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*************** SELECT pg_type_is_visible('myschema.widg
*** 14239,14248 ****
     </indexterm>
  
     <indexterm>
-     <primary>pg_relation_is_scannable</primary>
-    </indexterm>
- 
-    <indexterm>
      <primary>pg_typeof</primary>
     </indexterm>
  
--- 14239,14244 ----
*************** SELECT pg_type_is_visible('myschema.widg
*** 14411,14421 ****
         <entry>get the path in the file system that this tablespace is located in</entry>
        </row>
        <row>
-        <entry><literal><function>pg_relation_is_scannable(<parameter>relation_oid</parameter>)</function></literal></entry>
-        <entry><type>boolean</type></entry>
-        <entry>is the relation scannable; a materialized view which has not been loaded will not be scannable</entry>
-       </row>
-       <row>
         <entry><literal><function>pg_typeof(<parameter>any</parameter>)</function></literal></entry>
         <entry><type>regtype</type></entry>
         <entry>get the data type of any value</entry>
--- 14407,14412 ----
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 0b4c659ff0751a48e69c4b7886a7ba38804eb46a..131e4c601694c22ad7981d85870447eaff9d5013 100644
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
*************** InsertPgClassTuple(Relation pg_class_des
*** 780,785 ****
--- 780,786 ----
  	values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules);
  	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
  	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
+ 	values[Anum_pg_class_relisscannable - 1] = BoolGetDatum(rd_rel->relisscannable);
  	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
  	values[Anum_pg_class_relminmxid - 1] = MultiXactIdGetDatum(rd_rel->relminmxid);
  	if (relacl != (Datum) 0)
*************** heap_create_init_fork(Relation rel)
*** 1346,1371 ****
  }
  
  /*
-  * Check whether a materialized view is in an initial, unloaded state.
-  *
-  * The check here must match what is set up in heap_create_init_fork().
-  * Currently the init fork is an empty file.  A missing heap is also
-  * considered to be unloaded.
-  */
- bool
- heap_is_matview_init_state(Relation rel)
- {
- 	Assert(rel->rd_rel->relkind == RELKIND_MATVIEW);
- 
- 	RelationOpenSmgr(rel);
- 
- 	if (!smgrexists(rel->rd_smgr, MAIN_FORKNUM))
- 		return true;
- 
- 	return (smgrnblocks(rel->rd_smgr, MAIN_FORKNUM) < 1);
- }
- 
- /*
   *		RelationRemoveInheritance
   *
   * Formerly, this routine checked for child relations and aborted the
--- 1347,1352 ----
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 57adbf607de1a866727e905c8387c7a4618b4237..5dbc6632ad59a52af0a89163af1cb42b0cec40fc 100644
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
*************** CREATE VIEW pg_matviews AS
*** 101,107 ****
          pg_get_userbyid(C.relowner) AS matviewowner,
          T.spcname AS tablespace,
          C.relhasindex AS hasindexes,
!         pg_relation_is_scannable(C.oid) AS isscannable,
          pg_get_viewdef(C.oid) AS definition
      FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
           LEFT JOIN pg_tablespace T ON (T.oid = C.reltablespace)
--- 101,107 ----
          pg_get_userbyid(C.relowner) AS matviewowner,
          T.spcname AS tablespace,
          C.relhasindex AS hasindexes,
!         C.relisscannable AS isscannable,
          pg_get_viewdef(C.oid) AS definition
      FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
           LEFT JOIN pg_tablespace T ON (T.oid = C.reltablespace)
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index ed62246cc52db0f00dd66b4139c21ef9f7564acf..878b6254f540e613d125823269e1e23f7b99d002 100644
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 30,36 ****
  #include "catalog/objectaccess.h"
  #include "catalog/toasting.h"
  #include "commands/cluster.h"
- #include "commands/matview.h"
  #include "commands/tablecmds.h"
  #include "commands/vacuum.h"
  #include "miscadmin.h"
--- 30,35 ----
*************** cluster_rel(Oid tableOid, Oid indexOid, 
*** 388,394 ****
  	 * database.
  	 */
  	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW &&
! 		!OldHeap->rd_ispopulated)
  	{
  		relation_close(OldHeap, AccessExclusiveLock);
  		return;
--- 387,393 ----
  	 * database.
  	 */
  	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW &&
! 		!RelationIsPopulated(OldHeap))
  	{
  		relation_close(OldHeap, AccessExclusiveLock);
  		return;
*************** copy_heap_data(Oid OIDNewHeap, Oid OIDOl
*** 922,931 ****
  						get_namespace_name(RelationGetNamespace(OldHeap)),
  						RelationGetRelationName(OldHeap))));
  
- 	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW)
- 		/* Make sure the heap looks good even if no rows are written. */
- 		SetMatViewToPopulated(NewHeap);
- 
  	/*
  	 * Scan through the OldHeap, either in OldIndex order or sequentially;
  	 * copy each tuple into the NewHeap, or transiently to the tuplesort
--- 921,926 ----
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index de65c4c78174d6c7daa72d49367dfc2bc789df61..14973f8e7c46eb95620a0af49a15b8db504dbd72 100644
*** a/src/backend/commands/createas.c
--- b/src/backend/commands/createas.c
*************** intorel_startup(DestReceiver *self, int 
*** 359,368 ****
  	 */
  	intoRelationDesc = heap_open(intoRelationId, AccessExclusiveLock);
  
- 	if (is_matview && !into->skipData)
- 		/* Make sure the heap looks good even if no rows are written. */
- 		SetMatViewToPopulated(intoRelationDesc);
- 
  	/*
  	 * Check INSERT permission on the constructed table.
  	 *
--- 359,364 ----
*************** intorel_startup(DestReceiver *self, int 
*** 382,387 ****
--- 378,390 ----
  	ExecCheckRTPerms(list_make1(rte), true);
  
  	/*
+ 	 * Tentatively mark the target as populated, if it's a matview and we're
+ 	 * going to fill it; otherwise, no change needed.
+ 	 */
+ 	if (is_matview && !into->skipData)
+ 		SetMatViewPopulatedState(intoRelationDesc, true);
+ 
+ 	/*
  	 * Fill private fields of myState for use by later routines
  	 */
  	myState->rel = intoRelationDesc;
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index da373045cc02fd62cade8b5b222e5de7627f3705..390aba112563f3ad5f4663b0aa6a9dc2f2b3a94e 100644
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 31,36 ****
--- 31,37 ----
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
  #include "utils/snapmgr.h"
+ #include "utils/syscache.h"
  
  
  typedef struct
*************** static void refresh_matview_datafill(Des
*** 52,89 ****
  									 const char *queryString);
  
  /*
!  * SetMatViewToPopulated
!  *		Indicate that the materialized view has been populated by its query.
!  *
!  * NOTE: The heap starts out in a state that doesn't look scannable, and can
!  * only transition from there to scannable at the time a new heap is created.
   *
   * NOTE: caller must be holding an appropriate lock on the relation.
   */
  void
! SetMatViewToPopulated(Relation relation)
  {
! 	Page        page;
  
  	Assert(relation->rd_rel->relkind == RELKIND_MATVIEW);
- 	Assert(relation->rd_ispopulated == false);
  
! 	page = (Page) palloc(BLCKSZ);
! 	PageInit(page, BLCKSZ, 0);
! 
! 	if (RelationNeedsWAL(relation))
! 		log_newpage(&(relation->rd_node), MAIN_FORKNUM, 0, page);
  
! 	RelationOpenSmgr(relation);
  
! 	PageSetChecksumInplace(page, 0);
! 	smgrextend(relation->rd_smgr, MAIN_FORKNUM, 0, (char *) page, true);
  
! 	pfree(page);
  
! 	smgrimmedsync(relation->rd_smgr, MAIN_FORKNUM);
  
! 	RelationCacheInvalidateEntry(relation->rd_id);
  }
  
  /*
--- 53,97 ----
  									 const char *queryString);
  
  /*
!  * SetMatViewPopulatedState
!  *		Mark a materialized view as populated, or not.
   *
   * NOTE: caller must be holding an appropriate lock on the relation.
   */
  void
! SetMatViewPopulatedState(Relation relation, bool newstate)
  {
! 	Relation	pgrel;
! 	HeapTuple	tuple;
  
  	Assert(relation->rd_rel->relkind == RELKIND_MATVIEW);
  
! 	/*
! 	 * Update relation's pg_class entry.  Crucial side-effect: other backends
! 	 * (and this one too!) are sent SI message to make them rebuild relcache
! 	 * entries.
! 	 */
! 	pgrel = heap_open(RelationRelationId, RowExclusiveLock);
! 	tuple = SearchSysCacheCopy1(RELOID,
! 								ObjectIdGetDatum(RelationGetRelid(relation)));
! 	if (!HeapTupleIsValid(tuple))
! 		elog(ERROR, "cache lookup failed for relation %u",
! 			 RelationGetRelid(relation));
  
! 	((Form_pg_class) GETSTRUCT(tuple))->relisscannable = newstate;
  
! 	simple_heap_update(pgrel, &tuple->t_self, tuple);
  
! 	CatalogUpdateIndexes(pgrel, tuple);
  
! 	heap_freetuple(tuple);
! 	heap_close(pgrel, RowExclusiveLock);
  
! 	/*
! 	 * Advance command counter to make the updated pg_class row locally
! 	 * visible.
! 	 */
! 	CommandCounterIncrement();
  }
  
  /*
*************** SetMatViewToPopulated(Relation relation)
*** 97,110 ****
   * If WITH NO DATA was specified, this is effectively like a TRUNCATE;
   * otherwise it is like a TRUNCATE followed by an INSERT using the SELECT
   * statement associated with the materialized view.  The statement node's
!  * skipData field is used to indicate that the clause was used.
   *
   * Indexes are rebuilt too, via REINDEX. Since we are effectively bulk-loading
   * the new heap, it's better to create the indexes afterwards than to fill them
   * incrementally while we load.
   *
!  * The scannable state is changed based on whether the contents reflect the
!  * result set of the materialized view's query.
   */
  void
  ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
--- 105,118 ----
   * If WITH NO DATA was specified, this is effectively like a TRUNCATE;
   * otherwise it is like a TRUNCATE followed by an INSERT using the SELECT
   * statement associated with the materialized view.  The statement node's
!  * skipData field shows whether the clause was used.
   *
   * Indexes are rebuilt too, via REINDEX. Since we are effectively bulk-loading
   * the new heap, it's better to create the indexes afterwards than to fill them
   * incrementally while we load.
   *
!  * The matview's "populated" state is changed based on whether the contents
!  * reflect the result set of the materialized view's query.
   */
  void
  ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
*************** ExecRefreshMatView(RefreshMatViewStmt *s
*** 184,189 ****
--- 192,203 ----
  	 */
  	CheckTableNotInUse(matviewRel, "REFRESH MATERIALIZED VIEW");
  
+ 	/*
+ 	 * Tentatively mark the matview as populated or not (this will roll back
+ 	 * if we fail later).
+ 	 */
+ 	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
+ 
  	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
*************** ExecRefreshMatView(RefreshMatViewStmt *s
*** 192,197 ****
--- 206,212 ----
  	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
+ 	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
*************** transientrel_startup(DestReceiver *self,
*** 300,307 ****
  		myState->hi_options |= HEAP_INSERT_SKIP_WAL;
  	myState->bistate = GetBulkInsertState();
  
- 	SetMatViewToPopulated(transientrel);
- 
  	/* Not using WAL requires smgr_targblock be initially invalid */
  	Assert(RelationGetTargetBlock(transientrel) == InvalidBlockNumber);
  }
--- 315,320 ----
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
index 02f3cf3c205483678e1637ac612134c99fd786d6..9d304153b8bee4a4cd02701dc70173cdc06a60e1 100644
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*************** lazy_vacuum_rel(Relation onerel, VacuumS
*** 230,242 ****
  	 *
  	 * Don't even think about it unless we have a shot at releasing a goodly
  	 * number of pages.  Otherwise, the time taken isn't worth it.
- 	 *
- 	 * Leave a populated materialized view with at least one page.
  	 */
- 	if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
- 		vacrelstats->nonempty_pages == 0)
- 		vacrelstats->nonempty_pages = 1;
- 
  	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
  	if (possibly_freeable > 0 &&
  		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
--- 230,236 ----
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index d32d9014c7ef5f16ee45e0cd8df2ab538e383fcf..4c4e1ed82022553dc9f785b8827eb08c6a9c7a20 100644
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
*************** pg_relation_filepath(PG_FUNCTION_ARGS)
*** 834,863 ****
  
  	PG_RETURN_TEXT_P(cstring_to_text(path));
  }
- 
- 
- /*
-  * Indicate whether a relation is scannable.
-  *
-  * Currently, this is always true except for a materialized view which has not
-  * been populated.  It is expected that other conditions for allowing a
-  * materialized view to be scanned will be added in later releases.
-  */
- Datum
- pg_relation_is_scannable(PG_FUNCTION_ARGS)
- {
- 	Oid			relid;
- 	Relation	relation;
- 	bool		result;
- 
- 	relid = PG_GETARG_OID(0);
- 	relation = try_relation_open(relid, AccessShareLock);
- 
- 	if (relation == NULL)
- 		PG_RETURN_BOOL(false);
- 
- 	result = RelationIsScannable(relation);
- 
- 	relation_close(relation, AccessShareLock);
- 	PG_RETURN_BOOL(result);
- }
--- 834,836 ----
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 670fa8c1667e53024b520f2cc463c3a24d488994..482882bc197eff72974e35018aa955a00d1acaca 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***************
*** 37,43 ****
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
- #include "catalog/heap.h"
  #include "catalog/index.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
--- 37,42 ----
*************** RelationBuildDesc(Oid targetRelId, bool 
*** 956,967 ****
  	/* make sure relation is marked as having no open file yet */
  	relation->rd_smgr = NULL;
  
- 	if (relation->rd_rel->relkind == RELKIND_MATVIEW &&
- 		heap_is_matview_init_state(relation))
- 		relation->rd_ispopulated = false;
- 	else
- 		relation->rd_ispopulated = true;
- 
  	/*
  	 * now we can free the memory allocated for pg_class_tuple
  	 */
--- 955,960 ----
*************** formrdesc(const char *relationName, Oid 
*** 1459,1464 ****
--- 1452,1460 ----
  	/* formrdesc is used only for permanent relations */
  	relation->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
  
+ 	/* ... and they're always scannable, too */
+ 	relation->rd_rel->relisscannable = true;
+ 
  	relation->rd_rel->relpages = 0;
  	relation->rd_rel->reltuples = 0;
  	relation->rd_rel->relallvisible = 0;
*************** formrdesc(const char *relationName, Oid 
*** 1531,1537 ****
  	 * initialize physical addressing information for the relation
  	 */
  	RelationInitPhysicalAddr(relation);
- 	relation->rd_ispopulated = true;
  
  	/*
  	 * initialize the rel-has-index flag, using hardwired knowledge
--- 1527,1532 ----
*************** RelationReloadIndexInfo(Relation relatio
*** 1756,1762 ****
  	heap_freetuple(pg_class_tuple);
  	/* We must recalculate physical address in case it changed */
  	RelationInitPhysicalAddr(relation);
- 	relation->rd_ispopulated = true;
  
  	/*
  	 * For a non-system index, there are fields of the pg_index row that are
--- 1751,1756 ----
*************** RelationClearRelation(Relation relation,
*** 1905,1915 ****
  	if (relation->rd_isnailed)
  	{
  		RelationInitPhysicalAddr(relation);
- 		if (relation->rd_rel->relkind == RELKIND_MATVIEW &&
- 			heap_is_matview_init_state(relation))
- 			relation->rd_ispopulated = false;
- 		else
- 			relation->rd_ispopulated = true;
  
  		if (relation->rd_rel->relkind == RELKIND_INDEX)
  		{
--- 1899,1904 ----
*************** RelationBuildLocalRelation(const char *r
*** 2671,2676 ****
--- 2660,2671 ----
  			break;
  	}
  
+ 	/* if it's a materialized view, it's not scannable initially */
+ 	if (relkind == RELKIND_MATVIEW)
+ 		rel->rd_rel->relisscannable = false;
+ 	else
+ 		rel->rd_rel->relisscannable = true;
+ 
  	/*
  	 * Insert relation physical and logical identifiers (OIDs) into the right
  	 * places.	For a mapped relation, we set relfilenode to zero and rely on
*************** RelationBuildLocalRelation(const char *r
*** 2698,2709 ****
  
  	RelationInitPhysicalAddr(rel);
  
- 	/* materialized view not initially scannable */
- 	if (relkind == RELKIND_MATVIEW)
- 		rel->rd_ispopulated = false;
- 	else
- 		rel->rd_ispopulated = true;
- 
  	/*
  	 * Okay to insert into the relcache hash tables.
  	 */
--- 2693,2698 ----
*************** load_relcache_init_file(bool shared)
*** 4448,4458 ****
  		 */
  		RelationInitLockInfo(rel);
  		RelationInitPhysicalAddr(rel);
- 		if (rel->rd_rel->relkind == RELKIND_MATVIEW &&
- 			heap_is_matview_init_state(rel))
- 			rel->rd_ispopulated = false;
- 		else
- 			rel->rd_ispopulated = true;
  	}
  
  	/*
--- 4437,4442 ----
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 007b0865eb506fc8a8263974c09cc44b7986ac71..9db6cc9e35ff8d3c2d5ac416151a5112b6ff81af 100644
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
*************** refreshMatViewData(Archive *fout, TableD
*** 1759,1765 ****
  	PQExpBuffer q;
  
  	/* If the materialized view is not flagged as scannable, skip this. */
! 	if (!tbinfo->isscannable)
  		return;
  
  	q = createPQExpBuffer();
--- 1759,1765 ----
  	PQExpBuffer q;
  
  	/* If the materialized view is not flagged as scannable, skip this. */
! 	if (!tbinfo->relisscannable)
  		return;
  
  	q = createPQExpBuffer();
*************** buildMatViewRefreshDependencies(Archive 
*** 1966,1973 ****
  
  		addObjectDependency(dobj, refdobj->dumpId);
  
! 		if (!reftbinfo->isscannable)
! 			tbinfo->isscannable = false;
  	}
  
  	PQclear(res);
--- 1966,1973 ----
  
  		addObjectDependency(dobj, refdobj->dumpId);
  
! 		if (!reftbinfo->relisscannable)
! 			tbinfo->relisscannable = false;
  	}
  
  	PQclear(res);
*************** getTables(Archive *fout, int *numTables)
*** 4218,4224 ****
  	int			i_toastoid;
  	int			i_toastfrozenxid;
  	int			i_relpersistence;
! 	int			i_isscannable;
  	int			i_owning_tab;
  	int			i_owning_col;
  	int			i_reltablespace;
--- 4218,4224 ----
  	int			i_toastoid;
  	int			i_toastfrozenxid;
  	int			i_relpersistence;
! 	int			i_relisscannable;
  	int			i_owning_tab;
  	int			i_owning_col;
  	int			i_reltablespace;
*************** getTables(Archive *fout, int *numTables)
*** 4264,4271 ****
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, "
! 						  "CASE WHEN c.relkind = '%c' THEN pg_relation_is_scannable(c.oid) ELSE 't'::bool END as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4264,4270 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, c.relisscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4283,4289 ****
  				   "WHERE c.relkind in ('%c', '%c', '%c', '%c', '%c', '%c') "
  						  "ORDER BY c.oid",
  						  username_subquery,
- 						  RELKIND_MATVIEW,
  						  RELKIND_SEQUENCE,
  						  RELKIND_RELATION, RELKIND_SEQUENCE,
  						  RELKIND_VIEW, RELKIND_COMPOSITE_TYPE,
--- 4282,4287 ----
*************** getTables(Archive *fout, int *numTables)
*** 4303,4309 ****
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4301,4307 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, 't' as relisscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4340,4346 ****
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4338,4344 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4376,4382 ****
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4374,4380 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4412,4418 ****
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4410,4416 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4449,4455 ****
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4447,4453 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4485,4491 ****
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4483,4489 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4517,4523 ****
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4515,4521 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4544,4550 ****
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4542,4548 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4581,4587 ****
  						  "0 as relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4579,4585 ----
  						  "0 as relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relisscannable, "
  						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
*************** getTables(Archive *fout, int *numTables)
*** 4630,4636 ****
  	i_toastoid = PQfnumber(res, "toid");
  	i_toastfrozenxid = PQfnumber(res, "tfrozenxid");
  	i_relpersistence = PQfnumber(res, "relpersistence");
! 	i_isscannable = PQfnumber(res, "isscannable");
  	i_relpages = PQfnumber(res, "relpages");
  	i_owning_tab = PQfnumber(res, "owning_tab");
  	i_owning_col = PQfnumber(res, "owning_col");
--- 4628,4634 ----
  	i_toastoid = PQfnumber(res, "toid");
  	i_toastfrozenxid = PQfnumber(res, "tfrozenxid");
  	i_relpersistence = PQfnumber(res, "relpersistence");
! 	i_relisscannable = PQfnumber(res, "relisscannable");
  	i_relpages = PQfnumber(res, "relpages");
  	i_owning_tab = PQfnumber(res, "owning_tab");
  	i_owning_col = PQfnumber(res, "owning_col");
*************** getTables(Archive *fout, int *numTables)
*** 4673,4679 ****
  		tblinfo[i].hasrules = (strcmp(PQgetvalue(res, i, i_relhasrules), "t") == 0);
  		tblinfo[i].hastriggers = (strcmp(PQgetvalue(res, i, i_relhastriggers), "t") == 0);
  		tblinfo[i].hasoids = (strcmp(PQgetvalue(res, i, i_relhasoids), "t") == 0);
! 		tblinfo[i].isscannable = (strcmp(PQgetvalue(res, i, i_isscannable), "t") == 0);
  		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  		tblinfo[i].frozenxid = atooid(PQgetvalue(res, i, i_relfrozenxid));
  		tblinfo[i].toast_oid = atooid(PQgetvalue(res, i, i_toastoid));
--- 4671,4677 ----
  		tblinfo[i].hasrules = (strcmp(PQgetvalue(res, i, i_relhasrules), "t") == 0);
  		tblinfo[i].hastriggers = (strcmp(PQgetvalue(res, i, i_relhastriggers), "t") == 0);
  		tblinfo[i].hasoids = (strcmp(PQgetvalue(res, i, i_relhasoids), "t") == 0);
! 		tblinfo[i].relisscannable = (strcmp(PQgetvalue(res, i, i_relisscannable), "t") == 0);
  		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  		tblinfo[i].frozenxid = atooid(PQgetvalue(res, i, i_relfrozenxid));
  		tblinfo[i].toast_oid = atooid(PQgetvalue(res, i, i_toastoid));
*************** dumpTableSchema(Archive *fout, TableInfo
*** 13100,13105 ****
--- 13098,13104 ----
  
  		/*
  		 * For materialized views, create the AS clause just like a view.
+ 		 * At this point, we always mark the view as not populated.
  		 */
  		if (tbinfo->relkind == RELKIND_MATVIEW)
  		{
*************** dumpTableSchema(Archive *fout, TableInfo
*** 13229,13234 ****
--- 13228,13250 ----
  		}
  
  		/*
+ 		 * In binary_upgrade mode, restore matviews' scannability status by
+ 		 * poking pg_class directly.  This is pretty ugly, but we can't use
+ 		 * REFRESH MATERIALIZED VIEW since it's possible that some underlying
+ 		 * matview is not populated even though this matview is.
+ 		 */
+ 		if (binary_upgrade && tbinfo->relkind == RELKIND_MATVIEW &&
+ 			tbinfo->relisscannable)
+ 		{
+ 			appendPQExpBuffer(q, "\n-- For binary upgrade, mark materialized view as scannable\n");
+ 			appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
+ 							  "SET relisscannable = 't'\n"
+ 							  "WHERE oid = ");
+ 			appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
+ 			appendPQExpBuffer(q, "::pg_catalog.regclass;\n");
+ 		}
+ 
+ 		/*
  		 * Dump additional per-column properties that we can't handle in the
  		 * main CREATE TABLE command.
  		 */
diff --git a/src/bin/pg_dump/pg_dump.h b/src/bin/pg_dump/pg_dump.h
index 7970a359bd80ceec7bca4aa62cb625c84a351bf6..8e761afcf46d21f9e8e1fb1fdf66932bd5add393 100644
*** a/src/bin/pg_dump/pg_dump.h
--- b/src/bin/pg_dump/pg_dump.h
*************** typedef struct _tableInfo
*** 236,241 ****
--- 236,242 ----
  	char	   *relacl;
  	char		relkind;
  	char		relpersistence; /* relation persistence */
+ 	bool		relisscannable;	/* is valid for use in queries */
  	char	   *reltablespace;	/* relation tablespace */
  	char	   *reloptions;		/* options specified by WITH (...) */
  	char	   *toast_reloptions;		/* ditto, for the TOAST table */
*************** typedef struct _tableInfo
*** 243,249 ****
  	bool		hasrules;		/* does it have any rules? */
  	bool		hastriggers;	/* does it have any triggers? */
  	bool		hasoids;		/* does it have OIDs? */
- 	bool		isscannable;	/* is valid for use in queries */
  	uint32		frozenxid;		/* for restore frozen xid */
  	Oid			toast_oid;		/* for restore toast frozen xid */
  	uint32		toast_frozenxid;	/* for restore toast frozen xid */
--- 244,249 ----
diff --git a/src/include/catalog/heap.h b/src/include/catalog/heap.h
index 97d507e4c2321fbd3c35fd6a2ff58fadc99f33df..6b60d55a362df9027c7022f24dd0565814c0ba8a 100644
*** a/src/include/catalog/heap.h
--- b/src/include/catalog/heap.h
*************** extern Oid heap_create_with_catalog(cons
*** 70,76 ****
  						 bool is_internal);
  
  extern void heap_create_init_fork(Relation rel);
- extern bool heap_is_matview_init_state(Relation rel);
  
  extern void heap_drop_with_catalog(Oid relid);
  
--- 70,75 ----
diff --git a/src/include/catalog/pg_class.h b/src/include/catalog/pg_class.h
index fd97141e9ef489431c7fc3600c6e19e81712da11..fb68d6f91cbdcb0fa8a8a32fd0b2896f7caced1e 100644
*** a/src/include/catalog/pg_class.h
--- b/src/include/catalog/pg_class.h
*************** CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI
*** 66,71 ****
--- 66,72 ----
  	bool		relhasrules;	/* has (or has had) any rules */
  	bool		relhastriggers; /* has (or has had) any TRIGGERs */
  	bool		relhassubclass; /* has (or has had) derived classes */
+ 	bool		relisscannable;	/* T if okay to read contents of relation */
  	TransactionId relfrozenxid; /* all Xids < this are frozen in this rel */
  	TransactionId relminmxid;	/* all multixacts in this rel are >= this.
  								 * this is really a MultiXactId */
*************** typedef FormData_pg_class *Form_pg_class
*** 93,99 ****
   * ----------------
   */
  
! #define Natts_pg_class					28
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
--- 94,100 ----
   * ----------------
   */
  
! #define Natts_pg_class					29
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
*************** typedef FormData_pg_class *Form_pg_class
*** 118,127 ****
  #define Anum_pg_class_relhasrules		22
  #define Anum_pg_class_relhastriggers	23
  #define Anum_pg_class_relhassubclass	24
! #define Anum_pg_class_relfrozenxid		25
! #define Anum_pg_class_relminmxid		26
! #define Anum_pg_class_relacl			27
! #define Anum_pg_class_reloptions		28
  
  /* ----------------
   *		initial contents of pg_class
--- 119,129 ----
  #define Anum_pg_class_relhasrules		22
  #define Anum_pg_class_relhastriggers	23
  #define Anum_pg_class_relhassubclass	24
! #define Anum_pg_class_relisscannable	25
! #define Anum_pg_class_relfrozenxid		26
! #define Anum_pg_class_relminmxid		27
! #define Anum_pg_class_relacl			28
! #define Anum_pg_class_reloptions		29
  
  /* ----------------
   *		initial contents of pg_class
*************** typedef FormData_pg_class *Form_pg_class
*** 136,148 ****
   * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId;
   * similarly, "1" in relminmxid stands for FirstMultiXactId
   */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 30 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 28 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
  
  
--- 138,150 ----
   * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId;
   * similarly, "1" in relminmxid stands for FirstMultiXactId
   */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 30 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
  
  
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index ef892978ffbabf891219394d9c796cd2a243d751..685b9c76cfac4b006dea50fdd874881691bc7da2 100644
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 3842 (  pg_view_is_ins
*** 1980,1987 ****
  DESCR("is a view insertable-into");
  DATA(insert OID = 3843 (  pg_view_is_updatable	PGNSP PGUID 12 10 0 0 0 f f f f t f s 1 0 16 "26" _null_ _null_ _null_ _null_ pg_view_is_updatable _null_ _null_ _null_ ));
  DESCR("is a view updatable");
- DATA(insert OID = 3846 (  pg_relation_is_scannable	PGNSP PGUID 12 10 0 0 0 f f f f t f s 1 0 16 "26" _null_ _null_ _null_ _null_ pg_relation_is_scannable _null_ _null_ _null_ ));
- DESCR("is a relation scannable");
  
  /* Deferrable unique constraint trigger */
  DATA(insert OID = 1250 (  unique_key_recheck	PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ unique_key_recheck _null_ _null_ _null_ ));
--- 1980,1985 ----
diff --git a/src/include/commands/matview.h b/src/include/commands/matview.h
index 09bc384086fa1054b615e4f7c01dcb0eb125ab68..e3ce2f2953139b850aa93da6aa34d5c0bde360e6 100644
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 20,26 ****
  #include "utils/relcache.h"
  
  
! extern void SetMatViewToPopulated(Relation relation);
  
  extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				  ParamListInfo params, char *completionTag);
--- 20,26 ----
  #include "utils/relcache.h"
  
  
! extern void SetMatViewPopulatedState(Relation relation, bool newstate);
  
  extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				  ParamListInfo params, char *completionTag);
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index e71876502e1e6e56aab6066ee7388383ce9c0a59..15b60abfcd9360730ff86fe85a6918599e5546e3 100644
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
*************** extern Datum pg_table_size(PG_FUNCTION_A
*** 461,467 ****
  extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filepath(PG_FUNCTION_ARGS);
- extern Datum pg_relation_is_scannable(PG_FUNCTION_ARGS);
  
  /* genfile.c */
  extern bytea *read_binary_file(const char *filename,
--- 461,466 ----
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 632743af9436e14f8b537e58d2fc0acbbb319873..67cb11193aeb660e869eb66ef36077fac67e9a7f 100644
*** a/src/include/utils/rel.h
--- b/src/include/utils/rel.h
*************** typedef struct RelationData
*** 77,83 ****
  	BackendId	rd_backend;		/* owning backend id, if temporary relation */
  	bool		rd_islocaltemp; /* rel is a temp rel of this session */
  	bool		rd_isnailed;	/* rel is nailed in cache */
- 	bool		rd_ispopulated;	/* matview has query results */
  	bool		rd_isvalid;		/* relcache entry is valid */
  	char		rd_indexvalid;	/* state of rd_indexlist: 0 = not valid, 1 =
  								 * valid, 2 = temporarily forced */
--- 77,82 ----
*************** typedef struct StdRdOptions
*** 408,414 ****
   * 		populated by its query.  This is likely to get more complicated later,
   * 		so use a macro which looks like a function.
   */
! #define RelationIsScannable(relation) ((relation)->rd_ispopulated)
  
  
  /* routines in utils/cache/relcache.c */
--- 407,421 ----
   * 		populated by its query.  This is likely to get more complicated later,
   * 		so use a macro which looks like a function.
   */
! #define RelationIsScannable(relation) ((relation)->rd_rel->relisscannable)
! 
! /*
!  * RelationIsPopulated
!  * 		Currently, we don't physically distinguish the "populated" and
!  *		"scannable" properties of matviews, but that may change later.
!  *		Hence, use the appropriate one of these macros in code tests.
!  */
! #define RelationIsPopulated(relation) ((relation)->rd_rel->relisscannable)
  
  
  /* routines in utils/cache/relcache.c */
diff --git a/src/test/regress/expected/matview.out b/src/test/regress/expected/matview.out
index 06bb2551a83480361ddfce5ad02b76240d9eaf63..87da43e85c777ca9d864be92e409616599a408e9 100644
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
*************** EXPLAIN (costs off)
*** 26,34 ****
  (2 rows)
  
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tm'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   f
  (1 row)
  
--- 26,34 ----
  (2 rows)
  
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT relisscannable FROM pg_class WHERE oid = 'tm'::regclass;
!  relisscannable 
! ----------------
   f
  (1 row)
  
*************** SELECT * FROM tm;
*** 36,44 ****
  ERROR:  materialized view "tm" has not been populated
  HINT:  Use the REFRESH MATERIALIZED VIEW command.
  REFRESH MATERIALIZED VIEW tm;
! SELECT pg_relation_is_scannable('tm'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   t
  (1 row)
  
--- 36,44 ----
  ERROR:  materialized view "tm" has not been populated
  HINT:  Use the REFRESH MATERIALIZED VIEW command.
  REFRESH MATERIALIZED VIEW tm;
! SELECT relisscannable FROM pg_class WHERE oid = 'tm'::regclass;
!  relisscannable 
! ----------------
   t
  (1 row)
  
*************** UNION ALL 
*** 354,362 ****
             FROM v_test2;
  
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT pg_relation_is_scannable('mv_test3'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   t
  (1 row)
  
--- 354,362 ----
             FROM v_test2;
  
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT relisscannable FROM pg_class WHERE oid = 'mv_test3'::regclass;
!  relisscannable 
! ----------------
   t
  (1 row)
  
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index a4ecfd2aeac36ba5137a3d17592a9b0751cbb201..b710ec62f46d806c04683eb33f386bb53eeda92c 100644
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
*************** SELECT viewname, definition FROM pg_view
*** 1347,1353 ****
                                   |     pg_get_userbyid(c.relowner) AS matviewowner,                                                                                                                                                               +
                                   |     t.spcname AS tablespace,                                                                                                                                                                                   +
                                   |     c.relhasindex AS hasindexes,                                                                                                                                                                               +
!                                  |     pg_relation_is_scannable(c.oid) AS isscannable,                                                                                                                                                            +
                                   |     pg_get_viewdef(c.oid) AS definition                                                                                                                                                                        +
                                   |    FROM ((pg_class c                                                                                                                                                                                           +
                                   |    LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))                                                                                                                                                     +
--- 1347,1353 ----
                                   |     pg_get_userbyid(c.relowner) AS matviewowner,                                                                                                                                                               +
                                   |     t.spcname AS tablespace,                                                                                                                                                                                   +
                                   |     c.relhasindex AS hasindexes,                                                                                                                                                                               +
!                                  |     c.relisscannable AS isscannable,                                                                                                                                                                           +
                                   |     pg_get_viewdef(c.oid) AS definition                                                                                                                                                                        +
                                   |    FROM ((pg_class c                                                                                                                                                                                           +
                                   |    LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))                                                                                                                                                     +
diff --git a/src/test/regress/sql/matview.sql b/src/test/regress/sql/matview.sql
index 09a7378133c86d66755dafd1629719a712badfdf..84b1af004288a2af9f7e583feed0fb424725051b 100644
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
*************** SELECT * FROM tv ORDER BY type;
*** 15,24 ****
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tm'::regclass);
  SELECT * FROM tm;
  REFRESH MATERIALIZED VIEW tm;
! SELECT pg_relation_is_scannable('tm'::regclass);
  CREATE UNIQUE INDEX tm_type ON tm (type);
  SELECT * FROM tm;
  
--- 15,24 ----
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT relisscannable FROM pg_class WHERE oid = 'tm'::regclass;
  SELECT * FROM tm;
  REFRESH MATERIALIZED VIEW tm;
! SELECT relisscannable FROM pg_class WHERE oid = 'tm'::regclass;
  CREATE UNIQUE INDEX tm_type ON tm (type);
  SELECT * FROM tm;
  
*************** CREATE VIEW v_test2 AS SELECT moo, 2*moo
*** 109,115 ****
  CREATE MATERIALIZED VIEW mv_test2 AS SELECT moo, 2*moo FROM v_test2 UNION ALL SELECT moo, 3*moo FROM v_test2;
  \d+ mv_test2
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT pg_relation_is_scannable('mv_test3'::regclass);
  
  DROP VIEW v_test1 CASCADE;
  
--- 109,115 ----
  CREATE MATERIALIZED VIEW mv_test2 AS SELECT moo, 2*moo FROM v_test2 UNION ALL SELECT moo, 3*moo FROM v_test2;
  \d+ mv_test2
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT relisscannable FROM pg_class WHERE oid = 'mv_test3'::regclass;
  
  DROP VIEW v_test1 CASCADE;

#63

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Tom Lane (#62)

Re: Remaining beta blockers

Tom Lane <tgl@sss.pgh.pa.us> wrote:

In the interests of moving the discussion along, attached are
draft patches to show what I think we should do in 9.3. The
first patch disables the unlogged-matviews feature at the user
level (and perhaps could be reverted someday), while the second
one moves scannability-state storage into a pg_class column.

I'm going to submit a modified version of the second patch today.
My biggest problems with it as it stands are the name chosen for
the new pg_class column, and the hard-coded assumption that this
relation-level flag is a good long-term indicator of whether all
connections find a matview to be scannable. Scannability should
be abstracted to allow easy addition of other factors as we add
them. Whether or not the "populated" state is in the catalog, it
is a serious mistake to conflate that with scannability.

Scannability will always be influenced by whether the matview has
been populated, but it is short-sighted not to draw a distinction
now, so that work people do in their applications is does not have
to be redone as we make scannability tests better. Not to mention
the confusion factor of treating them as this patch does and then
trying to properly draw the distinction later. IMV this patch, as
it stands, does much more to paint us into a corner regarding
future expansion than what came before.

As one example, I think it is highly desirable, long term, to allow
different sessions to set GUCs to different tolerances for old
data, and thus have different perspectives on whether a matview is
scannable; and this patch forces that to be the same for every
session. The code I committed allows expansion in the direction of
different session perspectives on scannability, and the suggested
patch closes that door.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#64

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Kevin Grittner (#63)

Re: Remaining beta blockers

Kevin Grittner <kgrittn@ymail.com> writes:

I'm going to submit a modified version of the second patch today.
My biggest problems with it as it stands are the name chosen for
the new pg_class column, and the hard-coded assumption that this
relation-level flag is a good long-term indicator of whether all
connections find a matview to be scannable.� Scannability should
be abstracted to allow easy addition of other factors as we add
them.� Whether or not the "populated" state is in the catalog, it
is a serious mistake to conflate that with scannability.

Scannability will always be influenced by whether the matview has
been populated, but it is short-sighted not to draw a distinction
now, so that work people do in their applications is does not have
to be redone as we make scannability tests better.� Not to mention
the confusion factor of treating them as this patch does and then
trying to properly draw the distinction later.� IMV this patch, as
it stands, does much more to paint us into a corner regarding
future expansion than what came before.

As one example, I think it is highly desirable, long term, to allow
different sessions to set GUCs to different tolerances for old
data, and thus have different perspectives on whether a matview is
scannable; and this patch forces that to be the same for every
session.� The code I committed allows expansion in the direction of
different session perspectives on scannability, and the suggested
patch closes that door.

[ raised eyebrow... ] There is certainly nothing about file-size-based
state that would particularly support per-session scannability
determination. If you want to call the pg_class column relispopulated
rather than relisscannable, I have no particular objection to that ---
but otherwise it sounds like you're moving the goalposts at the last
minute.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#65

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Tom Lane (#64)

Re: Remaining beta blockers

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kevin Grittner <kgrittn@ymail.com> writes:

I'm going to submit a modified version of the second patch today.
My biggest problems with it as it stands are the name chosen for
the new pg_class column, and the hard-coded assumption that this
relation-level flag is a good long-term indicator of whether all
connections find a matview to be scannable. Scannability should
be abstracted to allow easy addition of other factors as we add
them. Whether or not the "populated" state is in the catalog, it
is a serious mistake to conflate that with scannability.

Scannability will always be influenced by whether the matview has
been populated, but it is short-sighted not to draw a distinction
now, so that work people do in their applications is does not have
to be redone as we make scannability tests better. Not to mention
the confusion factor of treating them as this patch does and then
trying to properly draw the distinction later. IMV this patch, as
it stands, does much more to paint us into a corner regarding
future expansion than what came before.

As one example, I think it is highly desirable, long term, to allow
different sessions to set GUCs to different tolerances for old
data, and thus have different perspectives on whether a matview is
scannable; and this patch forces that to be the same for every
session. The code I committed allows expansion in the direction of
different session perspectives on scannability, and the suggested
patch closes that door.

[ raised eyebrow... ] There is certainly nothing about file-size-based
state that would particularly support per-session scannability
determination.

I didn't mean to suggest that there was; I was talking about
enshrining the notion that the relation was either scannable by all
or by none into pg_class.

If you want to call the pg_class column relispopulated rather
than relisscannable, I have no particular objection to that

That column name and the wording of some comments are the main
things, although I'm also wondering whether it is bad form to force
users to test the pg_class.relispopulated column if they want to
test whether they can currently scan a matview, by removing the
pg_relation_is_scannable() function. As I mentioned earlier when
you asked why these two distinct properties weren't both exposed, I
mentioned that I hadn't thought that the "populated" property was
likely to be useful at the SQL level, but then questioned that,
saying that I wasn't sure I picked the right property to pay
attention to in pg_dump - and if pg_dump needed the "populated"
property it had to be exposed. I've come around to thinking that
it is more proper to use populated, but I have the same question
you asked earlier -- If it will be important or users to understand
that these are distinct properties, why are we just exposing one of
them?

In the absence of a specific use case, maybe it is OK to leave
pg_relation_is_scannable() off for this release, but I worry we'll
encourage fuzzy thinking on the point that could be hard to undo
later. The flip side of that is that it might be confusing to try
to explain why users should care which test they use before they
are capable of returning different results. They might "get it"
more easily if the function is introduced at the same time we
introduce other criteria for determining scannability (primarily
based around how current the results are) and other methods for
dealing with stale data (like the ability to force a
concurrency-friendly form of REFRESH on an attempt to query stale
data).

Also, rather than do the direct update to pg_class in pg_dump, how
would you feel about an ALTER MATERIALIZED VIEW option to set the
populated state? I haven't written that yet, but I figured that
when we went to an in-catalog representation of populated state, we
would want that.

I'm just reviewing the changes I made, and figured it might be good
to show a diff between my form of the patch and yours, but I'm
getting a lot spurious differences based on how we generate our
context diff files (or maybe the versions of some software
involved). You you share how you generate your patch file?

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#66

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Bruce Momjian (#60)

Re: Remaining beta blockers

Bruce Momjian <bruce@momjian.us> wrote:

Things like how fresh the materialized view is certainly should
be accessible via SQL and transactional.

Keep in mind that something like "whether the materialized view is
fresh enough to be scanned by this connection" may eventually
depend on session GUCs and the passage of time in the absence of
any database modifications. That's why I believe that checking for
scannability should be done through a function, not a check of a
system table.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#67

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Kevin Grittner (#65)

1 attachment(s)

Re: Remaining beta blockers

Kevin Grittner <kgrittn@ymail.com> wrote:

That column name and the wording of some comments are the main
things

Patch for that attached. I left the part where you got rid of the
SQL function to allow users to test whether a matview is currently
scannable, and I did not add an AMV option to change the populated
flag, since those haven't had any real discussion yet.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

move-populated-state.patchtext/x-patch; name=move-populated-state.patchDownload

*** a/doc/src/sgml/catalogs.sgml
--- b/doc/src/sgml/catalogs.sgml
***************
*** 1864,1869 ****
--- 1864,1877 ----
       </row>
  
       <row>
+       <entry><structfield>relispopulated</structfield></entry>
+       <entry><type>bool</type></entry>
+       <entry></entry>
+       <entry>True if okay to read contents of relation (this is true for all
+       relations other than some materialized views)</entry>
+      </row>
+ 
+      <row>
        <entry><structfield>relfrozenxid</structfield></entry>
        <entry><type>xid</type></entry>
        <entry></entry>
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 14239,14248 **** SELECT pg_type_is_visible('myschema.widget'::regtype);
     </indexterm>
  
     <indexterm>
-     <primary>pg_relation_is_scannable</primary>
-    </indexterm>
- 
-    <indexterm>
      <primary>pg_typeof</primary>
     </indexterm>
  
--- 14239,14244 ----
***************
*** 14411,14421 **** SELECT pg_type_is_visible('myschema.widget'::regtype);
         <entry>get the path in the file system that this tablespace is located in</entry>
        </row>
        <row>
-        <entry><literal><function>pg_relation_is_scannable(<parameter>relation_oid</parameter>)</function></literal></entry>
-        <entry><type>boolean</type></entry>
-        <entry>is the relation scannable; a materialized view which has not been loaded will not be scannable</entry>
-       </row>
-       <row>
         <entry><literal><function>pg_typeof(<parameter>any</parameter>)</function></literal></entry>
         <entry><type>regtype</type></entry>
         <entry>get the data type of any value</entry>
--- 14407,14412 ----
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***************
*** 780,785 **** InsertPgClassTuple(Relation pg_class_desc,
--- 780,786 ----
  	values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules);
  	values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers);
  	values[Anum_pg_class_relhassubclass - 1] = BoolGetDatum(rd_rel->relhassubclass);
+ 	values[Anum_pg_class_relispopulated - 1] = BoolGetDatum(rd_rel->relispopulated);
  	values[Anum_pg_class_relfrozenxid - 1] = TransactionIdGetDatum(rd_rel->relfrozenxid);
  	values[Anum_pg_class_relminmxid - 1] = MultiXactIdGetDatum(rd_rel->relminmxid);
  	if (relacl != (Datum) 0)
***************
*** 1346,1371 **** heap_create_init_fork(Relation rel)
  }
  
  /*
-  * Check whether a materialized view is in an initial, unloaded state.
-  *
-  * The check here must match what is set up in heap_create_init_fork().
-  * Currently the init fork is an empty file.  A missing heap is also
-  * considered to be unloaded.
-  */
- bool
- heap_is_matview_init_state(Relation rel)
- {
- 	Assert(rel->rd_rel->relkind == RELKIND_MATVIEW);
- 
- 	RelationOpenSmgr(rel);
- 
- 	if (!smgrexists(rel->rd_smgr, MAIN_FORKNUM))
- 		return true;
- 
- 	return (smgrnblocks(rel->rd_smgr, MAIN_FORKNUM) < 1);
- }
- 
- /*
   *		RelationRemoveInheritance
   *
   * Formerly, this routine checked for child relations and aborted the
--- 1347,1352 ----
*** a/src/backend/catalog/system_views.sql
--- b/src/backend/catalog/system_views.sql
***************
*** 101,107 **** CREATE VIEW pg_matviews AS
          pg_get_userbyid(C.relowner) AS matviewowner,
          T.spcname AS tablespace,
          C.relhasindex AS hasindexes,
!         pg_relation_is_scannable(C.oid) AS isscannable,
          pg_get_viewdef(C.oid) AS definition
      FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
           LEFT JOIN pg_tablespace T ON (T.oid = C.reltablespace)
--- 101,107 ----
          pg_get_userbyid(C.relowner) AS matviewowner,
          T.spcname AS tablespace,
          C.relhasindex AS hasindexes,
!         C.relispopulated AS ispopulated,
          pg_get_viewdef(C.oid) AS definition
      FROM pg_class C LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
           LEFT JOIN pg_tablespace T ON (T.oid = C.reltablespace)
*** a/src/backend/commands/cluster.c
--- b/src/backend/commands/cluster.c
***************
*** 30,36 ****
  #include "catalog/objectaccess.h"
  #include "catalog/toasting.h"
  #include "commands/cluster.h"
- #include "commands/matview.h"
  #include "commands/tablecmds.h"
  #include "commands/vacuum.h"
  #include "miscadmin.h"
--- 30,35 ----
***************
*** 388,394 **** cluster_rel(Oid tableOid, Oid indexOid, bool recheck, bool verbose,
  	 * database.
  	 */
  	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW &&
! 		!OldHeap->rd_ispopulated)
  	{
  		relation_close(OldHeap, AccessExclusiveLock);
  		return;
--- 387,393 ----
  	 * database.
  	 */
  	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW &&
! 		!RelationIsPopulated(OldHeap))
  	{
  		relation_close(OldHeap, AccessExclusiveLock);
  		return;
***************
*** 922,931 **** copy_heap_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex,
  						get_namespace_name(RelationGetNamespace(OldHeap)),
  						RelationGetRelationName(OldHeap))));
  
- 	if (OldHeap->rd_rel->relkind == RELKIND_MATVIEW)
- 		/* Make sure the heap looks good even if no rows are written. */
- 		SetMatViewToPopulated(NewHeap);
- 
  	/*
  	 * Scan through the OldHeap, either in OldIndex order or sequentially;
  	 * copy each tuple into the NewHeap, or transiently to the tuplesort
--- 921,926 ----
*** a/src/backend/commands/createas.c
--- b/src/backend/commands/createas.c
***************
*** 359,368 **** intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  	 */
  	intoRelationDesc = heap_open(intoRelationId, AccessExclusiveLock);
  
- 	if (is_matview && !into->skipData)
- 		/* Make sure the heap looks good even if no rows are written. */
- 		SetMatViewToPopulated(intoRelationDesc);
- 
  	/*
  	 * Check INSERT permission on the constructed table.
  	 *
--- 359,364 ----
***************
*** 382,387 **** intorel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
--- 378,390 ----
  	ExecCheckRTPerms(list_make1(rte), true);
  
  	/*
+ 	 * Tentatively mark the target as populated, if it's a matview and we're
+ 	 * going to fill it; otherwise, no change needed.
+ 	 */
+ 	if (is_matview && !into->skipData)
+ 		SetMatViewPopulatedState(intoRelationDesc, true);
+ 
+ 	/*
  	 * Fill private fields of myState for use by later routines
  	 */
  	myState->rel = intoRelationDesc;
*** a/src/backend/commands/matview.c
--- b/src/backend/commands/matview.c
***************
*** 31,36 ****
--- 31,37 ----
  #include "storage/smgr.h"
  #include "tcop/tcopprot.h"
  #include "utils/snapmgr.h"
+ #include "utils/syscache.h"
  
  
  typedef struct
***************
*** 52,89 **** static void refresh_matview_datafill(DestReceiver *dest, Query *query,
  									 const char *queryString);
  
  /*
!  * SetMatViewToPopulated
!  *		Indicate that the materialized view has been populated by its query.
!  *
!  * NOTE: The heap starts out in a state that doesn't look scannable, and can
!  * only transition from there to scannable at the time a new heap is created.
   *
   * NOTE: caller must be holding an appropriate lock on the relation.
   */
  void
! SetMatViewToPopulated(Relation relation)
  {
! 	Page        page;
  
  	Assert(relation->rd_rel->relkind == RELKIND_MATVIEW);
- 	Assert(relation->rd_ispopulated == false);
- 
- 	page = (Page) palloc(BLCKSZ);
- 	PageInit(page, BLCKSZ, 0);
  
! 	if (RelationNeedsWAL(relation))
! 		log_newpage(&(relation->rd_node), MAIN_FORKNUM, 0, page);
  
! 	RelationOpenSmgr(relation);
  
! 	PageSetChecksumInplace(page, 0);
! 	smgrextend(relation->rd_smgr, MAIN_FORKNUM, 0, (char *) page, true);
  
! 	pfree(page);
  
! 	smgrimmedsync(relation->rd_smgr, MAIN_FORKNUM);
  
! 	RelationCacheInvalidateEntry(relation->rd_id);
  }
  
  /*
--- 53,97 ----
  									 const char *queryString);
  
  /*
!  * SetMatViewPopulatedState
!  *		Mark a materialized view as populated, or not.
   *
   * NOTE: caller must be holding an appropriate lock on the relation.
   */
  void
! SetMatViewPopulatedState(Relation relation, bool newstate)
  {
! 	Relation	pgrel;
! 	HeapTuple	tuple;
  
  	Assert(relation->rd_rel->relkind == RELKIND_MATVIEW);
  
! 	/*
! 	 * Update relation's pg_class entry.  Crucial side-effect: other backends
! 	 * (and this one too!) are sent SI message to make them rebuild relcache
! 	 * entries.
! 	 */
! 	pgrel = heap_open(RelationRelationId, RowExclusiveLock);
! 	tuple = SearchSysCacheCopy1(RELOID,
! 								ObjectIdGetDatum(RelationGetRelid(relation)));
! 	if (!HeapTupleIsValid(tuple))
! 		elog(ERROR, "cache lookup failed for relation %u",
! 			 RelationGetRelid(relation));
  
! 	((Form_pg_class) GETSTRUCT(tuple))->relispopulated = newstate;
  
! 	simple_heap_update(pgrel, &tuple->t_self, tuple);
  
! 	CatalogUpdateIndexes(pgrel, tuple);
  
! 	heap_freetuple(tuple);
! 	heap_close(pgrel, RowExclusiveLock);
  
! 	/*
! 	 * Advance command counter to make the updated pg_class row locally
! 	 * visible.
! 	 */
! 	CommandCounterIncrement();
  }
  
  /*
***************
*** 97,110 **** SetMatViewToPopulated(Relation relation)
   * If WITH NO DATA was specified, this is effectively like a TRUNCATE;
   * otherwise it is like a TRUNCATE followed by an INSERT using the SELECT
   * statement associated with the materialized view.  The statement node's
!  * skipData field is used to indicate that the clause was used.
   *
   * Indexes are rebuilt too, via REINDEX. Since we are effectively bulk-loading
   * the new heap, it's better to create the indexes afterwards than to fill them
   * incrementally while we load.
   *
!  * The scannable state is changed based on whether the contents reflect the
!  * result set of the materialized view's query.
   */
  void
  ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
--- 105,118 ----
   * If WITH NO DATA was specified, this is effectively like a TRUNCATE;
   * otherwise it is like a TRUNCATE followed by an INSERT using the SELECT
   * statement associated with the materialized view.  The statement node's
!  * skipData field shows whether the clause was used.
   *
   * Indexes are rebuilt too, via REINDEX. Since we are effectively bulk-loading
   * the new heap, it's better to create the indexes afterwards than to fill them
   * incrementally while we load.
   *
!  * The matview's "populated" state is changed based on whether the contents
!  * reflect the result set of the materialized view's query.
   */
  void
  ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
***************
*** 184,189 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
--- 192,203 ----
  	 */
  	CheckTableNotInUse(matviewRel, "REFRESH MATERIALIZED VIEW");
  
+ 	/*
+ 	 * Tentatively mark the matview as populated or not (this will roll back
+ 	 * if we fail later).
+ 	 */
+ 	SetMatViewPopulatedState(matviewRel, !stmt->skipData);
+ 
  	tableSpace = matviewRel->rd_rel->reltablespace;
  
  	heap_close(matviewRel, NoLock);
***************
*** 192,197 **** ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
--- 206,212 ----
  	OIDNewHeap = make_new_heap(matviewOid, tableSpace);
  	dest = CreateTransientRelDestReceiver(OIDNewHeap);
  
+ 	/* Generate the data, if wanted. */
  	if (!stmt->skipData)
  		refresh_matview_datafill(dest, dataQuery, queryString);
  
***************
*** 300,307 **** transientrel_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  		myState->hi_options |= HEAP_INSERT_SKIP_WAL;
  	myState->bistate = GetBulkInsertState();
  
- 	SetMatViewToPopulated(transientrel);
- 
  	/* Not using WAL requires smgr_targblock be initially invalid */
  	Assert(RelationGetTargetBlock(transientrel) == InvalidBlockNumber);
  }
--- 315,320 ----
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***************
*** 230,242 **** lazy_vacuum_rel(Relation onerel, VacuumStmt *vacstmt,
  	 *
  	 * Don't even think about it unless we have a shot at releasing a goodly
  	 * number of pages.  Otherwise, the time taken isn't worth it.
- 	 *
- 	 * Leave a populated materialized view with at least one page.
  	 */
- 	if (onerel->rd_rel->relkind == RELKIND_MATVIEW &&
- 		vacrelstats->nonempty_pages == 0)
- 		vacrelstats->nonempty_pages = 1;
- 
  	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
  	if (possibly_freeable > 0 &&
  		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
--- 230,236 ----
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***************
*** 834,863 **** pg_relation_filepath(PG_FUNCTION_ARGS)
  
  	PG_RETURN_TEXT_P(cstring_to_text(path));
  }
- 
- 
- /*
-  * Indicate whether a relation is scannable.
-  *
-  * Currently, this is always true except for a materialized view which has not
-  * been populated.  It is expected that other conditions for allowing a
-  * materialized view to be scanned will be added in later releases.
-  */
- Datum
- pg_relation_is_scannable(PG_FUNCTION_ARGS)
- {
- 	Oid			relid;
- 	Relation	relation;
- 	bool		result;
- 
- 	relid = PG_GETARG_OID(0);
- 	relation = try_relation_open(relid, AccessShareLock);
- 
- 	if (relation == NULL)
- 		PG_RETURN_BOOL(false);
- 
- 	result = RelationIsScannable(relation);
- 
- 	relation_close(relation, AccessShareLock);
- 	PG_RETURN_BOOL(result);
- }
--- 834,836 ----
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
***************
*** 37,43 ****
  #include "access/transam.h"
  #include "access/xact.h"
  #include "catalog/catalog.h"
- #include "catalog/heap.h"
  #include "catalog/index.h"
  #include "catalog/indexing.h"
  #include "catalog/namespace.h"
--- 37,42 ----
***************
*** 956,967 **** RelationBuildDesc(Oid targetRelId, bool insertIt)
  	/* make sure relation is marked as having no open file yet */
  	relation->rd_smgr = NULL;
  
- 	if (relation->rd_rel->relkind == RELKIND_MATVIEW &&
- 		heap_is_matview_init_state(relation))
- 		relation->rd_ispopulated = false;
- 	else
- 		relation->rd_ispopulated = true;
- 
  	/*
  	 * now we can free the memory allocated for pg_class_tuple
  	 */
--- 955,960 ----
***************
*** 1459,1464 **** formrdesc(const char *relationName, Oid relationReltype,
--- 1452,1460 ----
  	/* formrdesc is used only for permanent relations */
  	relation->rd_rel->relpersistence = RELPERSISTENCE_PERMANENT;
  
+ 	/* ... and they're always scannable, too */
+ 	relation->rd_rel->relispopulated = true;
+ 
  	relation->rd_rel->relpages = 0;
  	relation->rd_rel->reltuples = 0;
  	relation->rd_rel->relallvisible = 0;
***************
*** 1531,1537 **** formrdesc(const char *relationName, Oid relationReltype,
  	 * initialize physical addressing information for the relation
  	 */
  	RelationInitPhysicalAddr(relation);
- 	relation->rd_ispopulated = true;
  
  	/*
  	 * initialize the rel-has-index flag, using hardwired knowledge
--- 1527,1532 ----
***************
*** 1756,1762 **** RelationReloadIndexInfo(Relation relation)
  	heap_freetuple(pg_class_tuple);
  	/* We must recalculate physical address in case it changed */
  	RelationInitPhysicalAddr(relation);
- 	relation->rd_ispopulated = true;
  
  	/*
  	 * For a non-system index, there are fields of the pg_index row that are
--- 1751,1756 ----
***************
*** 1905,1915 **** RelationClearRelation(Relation relation, bool rebuild)
  	if (relation->rd_isnailed)
  	{
  		RelationInitPhysicalAddr(relation);
- 		if (relation->rd_rel->relkind == RELKIND_MATVIEW &&
- 			heap_is_matview_init_state(relation))
- 			relation->rd_ispopulated = false;
- 		else
- 			relation->rd_ispopulated = true;
  
  		if (relation->rd_rel->relkind == RELKIND_INDEX)
  		{
--- 1899,1904 ----
***************
*** 2671,2676 **** RelationBuildLocalRelation(const char *relname,
--- 2660,2671 ----
  			break;
  	}
  
+ 	/* if it's a materialized view, it's not populated initially */
+ 	if (relkind == RELKIND_MATVIEW)
+ 		rel->rd_rel->relispopulated = false;
+ 	else
+ 		rel->rd_rel->relispopulated = true;
+ 
  	/*
  	 * Insert relation physical and logical identifiers (OIDs) into the right
  	 * places.	For a mapped relation, we set relfilenode to zero and rely on
***************
*** 2698,2709 **** RelationBuildLocalRelation(const char *relname,
  
  	RelationInitPhysicalAddr(rel);
  
- 	/* materialized view not initially scannable */
- 	if (relkind == RELKIND_MATVIEW)
- 		rel->rd_ispopulated = false;
- 	else
- 		rel->rd_ispopulated = true;
- 
  	/*
  	 * Okay to insert into the relcache hash tables.
  	 */
--- 2693,2698 ----
***************
*** 4448,4458 **** load_relcache_init_file(bool shared)
  		 */
  		RelationInitLockInfo(rel);
  		RelationInitPhysicalAddr(rel);
- 		if (rel->rd_rel->relkind == RELKIND_MATVIEW &&
- 			heap_is_matview_init_state(rel))
- 			rel->rd_ispopulated = false;
- 		else
- 			rel->rd_ispopulated = true;
  	}
  
  	/*
--- 4437,4442 ----
*** a/src/bin/pg_dump/pg_dump.c
--- b/src/bin/pg_dump/pg_dump.c
***************
*** 1758,1765 **** refreshMatViewData(Archive *fout, TableDataInfo *tdinfo)
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	PQExpBuffer q;
  
! 	/* If the materialized view is not flagged as scannable, skip this. */
! 	if (!tbinfo->isscannable)
  		return;
  
  	q = createPQExpBuffer();
--- 1758,1765 ----
  	TableInfo  *tbinfo = tdinfo->tdtable;
  	PQExpBuffer q;
  
! 	/* If the materialized view is not flagged as populated, skip this. */
! 	if (!tbinfo->ispopulated)
  		return;
  
  	q = createPQExpBuffer();
***************
*** 1966,1973 **** buildMatViewRefreshDependencies(Archive *fout)
  
  		addObjectDependency(dobj, refdobj->dumpId);
  
! 		if (!reftbinfo->isscannable)
! 			tbinfo->isscannable = false;
  	}
  
  	PQclear(res);
--- 1966,1973 ----
  
  		addObjectDependency(dobj, refdobj->dumpId);
  
! 		if (!reftbinfo->relispopulated)
! 			tbinfo->relispopulated = false;
  	}
  
  	PQclear(res);
***************
*** 4218,4224 **** getTables(Archive *fout, int *numTables)
  	int			i_toastoid;
  	int			i_toastfrozenxid;
  	int			i_relpersistence;
! 	int			i_isscannable;
  	int			i_owning_tab;
  	int			i_owning_col;
  	int			i_reltablespace;
--- 4218,4224 ----
  	int			i_toastoid;
  	int			i_toastfrozenxid;
  	int			i_relpersistence;
! 	int			i_relispopulated;
  	int			i_owning_tab;
  	int			i_owning_col;
  	int			i_reltablespace;
***************
*** 4264,4271 **** getTables(Archive *fout, int *numTables)
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, "
! 						  "CASE WHEN c.relkind = '%c' THEN pg_relation_is_scannable(c.oid) ELSE 't'::bool END as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4264,4270 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, c.relispopulated, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4283,4289 **** getTables(Archive *fout, int *numTables)
  				   "WHERE c.relkind in ('%c', '%c', '%c', '%c', '%c', '%c') "
  						  "ORDER BY c.oid",
  						  username_subquery,
- 						  RELKIND_MATVIEW,
  						  RELKIND_SEQUENCE,
  						  RELKIND_RELATION, RELKIND_SEQUENCE,
  						  RELKIND_VIEW, RELKIND_COMPOSITE_TYPE,
--- 4282,4287 ----
***************
*** 4303,4309 **** getTables(Archive *fout, int *numTables)
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4301,4307 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "c.relpersistence, 't' as relispopulated, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4340,4346 **** getTables(Archive *fout, int *numTables)
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4338,4344 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "c.relpages, "
  						  "CASE WHEN c.reloftype <> 0 THEN c.reloftype::pg_catalog.regtype ELSE NULL END AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4376,4382 **** getTables(Archive *fout, int *numTables)
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4374,4380 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4412,4418 **** getTables(Archive *fout, int *numTables)
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4410,4416 ----
  						  "c.relhasindex, c.relhasrules, c.relhasoids, "
  						  "c.relfrozenxid, tc.oid AS toid, "
  						  "tc.relfrozenxid AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "c.relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4449,4455 **** getTables(Archive *fout, int *numTables)
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4447,4453 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4485,4491 **** getTables(Archive *fout, int *numTables)
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
--- 4483,4489 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "d.refobjid AS owning_tab, "
***************
*** 4517,4523 **** getTables(Archive *fout, int *numTables)
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4515,4521 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
***************
*** 4544,4550 **** getTables(Archive *fout, int *numTables)
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4542,4548 ----
  						  "0 AS relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
***************
*** 4581,4587 **** getTables(Archive *fout, int *numTables)
  						  "0 as relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't'::bool as isscannable, "
  						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
--- 4579,4585 ----
  						  "0 as relfrozenxid, "
  						  "0 AS toid, "
  						  "0 AS tfrozenxid, "
! 						  "'p' AS relpersistence, 't' as relispopulated, "
  						  "0 AS relpages, "
  						  "NULL AS reloftype, "
  						  "NULL::oid AS owning_tab, "
***************
*** 4630,4636 **** getTables(Archive *fout, int *numTables)
  	i_toastoid = PQfnumber(res, "toid");
  	i_toastfrozenxid = PQfnumber(res, "tfrozenxid");
  	i_relpersistence = PQfnumber(res, "relpersistence");
! 	i_isscannable = PQfnumber(res, "isscannable");
  	i_relpages = PQfnumber(res, "relpages");
  	i_owning_tab = PQfnumber(res, "owning_tab");
  	i_owning_col = PQfnumber(res, "owning_col");
--- 4628,4634 ----
  	i_toastoid = PQfnumber(res, "toid");
  	i_toastfrozenxid = PQfnumber(res, "tfrozenxid");
  	i_relpersistence = PQfnumber(res, "relpersistence");
! 	i_relispopulated = PQfnumber(res, "relispopulated");
  	i_relpages = PQfnumber(res, "relpages");
  	i_owning_tab = PQfnumber(res, "owning_tab");
  	i_owning_col = PQfnumber(res, "owning_col");
***************
*** 4673,4679 **** getTables(Archive *fout, int *numTables)
  		tblinfo[i].hasrules = (strcmp(PQgetvalue(res, i, i_relhasrules), "t") == 0);
  		tblinfo[i].hastriggers = (strcmp(PQgetvalue(res, i, i_relhastriggers), "t") == 0);
  		tblinfo[i].hasoids = (strcmp(PQgetvalue(res, i, i_relhasoids), "t") == 0);
! 		tblinfo[i].isscannable = (strcmp(PQgetvalue(res, i, i_isscannable), "t") == 0);
  		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  		tblinfo[i].frozenxid = atooid(PQgetvalue(res, i, i_relfrozenxid));
  		tblinfo[i].toast_oid = atooid(PQgetvalue(res, i, i_toastoid));
--- 4671,4677 ----
  		tblinfo[i].hasrules = (strcmp(PQgetvalue(res, i, i_relhasrules), "t") == 0);
  		tblinfo[i].hastriggers = (strcmp(PQgetvalue(res, i, i_relhastriggers), "t") == 0);
  		tblinfo[i].hasoids = (strcmp(PQgetvalue(res, i, i_relhasoids), "t") == 0);
! 		tblinfo[i].relispopulated = (strcmp(PQgetvalue(res, i, i_relispopulated), "t") == 0);
  		tblinfo[i].relpages = atoi(PQgetvalue(res, i, i_relpages));
  		tblinfo[i].frozenxid = atooid(PQgetvalue(res, i, i_relfrozenxid));
  		tblinfo[i].toast_oid = atooid(PQgetvalue(res, i, i_toastoid));
***************
*** 13100,13105 **** dumpTableSchema(Archive *fout, TableInfo *tbinfo)
--- 13098,13104 ----
  
  		/*
  		 * For materialized views, create the AS clause just like a view.
+ 		 * At this point, we always mark the view as not populated.
  		 */
  		if (tbinfo->relkind == RELKIND_MATVIEW)
  		{
***************
*** 13229,13234 **** dumpTableSchema(Archive *fout, TableInfo *tbinfo)
--- 13228,13250 ----
  		}
  
  		/*
+ 		 * In binary_upgrade mode, restore matviews' populated status by
+ 		 * poking pg_class directly.  This is pretty ugly, but we can't use
+ 		 * REFRESH MATERIALIZED VIEW since it's possible that some underlying
+ 		 * matview is not populated even though this matview is.
+ 		 */
+ 		if (binary_upgrade && tbinfo->relkind == RELKIND_MATVIEW &&
+ 			tbinfo->relispopulated)
+ 		{
+ 			appendPQExpBuffer(q, "\n-- For binary upgrade, mark materialized view as populated\n");
+ 			appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
+ 							  "SET relispopulated = 't'\n"
+ 							  "WHERE oid = ");
+ 			appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
+ 			appendPQExpBuffer(q, "::pg_catalog.regclass;\n");
+ 		}
+ 
+ 		/*
  		 * Dump additional per-column properties that we can't handle in the
  		 * main CREATE TABLE command.
  		 */
*** a/src/bin/pg_dump/pg_dump.h
--- b/src/bin/pg_dump/pg_dump.h
***************
*** 236,241 **** typedef struct _tableInfo
--- 236,242 ----
  	char	   *relacl;
  	char		relkind;
  	char		relpersistence; /* relation persistence */
+ 	bool		relispopulated;	/* can be considered for use in queries */
  	char	   *reltablespace;	/* relation tablespace */
  	char	   *reloptions;		/* options specified by WITH (...) */
  	char	   *toast_reloptions;		/* ditto, for the TOAST table */
***************
*** 243,249 **** typedef struct _tableInfo
  	bool		hasrules;		/* does it have any rules? */
  	bool		hastriggers;	/* does it have any triggers? */
  	bool		hasoids;		/* does it have OIDs? */
! 	bool		isscannable;	/* is valid for use in queries */
  	uint32		frozenxid;		/* for restore frozen xid */
  	Oid			toast_oid;		/* for restore toast frozen xid */
  	uint32		toast_frozenxid;	/* for restore toast frozen xid */
--- 244,250 ----
  	bool		hasrules;		/* does it have any rules? */
  	bool		hastriggers;	/* does it have any triggers? */
  	bool		hasoids;		/* does it have OIDs? */
! 	bool		ispopulated;	/* does matview holds query results? */
  	uint32		frozenxid;		/* for restore frozen xid */
  	Oid			toast_oid;		/* for restore toast frozen xid */
  	uint32		toast_frozenxid;	/* for restore toast frozen xid */
*** a/src/include/catalog/heap.h
--- b/src/include/catalog/heap.h
***************
*** 70,76 **** extern Oid heap_create_with_catalog(const char *relname,
  						 bool is_internal);
  
  extern void heap_create_init_fork(Relation rel);
- extern bool heap_is_matview_init_state(Relation rel);
  
  extern void heap_drop_with_catalog(Oid relid);
  
--- 70,75 ----
*** a/src/include/catalog/pg_class.h
--- b/src/include/catalog/pg_class.h
***************
*** 66,71 **** CATALOG(pg_class,1259) BKI_BOOTSTRAP BKI_ROWTYPE_OID(83) BKI_SCHEMA_MACRO
--- 66,72 ----
  	bool		relhasrules;	/* has (or has had) any rules */
  	bool		relhastriggers; /* has (or has had) any TRIGGERs */
  	bool		relhassubclass; /* has (or has had) derived classes */
+ 	bool		relispopulated;	/* matview currently holds query results */
  	TransactionId relfrozenxid; /* all Xids < this are frozen in this rel */
  	TransactionId relminmxid;	/* all multixacts in this rel are >= this.
  								 * this is really a MultiXactId */
***************
*** 93,99 **** typedef FormData_pg_class *Form_pg_class;
   * ----------------
   */
  
! #define Natts_pg_class					28
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
--- 94,100 ----
   * ----------------
   */
  
! #define Natts_pg_class					29
  #define Anum_pg_class_relname			1
  #define Anum_pg_class_relnamespace		2
  #define Anum_pg_class_reltype			3
***************
*** 118,127 **** typedef FormData_pg_class *Form_pg_class;
  #define Anum_pg_class_relhasrules		22
  #define Anum_pg_class_relhastriggers	23
  #define Anum_pg_class_relhassubclass	24
! #define Anum_pg_class_relfrozenxid		25
! #define Anum_pg_class_relminmxid		26
! #define Anum_pg_class_relacl			27
! #define Anum_pg_class_reloptions		28
  
  /* ----------------
   *		initial contents of pg_class
--- 119,129 ----
  #define Anum_pg_class_relhasrules		22
  #define Anum_pg_class_relhastriggers	23
  #define Anum_pg_class_relhassubclass	24
! #define Anum_pg_class_relispopulated	25
! #define Anum_pg_class_relfrozenxid		26
! #define Anum_pg_class_relminmxid		27
! #define Anum_pg_class_relacl			28
! #define Anum_pg_class_reloptions		29
  
  /* ----------------
   *		initial contents of pg_class
***************
*** 136,148 **** typedef FormData_pg_class *Form_pg_class;
   * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId;
   * similarly, "1" in relminmxid stands for FirstMultiXactId
   */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 30 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 28 0 t f f f f 3 1 _null_ _null_ ));
  DESCR("");
  
  
--- 138,150 ----
   * Note: "3" in the relfrozenxid column stands for FirstNormalTransactionId;
   * similarly, "1" in relminmxid stands for FirstMultiXactId
   */
! DATA(insert OID = 1247 (  pg_type		PGNSP 71 0 PGUID 0 0 0 0 0 0 0 0 f f p r 30 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1249 (  pg_attribute	PGNSP 75 0 PGUID 0 0 0 0 0 0 0 0 f f p r 21 0 f f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1255 (  pg_proc		PGNSP 81 0 PGUID 0 0 0 0 0 0 0 0 f f p r 27 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
! DATA(insert OID = 1259 (  pg_class		PGNSP 83 0 PGUID 0 0 0 0 0 0 0 0 f f p r 29 0 t f f f f t 3 1 _null_ _null_ ));
  DESCR("");
  
  
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***************
*** 1980,1987 **** DATA(insert OID = 3842 (  pg_view_is_insertable PGNSP PGUID 12 10 0 0 0 f f f f
  DESCR("is a view insertable-into");
  DATA(insert OID = 3843 (  pg_view_is_updatable	PGNSP PGUID 12 10 0 0 0 f f f f t f s 1 0 16 "26" _null_ _null_ _null_ _null_ pg_view_is_updatable _null_ _null_ _null_ ));
  DESCR("is a view updatable");
- DATA(insert OID = 3846 (  pg_relation_is_scannable	PGNSP PGUID 12 10 0 0 0 f f f f t f s 1 0 16 "26" _null_ _null_ _null_ _null_ pg_relation_is_scannable _null_ _null_ _null_ ));
- DESCR("is a relation scannable");
  
  /* Deferrable unique constraint trigger */
  DATA(insert OID = 1250 (  unique_key_recheck	PGNSP PGUID 12 1 0 0 0 f f f f t f v 0 0 2279 "" _null_ _null_ _null_ _null_ unique_key_recheck _null_ _null_ _null_ ));
--- 1980,1985 ----
*** a/src/include/commands/matview.h
--- b/src/include/commands/matview.h
***************
*** 20,26 ****
  #include "utils/relcache.h"
  
  
! extern void SetMatViewToPopulated(Relation relation);
  
  extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				  ParamListInfo params, char *completionTag);
--- 20,26 ----
  #include "utils/relcache.h"
  
  
! extern void SetMatViewPopulatedState(Relation relation, bool newstate);
  
  extern void ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
  				  ParamListInfo params, char *completionTag);
*** a/src/include/utils/builtins.h
--- b/src/include/utils/builtins.h
***************
*** 461,467 **** extern Datum pg_table_size(PG_FUNCTION_ARGS);
  extern Datum pg_indexes_size(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filenode(PG_FUNCTION_ARGS);
  extern Datum pg_relation_filepath(PG_FUNCTION_ARGS);
- extern Datum pg_relation_is_scannable(PG_FUNCTION_ARGS);
  
  /* genfile.c */
  extern bytea *read_binary_file(const char *filename,
--- 461,466 ----
*** a/src/include/utils/rel.h
--- b/src/include/utils/rel.h
***************
*** 408,414 **** typedef struct StdRdOptions
   * 		populated by its query.  This is likely to get more complicated later,
   * 		so use a macro which looks like a function.
   */
! #define RelationIsScannable(relation) ((relation)->rd_ispopulated)
  
  
  /* routines in utils/cache/relcache.c */
--- 408,422 ----
   * 		populated by its query.  This is likely to get more complicated later,
   * 		so use a macro which looks like a function.
   */
! #define RelationIsScannable(relation) ((relation)->rd_rel->relispopulated)
! 
! /*
!  * RelationIsPopulated
!  * 		Currently, we don't physically distinguish the "populated" and
!  *		"scannable" properties of matviews, but that may change later.
!  *		Hence, use the appropriate one of these macros in code tests.
!  */
! #define RelationIsPopulated(relation) ((relation)->rd_rel->relispopulated)
  
  
  /* routines in utils/cache/relcache.c */
*** a/src/test/regress/expected/matview.out
--- b/src/test/regress/expected/matview.out
***************
*** 26,34 **** EXPLAIN (costs off)
  (2 rows)
  
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tm'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   f
  (1 row)
  
--- 26,34 ----
  (2 rows)
  
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT relispopulated FROM pg_class WHERE oid = 'tm'::regclass;
!  relispopulated 
! ----------------
   f
  (1 row)
  
***************
*** 36,44 **** SELECT * FROM tm;
  ERROR:  materialized view "tm" has not been populated
  HINT:  Use the REFRESH MATERIALIZED VIEW command.
  REFRESH MATERIALIZED VIEW tm;
! SELECT pg_relation_is_scannable('tm'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   t
  (1 row)
  
--- 36,44 ----
  ERROR:  materialized view "tm" has not been populated
  HINT:  Use the REFRESH MATERIALIZED VIEW command.
  REFRESH MATERIALIZED VIEW tm;
! SELECT relispopulated FROM pg_class WHERE oid = 'tm'::regclass;
!  relispopulated 
! ----------------
   t
  (1 row)
  
***************
*** 354,362 **** UNION ALL
             FROM v_test2;
  
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT pg_relation_is_scannable('mv_test3'::regclass);
!  pg_relation_is_scannable 
! --------------------------
   t
  (1 row)
  
--- 354,362 ----
             FROM v_test2;
  
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT relispopulated FROM pg_class WHERE oid = 'mv_test3'::regclass;
!  relispopulated 
! ----------------
   t
  (1 row)
  
*** a/src/test/regress/expected/rules.out
--- b/src/test/regress/expected/rules.out
***************
*** 1347,1353 **** SELECT viewname, definition FROM pg_views WHERE schemaname <> 'information_schem
                                   |     pg_get_userbyid(c.relowner) AS matviewowner,                                                                                                                                                               +
                                   |     t.spcname AS tablespace,                                                                                                                                                                                   +
                                   |     c.relhasindex AS hasindexes,                                                                                                                                                                               +
!                                  |     pg_relation_is_scannable(c.oid) AS isscannable,                                                                                                                                                            +
                                   |     pg_get_viewdef(c.oid) AS definition                                                                                                                                                                        +
                                   |    FROM ((pg_class c                                                                                                                                                                                           +
                                   |    LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))                                                                                                                                                     +
--- 1347,1353 ----
                                   |     pg_get_userbyid(c.relowner) AS matviewowner,                                                                                                                                                               +
                                   |     t.spcname AS tablespace,                                                                                                                                                                                   +
                                   |     c.relhasindex AS hasindexes,                                                                                                                                                                               +
!                                  |     c.relispopulated AS ispopulated,                                                                                                                                                                           +
                                   |     pg_get_viewdef(c.oid) AS definition                                                                                                                                                                        +
                                   |    FROM ((pg_class c                                                                                                                                                                                           +
                                   |    LEFT JOIN pg_namespace n ON ((n.oid = c.relnamespace)))                                                                                                                                                     +
*** a/src/test/regress/sql/matview.sql
--- b/src/test/regress/sql/matview.sql
***************
*** 15,24 **** SELECT * FROM tv ORDER BY type;
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT pg_relation_is_scannable('tm'::regclass);
  SELECT * FROM tm;
  REFRESH MATERIALIZED VIEW tm;
! SELECT pg_relation_is_scannable('tm'::regclass);
  CREATE UNIQUE INDEX tm_type ON tm (type);
  SELECT * FROM tm;
  
--- 15,24 ----
  EXPLAIN (costs off)
    CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
  CREATE MATERIALIZED VIEW tm AS SELECT type, sum(amt) AS totamt FROM t GROUP BY type WITH NO DATA;
! SELECT relispopulated FROM pg_class WHERE oid = 'tm'::regclass;
  SELECT * FROM tm;
  REFRESH MATERIALIZED VIEW tm;
! SELECT relispopulated FROM pg_class WHERE oid = 'tm'::regclass;
  CREATE UNIQUE INDEX tm_type ON tm (type);
  SELECT * FROM tm;
  
***************
*** 109,115 **** CREATE VIEW v_test2 AS SELECT moo, 2*moo FROM v_test1 UNION ALL SELECT moo, 3*mo
  CREATE MATERIALIZED VIEW mv_test2 AS SELECT moo, 2*moo FROM v_test2 UNION ALL SELECT moo, 3*moo FROM v_test2;
  \d+ mv_test2
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT pg_relation_is_scannable('mv_test3'::regclass);
  
  DROP VIEW v_test1 CASCADE;
  
--- 109,115 ----
  CREATE MATERIALIZED VIEW mv_test2 AS SELECT moo, 2*moo FROM v_test2 UNION ALL SELECT moo, 3*moo FROM v_test2;
  \d+ mv_test2
  CREATE MATERIALIZED VIEW mv_test3 AS SELECT * FROM mv_test2 WHERE moo = 12345;
! SELECT relispopulated FROM pg_class WHERE oid = 'mv_test3'::regclass;
  
  DROP VIEW v_test1 CASCADE;

#68

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Kevin Grittner (#65)

Re: Remaining beta blockers

Kevin Grittner <kgrittn@ymail.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

If you want to call the pg_class column relispopulated rather
than relisscannable, I have no particular objection to that

That column name and the wording of some comments are the main
things, although I'm also wondering whether it is bad form to force
users to test the pg_class.relispopulated column if they want to
test whether they can currently scan a matview, by removing the
pg_relation_is_scannable() function.� As I mentioned earlier when
you asked why these two distinct properties weren't both exposed, I
mentioned that I hadn't thought that the "populated" property was
likely to be useful at the SQL level, but then questioned that,
saying that I wasn't sure I picked the right property to pay
attention to in pg_dump - and if pg_dump needed the "populated"
property it had to be exposed.� I've come around to thinking that
it is more proper to use populated, but I have the same question
you asked earlier -- If it will be important or users to understand
that these are distinct properties, why are we just exposing one of
them?

That's fair. So what say we call the pg_class column relispopulated
or something like that, and reinstate pg_relation_is_scannable()
as a function, for any client-side code that wants to test that
property as distinct from is-populated?

The flip side of that is that it might be confusing to try
to explain why users should care which test they use before they
are capable of returning different results.

That's a good point too, though; if they are returning the same thing
right now, it's not very clear that users will pick the right test to
make anyway. Especially not if pg_relation_is_scannable() is a couple
orders of magnitude more expensive, which it will be, cf my original
complaint about pg_dump slowdown.

Also, rather than do the direct update to pg_class in pg_dump, how
would you feel about an ALTER MATERIALIZED VIEW option to set the
populated state?

It seems a bit late to be adding such a thing; moreover, how would
you inject any data without doing something like what pg_upgrade is
doing? I see no point in an ALTER command until there's some other
SQL-level infrastructure for incremental matview updates.

In the context of pg_dump's binary upgrade option, I had thought of
adding a new pg_upgrade_support function, but I saw that we already use
direct pg_class updates for other nearby binary-upgrade hacking; so it
didn't seem unreasonable to do it that way here.

I'm just reviewing the changes I made, and figured it might be good
to show a diff between my form of the patch and yours, but I'm
getting a lot spurious differences based on how we generate our
context diff files (or maybe the versions of some software
involved).� You you share how you generate your patch file?

I use git diff with the context-style-diff external helper that's
described in our wiki. It could well be a version-discrepancy
problem... this machine has got git version 1.7.9.6 and diffutils
2.8.1, and I think the latter is pretty old.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#69

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Kevin Grittner (#67)

Re: Remaining beta blockers

Kevin Grittner <kgrittn@ymail.com> writes:

Kevin Grittner <kgrittn@ymail.com> wrote:

That column name and the wording of some comments are the main
things

Patch for that attached.� I left the part where you got rid of the
SQL function to allow users to test whether a matview is currently
scannable, and I did not add an AMV option to change the populated
flag, since those haven't had any real discussion yet.

Per my other mail, I think adding an AMV option at this time is
inadvisable. I could go either way on removing or keeping the
is_scannable function --- anybody else have an opinion on that point?

Which of us is going to commit this? We're running low on time ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#70

Joshua D. Drake

jd@commandprompt.com

over 12 years ago

In reply to: Tom Lane (#69)

Re: Remaining beta blockers

On 05/06/2013 08:17 AM, Tom Lane wrote:

Per my other mail, I think adding an AMV option at this time is
inadvisable. I could go either way on removing or keeping the
is_scannable function --- anybody else have an opinion on that point?

Which of us is going to commit this? We're running low on time ...

As a my two cents, I have been watching this thread and the concern on
timeline is bothering me. I fully understand our want to get into Beta
and I know we don't want to slip schedule too much but quality is
important. It is what makes our project what it is more than any other
value we hold.

I also know we already slipped the beta once but we are not a
corporation, we do not have shareholders and nobody can fire us. If we
need to push it again for quality, shouldn't we?

Sincerely,

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#71

Kevin Grittner

kgrittn@ymail.com

over 12 years ago

In reply to: Tom Lane (#68)

Re: Remaining beta blockers

Tom Lane <tgl@sss.pgh.pa.us> wrote:

Kevin Grittner <kgrittn@ymail.com> writes:

The flip side of that is that it might be confusing to try
to explain why users should care which test they use before they
are capable of returning different results.

That's a good point too, though; if they are returning the same
thing right now, it's not very clear that users will pick the
right test to make anyway. Especially not if
pg_relation_is_scannable() is a couple orders of magnitude more
expensive, which it will be, cf my original complaint about
pg_dump slowdown.

Since the patch we have floating around drops it, let's leave it
that way, in the interest of saving time getting to beta. If it
was still there, I'd probably vote to leave it for the same reason.
It's pretty close to a toss-up at this point in terms of
cost/benefit, and that seems like the tie-breaker.

Also, rather than do the direct update to pg_class in pg_dump,
how would you feel about an ALTER MATERIALIZED VIEW option to
set the populated state?

It seems a bit late to be adding such a thing;

No kidding. The same could be said for the rest of this. It was
all talked to death months ago before I posted a patch which was
proposed for commit. All this eleventh hour drama bothers me.
I've always maintained we should add an ALTER capabilities for such
things once they are in the catalog. A few days ago they weren't.
Now they are.

moreover, how would you inject any data without doing something
like what pg_upgrade is doing?

I wouldn't. I'm talking about taking that code out of pg_upgrade
and putting it in the server under an ALTER command. If the point
of moving the info to the catalog was to avoid hacks, it would be
nice not to add a hack like that in the process.

I see no point in an ALTER command until there's some other
SQL-level infrastructure for incremental matview updates.

It's only important to avoid having client code directly update
system tables, which I generally view as a worthwhile goal.

In the context of pg_dump's binary upgrade option, I had thought
of adding a new pg_upgrade_support function, but I saw that we
already use direct pg_class updates for other nearby
binary-upgrade hacking; so it didn't seem unreasonable to do it
that way here.

In that case, I guess we might as well follow suit and do it the
way you have it for 9.3.

I didn't see anything I thought needed changing in your first patch
(to disable unlogged matviews), and my suggested changes to your
second patch (to move tracking of populated status to pg_class) are
just names, aliases, and comments. I suggest you review my
proposed tweak to your patch and apply both with any final
polishing you feel are appropriate.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#72

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Kevin Grittner (#71)

Re: Remaining beta blockers

Kevin Grittner <kgrittn@ymail.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

It seems a bit late to be adding such a thing;

No kidding.� The same could be said for the rest of this.� It was
all talked to death months ago before I posted a patch which was
proposed for commit.� All this eleventh hour drama bothers me.

Well, we've been going back and forth about it for weeks. Without a
looming deadline, we'd probably still just be arguing inconclusively ...

I didn't see anything I thought needed changing in your first patch
(to disable unlogged matviews), and my suggested changes to your
second patch (to move tracking of populated status to pg_class) are
just names, aliases, and comments.� I suggest you review my
proposed tweak to your patch and apply both with any final
polishing you feel are appropriate.

OK, will do.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#73

Stephen Frost

sfrost@snowman.net

over 12 years ago

In reply to: Kevin Grittner (#71)

Re: Remaining beta blockers

* Kevin Grittner (kgrittn@ymail.com) wrote:

Since the patch we have floating around drops it, let's leave it
that way, in the interest of saving time getting to beta. If it
was still there, I'd probably vote to leave it for the same reason.

I'll vote for dropping it also, though for a slightly different reason-
people can't build things on something that isn't there. Given that
we're still discussing it, that strikes me as the best idea. What goes
into 9.4 could be quite different and it's a lot easier if we don't have
to deal with supporting what may end up being the 'old' approach.

Thanks,

Stephen