what to revert

Started by Robert Haasalmost 10 years ago56 messageshackers
Jump to latest
#1Robert Haas
robertmhaas@gmail.com

On Tue, May 3, 2016 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

There are a lot more than 2 patchsets that are busted at the moment,
unfortunately, but I assume you are referring to "snapshot too old"
and "Use Foreign Key relationships to infer multi-column join
selectivity".

Yeah, those are the ones I'm thinking of. I've not heard serious
proposals to revert any others, have you?

Here's a list of what I think is currently broken in 9.6 that we might
conceivably fix by reverting patches:

- Snapshot Too Old. Tom, Andres, and Bruce want this reverted. It
regresses performance significantly when turned on. It originally
regressed performance significantly even when turned off, but that
might be fixed now. Also, it seems to be broken for hash indexes, per
Amit Kapila's report.

- Atomic Pin/Unpin. There are two different open items related to
this, both apparently relating to testing done by Jeff Janes. I am
not sure whether they are really independent reports of different
problems or just two reports of the same problem. But they sound like
fairly serious breakage.

- Predefined Roles. Neither you nor I liked Stephen's design. It
slowed down pg_dump. It also broke pg_dump for non-superusers and
something about bypassrls. None of these issues have been fixed
despite considerable time having gone by.

- Checkpoint Sorting and Flushing. Andres tried to fix the last
report of problems with this in
72a98a639574d2e25ed94652848555900c81a799, but there were almost
immediately new reports.

- Foreign Keys and Multi-Column Outer Joins. You called for a revert,
and the author responded with various thoughts and counterproposals.

There are a few other open items, but I would consider reverting the
antecedent patches as either impractical or disproportionate in those
cases. Aside from the two cases you mentioned, I don't think that
anyone's actually called for these other patches to be reverted, but
I'm not sure that we shouldn't be considering it. What do you (and
others) think?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#1)
Re: what to revert

On Tue, May 3, 2016 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:

- Snapshot Too Old. Tom, Andres, and Bruce want this reverted.
It regresses performance significantly when turned on.

When turned on, it improves performance in some cases and regresses
performance in others. Don't forget it is currently back-patched
to 9.4 and in use for production by users who could not use
PostgreSQL without the feature. PostgreSQL failed their
performance tests miserably without the feature, and passes with
it.

It originally regressed performance significantly even when
turned off,

Which was wildly exaggerated since most of the benchmarks
purporting to show that actually had it turned on. I don't think
the FUD from that has really evaporated.

but that might be fixed now.

Certainly all evidence suggests that, FUD to the contrary.

Also, it seems to be broken for hash indexes, per Amit Kapila's
report.

Yeah, with a fairly simple fix suggested immediately by Amit. I'm
looking into a couple other angles for possible fixes, but
certainly what he suggested could be done before beta1.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#1)
Re: what to revert

On 2016-05-03 11:58:34 -0400, Robert Haas wrote:

- Atomic Pin/Unpin. There are two different open items related to
this, both apparently relating to testing done by Jeff Janes. I am
not sure whether they are really independent reports of different
problems or just two reports of the same problem. But they sound like
fairly serious breakage.

It's a bit hard to say, given the test take this long to run, but so far
I've a fair amount of doubt that the bug report is actually related to
the atomic pin/unpin. It appears to me - I'm in the process of trying
to confirm (again long runs) that the pin/unpin patch merely changed the
timing.

- Checkpoint Sorting and Flushing. Andres tried to fix the last
report of problems with this in
72a98a639574d2e25ed94652848555900c81a799, but there were almost
immediately new reports.

Yea, it's an issue with the 72a98a639574d2e25ed94652848555900c81a799,
not checkpoint flushing itself. Turns out that mdsync() *wants/needs* to
access deleted segments. Easily enough fixed. I've posted a patch, which
I plan to commit unless somebody wants to give input on the flag design.

There are a few other open items, but I would consider reverting the
antecedent patches as either impractical or disproportionate in those
cases. Aside from the two cases you mentioned, I don't think that
anyone's actually called for these other patches to be reverted, but
I'm not sure that we shouldn't be considering it. What do you (and
others) think?

I'm *really* doubtful about the logical timeline following patches. I
think they're, as committed, over-complicated and pretty much unusable.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#2)
Re: what to revert

On 2016-05-03 11:12:03 -0500, Kevin Grittner wrote:

On Tue, May 3, 2016 at 10:58 AM, Robert Haas <robertmhaas@gmail.com> wrote:

- Snapshot Too Old. Tom, Andres, and Bruce want this reverted.
It regresses performance significantly when turned on.

When turned on, it improves performance in some cases and regresses
performance in others. Don't forget it is currently back-patched
to 9.4 and in use for production by users who could not use
PostgreSQL without the feature. PostgreSQL failed their
performance tests miserably without the feature, and passes with
it.

It originally regressed performance significantly even when
turned off,

Which was wildly exaggerated since most of the benchmarks
purporting to show that actually had it turned on. I don't think
the FUD from that has really evaporated.

Oh, ffs. The first report of the whole issue was *with default
parameters*:
http://archives.postgresql.org/message-id/CAPpHfdtMONZFOXSsw1HkrD9Eb4ozF8Q8oCqH4tkpH_girJPPuA%40mail.gmail.com

The issue with 0 v. -1 (and 0 vs. > 0 makes a big performance
difference, so it's not that surprising to interpret numbers that way)
was immediately addressed by another round of benchmarks after you
pointed it out.

but that might be fixed now.

Certainly all evidence suggests that, FUD to the contrary.

So it's now FUD to report issues with a patch that obviously hasn't
received sufficient benchmarking? Give me break.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#4)
Re: what to revert

On Tue, May 3, 2016 at 12:22 PM, Andres Freund <andres@anarazel.de> wrote:

but that might be fixed now.

Certainly all evidence suggests that, FUD to the contrary.

So it's now FUD to report issues with a patch that obviously hasn't
received sufficient benchmarking? Give me break.

Yeah, I don't think that's FUD. Kevin, since your last fix, we don't
have a round of benchmarking on a big machine to show whether that
fixed the issue or not. I think that to really know whether this is
fixed, somebody would need to compare current master with current
master after reverting snapshot too old on a big machine and see if
there's a difference. If anyone has done that, they have not posted
the results. So it's more accurate to say that we just don't know.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#4)
Re: what to revert

On Tue, May 3, 2016 at 11:22 AM, Andres Freund <andres@anarazel.de> wrote:

The issue with 0 v. -1 (and 0 vs. > 0 makes a big performance
difference, so it's not that surprising to interpret numbers that way)

... if you fail to read the documentation of the feature or the
code implementing it before testing.

was immediately addressed by another round of benchmarks after you
pointed it out.

Which showed a 4% maximum hit before moving the test for whether it
was "off" inline. (I'm not clear from the posted results whether
that was before or after skipping the spinlock when the feature was
off.) All tests that I have done and that others have done (some
on big NUMA machines) and shared with me show no regression now. I
haven't been willing to declare victory on that basis without
hearing back from others who were able to show a regression before.
If there is still a regression found when "off", there is one more
test of old_snapshot_threshold which could easily be shifted
in-line; it just seems unnecessary given the other work done in
that area unless there is evidence that it is really needed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#3)
Re: what to revert

Andres Freund wrote:

I'm *really* doubtful about the logical timeline following patches. I
think they're, as committed, over-complicated and pretty much unusable.

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Alvaro Herrera (#7)
Re: what to revert

On Tue, May 3, 2016 at 9:58 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

I think it's a positive development that we can take this attitude to
reverting patches. It should not be seen as a big personal failure,
because it isn't. Stigmatizing reverts incentivizes behavior that
leads to bad outcomes.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#1)
Re: what to revert

* Robert Haas (robertmhaas@gmail.com) wrote:

Here's a list of what I think is currently broken in 9.6 that we might
conceivably fix by reverting patches:

[...]

- Predefined Roles. Neither you nor I liked Stephen's design. It
slowed down pg_dump. It also broke pg_dump for non-superusers and
something about bypassrls. None of these issues have been fixed
despite considerable time having gone by.

The issues you list are not with predefined roles at all, but with the
changes to dump ACLs defined against objects in pg_catalog. I've also
got patches to address those issues already developed and (mostly)
posted. I'll be posting a new set later today which addresses all of
the known issues with dumping catalog ACLs.

There is an ongoing thread where Noah and I have been discussing the
dumping of catalog ACLs issues and the TAP test suite which I've been
developing for pg_dump. Certainly, anyone is welcome to join in that
discussion.

As mentioned before, the concern raised about predefined roles are the
checks which were added to make them unlike regular roles. I'll be
posting a patch tomorrow to revert those checks, as I mentioned on
another thread earlier today.

Thanks!

Stephen

#10Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#6)
Re: what to revert

On 2016-05-03 11:46:23 -0500, Kevin Grittner wrote:

was immediately addressed by another round of benchmarks after you
pointed it out.

Which showed a 4% maximum hit before moving the test for whether it
was "off" inline.

(I'm not clear from the posted results whether that was before or
after skipping the spinlock when the feature was off.)

They're from after the spinlock issue was resolved. Before that the
issue was a lot worse (see mail linked two messages upthread).

I'm pretty sure that I said that somewhere else at least once: But to be
absolutely clear, I'm *not* really concerned with the performance with
the feature turned off. I'm concerned about the performance with it
turned on.

Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#8)
Re: what to revert

On 2016-05-03 10:12:51 -0700, Peter Geoghegan wrote:

On Tue, May 3, 2016 at 9:58 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

I think it's a positive development that we can take this attitude to
reverting patches. It should not be seen as a big personal failure,
because it isn't. Stigmatizing reverts incentivizes behavior that
leads to bad outcomes.

+1

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Peter Geoghegan (#8)
Re: what to revert

On 05/03/2016 07:12 PM, Peter Geoghegan wrote:

On Tue, May 3, 2016 at 9:58 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

I think it's a positive development that we can take this attitude to
reverting patches. It should not be seen as a big personal failure,
because it isn't. Stigmatizing reverts incentivizes behavior that
leads to bad outcomes.

Absolutely +1

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#10)
Re: what to revert

On 05/03/2016 07:41 PM, Andres Freund wrote:

On 2016-05-03 11:46:23 -0500, Kevin Grittner wrote:

was immediately addressed by another round of benchmarks after you
pointed it out.

Which showed a 4% maximum hit before moving the test for whether it
was "off" inline.

(I'm not clear from the posted results whether that was before or
after skipping the spinlock when the feature was off.)

They're from after the spinlock issue was resolved. Before that the
issue was a lot worse (see mail linked two messages upthread).

I'm pretty sure that I said that somewhere else at least once: But to
be absolutely clear, I'm *not* really concerned with the performance
with the feature turned off. I'm concerned about the performance with
it turned on.

If you tell me how to best test it, I do have a 4-socket server sitting
idly in the corner (well, a corner reachable by SSH). I can get us some
numbers, but I haven't been following the snapshot_too_old so I'll need
some guidance on what to test.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Ants Aasma
ants.aasma@cybertec.at
In reply to: Robert Haas (#1)
Re: what to revert

On Tue, May 3, 2016 at 9:57 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

If you tell me how to best test it, I do have a 4-socket server sitting idly
in the corner (well, a corner reachable by SSH). I can get us some numbers,
but I haven't been following the snapshot_too_old so I'll need some guidance
on what to test.

I worry about two contention points with the current implementation.

The main one is the locking within MaintainOldSnapshotTimeMapping()
that gets called every time a snapshot is taken. AFAICS this should
show up by setting old_snapshot_threshold to any positive value and
then running a simple within shared buffers scale factor read only
pgbench at high concurrency (number of CPUs or a small multiple). On a
single socket system this does not show up.

The second one is probably a bit harder to hit,
GetOldSnapshotThresholdTimestamp() has a spinlock that gets hit
everytime a scan sees a page that has been modified after the snapshot
was taken. A workload that would tickle this is something that uses a
repeatable read snapshot, builds a non-temporary table and runs
reporting on it. Something like this would work:

BEGIN ISOLATION LEVEL REPEATABLE READ;
DROP TABLE IF EXISTS test_:client_id;
CREATE TABLE test_:client_id (x int, filler text);
INSERT INTO test_:client_id SELECT x, repeat(' ', 1000) AS filler
FROM generate_series(1,1000) x;
SELECT (SELECT COUNT(*) FROM test_:client_id WHERE x != y) FROM
generate_series(1,1000) y;
COMMIT;

With this script running with -c4 on a 4 core workstation I'm seeing
the following kind of contention and a >2x loss in throughput:

+   14.77%  postgres  postgres           [.] GetOldSnapshotThresholdTimestamp
-    8.01%  postgres  postgres           [.] s_lock
   - s_lock
      + 88.15% GetOldSnapshotThresholdTimestamp
      + 10.47% TransactionIdLimitedForOldSnapshots
      + 0.71% TestForOldSnapshot_impl
      + 0.57% GetSnapshotCurrentTimestamp

Now this is kind of an extreme example, but I'm willing to bet that on
multi socket hosts similar issues can crop up with common real world
use cases.

Regards,
Ants Aasma

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andres Freund
andres@anarazel.de
In reply to: Ants Aasma (#14)
Re: what to revert

On 2016-05-04 00:01:20 +0300, Ants Aasma wrote:

On Tue, May 3, 2016 at 9:57 PM, Tomas Vondra
<tomas.vondra@2ndquadrant.com> wrote:

If you tell me how to best test it, I do have a 4-socket server sitting idly
in the corner (well, a corner reachable by SSH). I can get us some numbers,
but I haven't been following the snapshot_too_old so I'll need some guidance
on what to test.

I worry about two contention points with the current implementation.

The main one is the locking within MaintainOldSnapshotTimeMapping()
that gets called every time a snapshot is taken. AFAICS this should
show up by setting old_snapshot_threshold to any positive value and
then running a simple within shared buffers scale factor read only
pgbench at high concurrency (number of CPUs or a small multiple). On a
single socket system this does not show up.

On a two socket system it does, check the bottom of:
http://archives.postgresql.org/message-id/20160413171955.i53me46fqqhdlztq%40alap3.anarazel.de
Note that this (accidentally) compares thresholds 0 and 10 (instead of
-1 and 10), but that's actually interesting for this question because of
the quick exit in MaintainOldSnapshotData:
/* No further tracking needed for 0 (used for testing). */
if (old_snapshot_threshold == 0)
return;
which happens before the lwlock is taken.

The second one is probably a bit harder to hit,
GetOldSnapshotThresholdTimestamp() has a spinlock that gets hit
everytime a scan sees a page that has been modified after the snapshot
was taken. A workload that would tickle this is something that uses a
repeatable read snapshot, builds a non-temporary table and runs
reporting on it.

I'm not particularly concerned about that kind of issue - we can quite
easily replace that spinlock with 64bit atomic reads/writes for which
I've already proposed a patch. I'd expect that to go into 9.7.

Greetings,

Andres Freund

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Andres Freund
andres@anarazel.de
In reply to: Tomas Vondra (#13)
Re: what to revert

On 2016-05-03 20:57:13 +0200, Tomas Vondra wrote:

On 05/03/2016 07:41 PM, Andres Freund wrote:

On 2016-05-03 11:46:23 -0500, Kevin Grittner wrote:

was immediately addressed by another round of benchmarks after you
pointed it out.

Which showed a 4% maximum hit before moving the test for whether it
was "off" inline.

(I'm not clear from the posted results whether that was before or
after skipping the spinlock when the feature was off.)

They're from after the spinlock issue was resolved. Before that the
issue was a lot worse (see mail linked two messages upthread).

I'm pretty sure that I said that somewhere else at least once: But to
be absolutely clear, I'm *not* really concerned with the performance
with the feature turned off. I'm concerned about the performance with
it turned on.

If you tell me how to best test it, I do have a 4-socket server sitting idly
in the corner (well, a corner reachable by SSH). I can get us some numbers,
but I haven't been following the snapshot_too_old so I'll need some guidance
on what to test.

I think it'd be cool if you could test the effect of the feature in
read-only (and additionally read-mostly?) workload with various client
counts and snapshot_too_old values. For the latter maybe -1, 0, 10, 60
or such? I've done so (accidentally comparing 0 and 1 instead of -1 and
1) on a two socket machine in:
www.postgresql.org/message-id/20160413171955.i53me46fqqhdlztq@alap3.anarazel.de

It'd be very interesting to see how big the penalty is on a bigger box.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Craig Ringer
craig@2ndquadrant.com
In reply to: Peter Geoghegan (#8)
Re: what to revert

On 4 May 2016 at 01:12, Peter Geoghegan <pg@heroku.com> wrote:

On Tue, May 3, 2016 at 9:58 AM, Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

I think it's a positive development that we can take this attitude to
reverting patches. It should not be seen as a big personal failure,
because it isn't. Stigmatizing reverts incentivizes behavior that
leads to bad outcomes.

Indeed. Being scared to revert also makes the barrier to committing
something and getting it into more hands, for longer, under a variety of
different conditions higher. Not a ton of help with this particular feature
but quite important with others.

While I'm personally disappointed by this outcome, I also can't really
disagree with it. The whole area is a bit of a mess and hard to work on,
and as demonstrated my understanding of it when I wrote the original patch
was incomplete. We could commit the rewritten function, but ... rewriting a
feature just before beta1 probably says it's just not baked yet, right?

The upside of all this is that now I have a decent picture of how it should
all look and how timeline following, failover capability etc can be
introduced in a staged way. I'd also like to get rid of the use of various
globals to pass timeline information "around" the walsender and share more
of the logic between the code paths.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#18Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#1)
Re: what to revert

On Tue, May 3, 2016 at 9:28 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 3, 2016 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

There are a lot more than 2 patchsets that are busted at the moment,
unfortunately, but I assume you are referring to "snapshot too old"
and "Use Foreign Key relationships to infer multi-column join
selectivity".

Yeah, those are the ones I'm thinking of. I've not heard serious
proposals to revert any others, have you?

Here's a list of what I think is currently broken in 9.6 that we might
conceivably fix by reverting patches:

Yes, that would be a way forward for 9.6 if we are not able to close
blocking open items before beta1. However, I think it would be bad if we
miss some of the below listed important features like snapshot_too_old or
atomic pin/unpin for 9.6. Can we consider to postpone beta1, so that the
patch authors get time to resolve blocking issues? I think there could be
a strong argument that it is just a waste of time if the situation doesn't
improve much even after delay, but it seems we can rely on people involved
in those patch sets to make a progress.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#18)
Re: what to revert

Amit Kapila <amit.kapila16@gmail.com> writes:

Yes, that would be a way forward for 9.6 if we are not able to close
blocking open items before beta1. However, I think it would be bad if we
miss some of the below listed important features like snapshot_too_old or
atomic pin/unpin for 9.6. Can we consider to postpone beta1, so that the
patch authors get time to resolve blocking issues?

This was already considered and rejected by the release team. Most of
the patches in question went in very close to the feature freeze deadline
(all but one, in fact, in the last week) and there is every reason to
suspect that they were rushed rather than really being ready to commit.
We should not allow an entire year's worth of work to slide in the
possibly-vain hope that these few patches can get fixed up if they're
given more time.

The bigger picture here is that we'd all like to get back to development
sooner rather than later. The longer it takes to stabilize 9.6, the
longer it will be before the tree reopens for new development. That
consideration should make us very willing to revert problematic changes
and let their authors try again in the next cycle.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Craig Ringer (#17)
Re: what to revert

On 03-05-2016 20:25, Craig Ringer wrote:

On 4 May 2016 at 01:12, Peter Geoghegan <pg@heroku.com
<mailto:pg@heroku.com>> wrote:

On Tue, May 3, 2016 at 9:58 AM, Alvaro Herrera
<alvherre@2ndquadrant.com <mailto:alvherre@2ndquadrant.com>> wrote:

As its committer, I tend to agree about reverting that feature. Craig
was just posting some more patches, and I have the pg_recvlogical
changes here (--endpos) which after some testing are not quite looking
ready to go -- plus we still have to write the actual Perl test scripts
that would use it. Taken together, this is now looking to me a bit
rushed, so I prefer to cut my losses here and revert the patch so that
we can revisit it for 9.7.

I think it's a positive development that we can take this attitude to
reverting patches. It should not be seen as a big personal failure,
because it isn't. Stigmatizing reverts incentivizes behavior that
leads to bad outcomes.

Indeed. Being scared to revert also makes the barrier to committing
something and getting it into more hands, for longer, under a variety of
different conditions higher. Not a ton of help with this particular
feature but quite important with others.

While I'm personally disappointed by this outcome, I also can't really
disagree with it. The whole area is a bit of a mess and hard to work on,
and as demonstrated my understanding of it when I wrote the original
patch was incomplete. We could commit the rewritten function, but ...
rewriting a feature just before beta1 probably says it's just not baked
yet, right?

You said that in another thread...

The upside of all this is that now I have a decent picture of how it
should all look and how timeline following, failover capability etc can
be introduced in a staged way. I'd also like to get rid of the use of
various globals to pass timeline information "around" the walsender and
share more of the logic between the code paths.

Question is: is the actual code so useless that it can't even be a 1.0
release? A lot of (complex) features were introduced with the notion
that will be improved in the next version. However, if the code is buggy
or there are serious API problems, revert them.

--
Euler Taveira Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Craig Ringer
craig@2ndquadrant.com
In reply to: Euler Taveira de Oliveira (#20)
#22Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Craig Ringer (#21)
#23Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#15)
#24Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#23)
#25Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#24)
#26Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#25)
#28Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andres Freund (#27)
In reply to: Andres Freund (#27)
#30Andres Freund
andres@anarazel.de
In reply to: Kevin Grittner (#28)
#31Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#30)
#32Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#31)
#33Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#32)
#34Ants Aasma
ants.aasma@cybertec.at
In reply to: Andres Freund (#15)
#35Andres Freund
andres@anarazel.de
In reply to: Ants Aasma (#34)
#36Amit Kapila
amit.kapila16@gmail.com
In reply to: Andres Freund (#35)
#37Ants Aasma
ants.aasma@cybertec.at
In reply to: Andres Freund (#15)
#38Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Ants Aasma (#37)
#39Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Andres Freund (#16)
#40Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#39)
#41Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tomas Vondra (#40)
#42Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#41)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#42)
#44Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tom Lane (#43)
#45Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kevin Grittner (#41)
#46Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tomas Vondra (#45)
#47Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kevin Grittner (#42)
#48Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#47)
#49Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Robert Haas (#48)
#50Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Tomas Vondra (#45)
#51Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#50)
#52Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#48)
#53Noah Misch
noah@leadboat.com
In reply to: Kevin Grittner (#44)
#54Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Kevin Grittner (#50)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Tomas Vondra (#54)
#56Tomas Vondra
tomas.vondra@2ndquadrant.com
In reply to: Tomas Vondra (#54)