MMAP Buffers

Started by Radosław Smogura, over 14 years ago, 179 messages
#1Radosław Smogura
rsmogura@softperience.eu

Hello,

If I may, I want to share a concept for using mmap in PG. It's far, far
from perfect, but it keeps WAL before data. I also created a table with
an index, inserted a few values, and ran VACUUM FULL on this table.
A database initialized (initdb) from the original sources is accepted.

Read performance (once the backend is loaded) is really good: query time
goes down from 450ms to about 410ms. Updates may be slower, but work is
in progress (I will start with writes, as I have reached the point where
simple updates can be performed). Even though I haven't covered all
aspects of updating, it's simple to do: just call PreopareBufferToUpdate
before modifying a buffer. Of course, some ideas for improving this are
still in my head.

Any comments, suggestions welcome.

I didn't include this inline as a diff because of its ~150kB size (mainly
configure scripts, which are included in version control). Instead, you
may download it from
http://softperience.eu/downloads/pg_mmap_20110415.diff.bz2 (Legal: work
under the PostgreSQL BSD License). The patch is just a Git diff; later I
will try to set up a Git repository.

Regards and have a nice day,
Radek.

P.S. About the problem with the assert on signals that I wrote about: I
merged with the latest master and rebuilt the code. I think I forgot to
rebuild it after the previous merge.

#2Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Radosław Smogura (#1)
Re: MMAP Buffers

On 15.04.2011 13:32, Radosław Smogura wrote:

If I may, I want to share a concept for using mmap in PG. It's far, far
from perfect, but it keeps WAL before data. I also created a table with
an index, inserted a few values, and ran VACUUM FULL on this table.
A database initialized (initdb) from the original sources is accepted.

Read performance (once the backend is loaded) is really good: query time
goes down from 450ms to about 410ms. Updates may be slower, but work is
in progress (I will start with writes, as I have reached the point where
simple updates can be performed). Even though I haven't covered all
aspects of updating, it's simple to do: just call PreopareBufferToUpdate
before modifying a buffer. Of course, some ideas for improving this are
still in my head.

Any comments, suggestions welcome.

The patch is quite hard to read because of random whitespace changes and
other stylistic issues, but I have a couple of high-level questions on
the design:

* Does each process have its own mmappings, or are the mmapping done to
shared buffers?

* How do you handle locking? Do you still need to allocate a shared
buffer for each mmapped page?

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#3Radosław Smogura
rsmogura@softperience.eu
In reply to: Heikki Linnakangas (#2)
Re: MMAP Buffers

On Fri, 15 Apr 2011 14:33:37 +0300, Heikki Linnakangas wrote:

On 15.04.2011 13:32, Radosław Smogura wrote:

If I may, I want to share a concept for using mmap in PG. It's far, far
from perfect, but it keeps WAL before data. I also created a table with
an index, inserted a few values, and ran VACUUM FULL on this table.
A database initialized (initdb) from the original sources is accepted.

Read performance (once the backend is loaded) is really good: query time
goes down from 450ms to about 410ms. Updates may be slower, but work is
in progress (I will start with writes, as I have reached the point where
simple updates can be performed). Even though I haven't covered all
aspects of updating, it's simple to do: just call PreopareBufferToUpdate
before modifying a buffer. Of course, some ideas for improving this are
still in my head.

Any comments, suggestions welcome.

The patch is quite hard to read because of random whitespace changes
and other stylistic issues, but I have a couple of high-level
questions on the design:

Yes, but, hmm... in Netbeans I had really long gaps (probably 8 spaces,
from tabs), so deeper "ifs" and comments at the end of variables went
off my screen. I really wanted not to reformat this, but sometimes I
needed to.

* Does each process have its own mmappings, or are the mmapping done
to shared buffers?

Those are MAP_SHARED mappings, but each process has its own pointer to
them.

* How do you handle locking?

I do not do locking... I do a different (worse) thing... When a buffer
is to be updated, it gets a shared buffer and the content is copied (so
the situation is almost the same as with fread). Depending on the
situation, the content is either used directly, or the pages of the
mmapped and shared (updatable) regions are swapped, which keeps tuple
pointers, etc., valid. I would be really happy if such a method (locking
the flush to file) existed.

Do you still need to allocate a shared buffer for each mmapped page?

Currently each mmapped page has an additional shared buffer, but the
code is almost ready to use an independent pool of shared buffers. This
will be good, as mmapped buffers could cover the whole system cache,
keeping maybe 10%-20% of that size in SHMEM for writes.

I am thinking about MAP_PRIVATE, but it has some pluses and minuses;
e.g. MAP_SHARED could, for less critical systems, more simply be
combined with a GUC such as mmap_direct_write=true.
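
As a rough illustration of the MAP_SHARED answer above (a sketch only,
not code from the patch; the helper name sketch_map_shared is made up
for this example), each process could obtain its own pointer to the same
file-backed pages roughly like this:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch: map a whole file
 * read-only with MAP_SHARED.  Every caller gets its own address, but
 * all of them are backed by the same page-cache pages.
 * Returns NULL on failure; on success *len receives the file size.
 */
static const char *
sketch_map_shared(const char *path, size_t *len)
{
    struct stat st;
    void       *addr;
    int         fd;

    assert(path != NULL && len != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return NULL;
    }

    addr = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (addr == MAP_FAILED)
        return NULL;

    *len = (size_t) st.st_size;
    return (const char *) addr;
}
```

Two calls on the same file generally return two different addresses, so
pointers into a mapping are only meaningful inside the process that
created it.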

Regards,
Radek

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Radosław Smogura (#3)
Re: MMAP Buffers

On 04/15/2011 08:12 AM, Radosław Smogura wrote:

The patch is quite hard to read because of random whitespace changes
and other stylistic issues, but I have a couple of high-level
questions on the design:

Yes, but, hmm... in Netbeans I had really long gaps (probably 8
spaces, from tabs), so deeper "ifs" and comments at the end of
variables went off my screen. I really wanted not to reformat this, but
sometimes I needed to.

Netbeans is possibly not very well suited to working on postgres code.
AFAIK emacs and/or vi(m) are used by almost all the major developers.

cheers

andrew

#5Joshua Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#4)
Re: MMAP Buffers

Radoslaw,

10% improvement isn't very impressive from a switch to mmap. What workload did you test with? What I'd really like to see is testing with databases which are 50%, 90% and 200% the size of RAM ... that's where I'd expect the greatest gain from limiting copying.

Netbeans is possibly not very well suited to working on postgres code.
AFAIK emacs and/or vi(m) are used by almost all the major developers.

Guys, can we *please* focus on the patch for now, rather than the formatting, which is fixable with sed?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

#6Radosław Smogura
rsmogura@softperience.eu
In reply to: Joshua Berkus (#5)
1 attachment(s)
Re: MMAP Buffers

Joshua Berkus <josh@agliodbs.com> Friday 15 April 2011 18:55:04

Radoslaw,

10% improvement isn't very impressive from a switch to mmap. What workload
did you test with? What I'd really like to see is testing with databases
which are 50%, 90% and 200% the size of RAM ... that's where I'd expect
the greatest gain from limiting copying.

I think 10% is quite good, as my stand-alone test of mmap vs. read showed
that the speed-up when copying 100MB of data to memory may be from ~20ms
to ~100ms (depending on the destination address). Of course a deeper
system test simulating real usage will say more. In any case, once writes
are handled well, I will speed up reads. I am thinking of bypassing
smgr/md much more and exposing shared ids (1,2,3...) for each file
segment.

Going to topic...

Attached are the test scripts I used to fill the data; nothing complex
(left over from 2nd level caches).

The query I used for measurement was SELECT count(substr(content, 1, 1))
FROM testcase1 WHERE multi_id > 50000;

Timings were taken from psql.

I haven't run load tests (I have about 2GB of free space on /home, and
4GB RAM) or stress tests yet (I'm not quite ready to try concurrent
updates of the same page; they may fail, and the place to fix is noted
in the code).

Netbeans is possibly not very well suited to working on postgres code.
AFAIK emacs and/or vi(m) are used by almost all the major developers.

Guys, can we *please* focus on the patch for now, rather than the
formatting, which is fixable with sed?

Netbeans is quite good, of course it depends who likes what. Just try 7.0 RC
2.

Regards,
Radek

Attachments:

test-scritps_20110319_0026.tar.bz2 (application/x-bzip-compressed-tar)

#7Greg Smith
greg@2ndquadrant.com
In reply to: Joshua Berkus (#5)
Re: MMAP Buffers

Joshua Berkus wrote:

Guys, can we *please* focus on the patch for now, rather than the formatting, which is fixable with sed?

Never, and that's not true. Heikki was being nice; I wouldn't have even
slogged through it long enough to ask the questions he did before
kicking it back as unusable. A badly formatted patch makes it
impossible to evaluate whether the changes from a submission are
reasonable or not without the reviewer fixing it first. And you can't
automate correcting it, it takes a lot of tedious manual work. Start
doing a patch review every CommitFest cycle and you very quickly realize
it's not an ignorable problem. And lack of discipline in minimizing
one's diff is always a sign of other code quality issues.

Potential contributors to PostgreSQL should know that a badly formatted
patch faces an automatic rejection, because no reviewer can work with it
easily. This fact is not a mystery; in fact it's documented at
http://wiki.postgresql.org/wiki/Submitting_a_Patch : "The easiest way
to get your patch rejected is to make lots of unrelated changes, like
reformatting lines, correcting comments you felt were poorly worded etc.
Each patch should have the minimum set of changes required to fulfil the
single stated objective." I think I'll go improve that text
next--something like "Ways to get your patch rejected" should be its own
section.

The problem here isn't whether someone used an IDE or not, it's that
this proves they didn't read their own patch before submitting it.
Reading one's own diff and reflecting on what you've changed is one of
the extremely underappreciated practices of good open-source software
development. Minimizing the size of that diff is perhaps the most
important thing someone can do in order to make their changes to a piece
of software better. Not saying something that leads in that direction
would be a disservice to the submitter.

P.S. You know what else I feel should earn an automatic rejection
without any reviewer even looking at the code? Submitting a patch that
claims to improve performance and not attaching the test case you used,
along with detailed notes about before/after tests on your own
hardware. A hand wave "it's faster" is never good enough, and it's
extremely wasteful of our limited reviewer resources to try and
duplicate what the submitter claimed. Going to add something about that
to the submission guidelines too.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#8Robert Haas
robertmhaas@gmail.com
In reply to: Greg Smith (#7)
Re: MMAP Buffers

On Apr 16, 2011, at 1:48 AM, Greg Smith <greg@2ndquadrant.com> wrote:

P.S. You know what else I feel should earn an automatic rejection without any reviewer even looking at the code?

Greg is absolutely right. And to the two he listed, let me add another of my own gripes: failing to provide submission notes that explain how the patch works, and how it addresses the conceptually difficult issues raised previously. The OP says that this patch maintains the WAL-before-data rule without any explanation of how it accomplishes that seemingly quite amazing feat. I assume I'm going to have to read this patch at some point to refute this assertion, and I think that sucks. I am pretty nearly 100% confident that this approach is utterly doomed, and I don't want to spend a lot of time on it unless someone can provide me with a compelling explanation of why my confidence is misplaced. But spending a lot of time on it is exactly what I'm going to have to do, because reading an undocumented patch full of spurious garbage to refute a hand-wavy claim of correctness is time-consuming, and if I give up on it without reading it, someone will yell "unfair, unfair!"

None of this is to say that I don't appreciate Radoslaw's interest in contributing, because I very much do. But I also think it's important to realize that we have a finite number of reviewers and they have finite time. Trying to minimize the amount of time that it takes someone to review or commit your patch is a service to the whole community, and we should acknowledge that it has value and appreciate the people who consistently do it.

...Robert

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#7)
Re: MMAP Buffers

Greg Smith <greg@2ndquadrant.com> writes:

Reading one's own diff and reflecting on what you've changed is one of
the extremely underappreciated practices of good open-source software
development. Minimizing the size of that diff is perhaps the most
important thing someone can do in order to make their changes to a piece
of software better. Not saying something that leads in that direction
would be a disservice to the submitter.

A couple further comments on that thought:

* Another argument for avoiding unnecessary changes is that the larger
your patch's change footprint, the more likely it is to create merge
conflicts for people working on other patches. Now, if they're
necessary changes, that's the price of parallelism in development.
But gratuitous whitespace changes add nothing and they do have costs.

* On the other side of the coin, I have seen many a patch that was
written to minimize the length of the diff to the detriment of
readability or maintainability of the resulting code, and that's *not*
a good tradeoff. Always do what makes the most sense from a long-run
perspective.

I keep wanting to do a talk arguing that everything you need to know
about good patch style can be derived from the mantra "Make the patch
look like the code had always been there". If the functionality had
been designed in on day one, where would it be placed and how would it
be coded? You might be able to make the patch diff shorter with some
shortcut or other, but that's not the way to do it.

regards, tom lane

#10Greg Smith
greg@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: MMAP Buffers

Tom Lane wrote:

* On the other side of the coin, I have seen many a patch that was
written to minimize the length of the diff to the detriment of
readability or maintainability of the resulting code, and that's *not*
a good tradeoff.

Sure, that's possible. But based on the reviews I've done, I'd say that
the fact someone is even aware that minimizing their diff is something
important to consider automatically puts them far ahead of the average
new submitter. There is a high percentage of patches where the
submitter generates a diff and sends it without even looking at it.
That a person would look at their diff and go too far in trying to
make it small doesn't happen nearly as much.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#11Greg Smith
greg@2ndquadrant.com
In reply to: Robert Haas (#8)
Re: MMAP Buffers

Robert Haas wrote:

The OP says that this patch maintains the WAL-before-data rule without any explanation of how it accomplishes that seemingly quite amazing feat. I assume I'm going to have to read this patch at some point to refute this assertion, and I think that sucks.

I don't think you have to read any patch that doesn't follow the
submission guidelines. The fact that you do is a great contribution to
the community. But if I were suggesting how your time would be best
spent improving PostgreSQL, "reviewing patches that don't meet coding
standards" would be at the bottom of the list. There's always something
better for the project you could be working on instead.

I just added
http://wiki.postgresql.org/wiki/Submitting_a_Patch#Reasons_your_patch_might_be_returned
, recycling some existing text, adding some new suggestions.

I hope I got the tone of that text right. The intention was to have a
polite but clear place to point submitters to when their suggestion
doesn't meet the normal standards here, such that they might even get
bounced before even entering normal CommitFest review. This MMAP patch
looks like it has all 5 of the problems mentioned on that now more
focused list.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#12Greg Stark
gsstark@mit.edu
In reply to: Robert Haas (#8)
Re: MMAP Buffers

On Sat, Apr 16, 2011 at 7:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:

The OP says that this patch maintains the WAL-before-data rule without any explanation of how it accomplishes that seemingly quite amazing feat.  I assume I'm going to have to read this patch at some point to refute this assertion, and I think that sucks. I am pretty nearly 100% confident that this approach is utterly doomed, and I don't want to spend a lot of time on it unless someone can provide me with a compelling explanation of why my confidence is misplaced.

Fwiw he did explain how he did that. Or at least I think he did --
it's possible I read what I expected because what he came up with is
something I've recently been thinking about.

What he did, I gather, is treat the mmapped buffers as a read-only
copy of the data. To actually make any modifications he copies it into
shared buffers and treats them like normal. When the buffers get
flushed from memory they get written and then the pointers get
repointed back at the mmapped copy. Effectively this means the shared
buffers get extended to include all of the filesystem cache instead of
having to evict buffers from shared buffers just because you want to
read another one that's already in filesystem cache.

It doesn't save the copying between filesystem cache and shared
buffers for buffers that are actually being written to. But it does
save some amount of other copies on read-only traffic and it can even
save some i/o. It does require a function call before each buffer
modification where the pattern is currently <lock buffer>, <mutate
buffer>, <mark buffer dirty>. From what he describes he needs to add a
<prepare buffer for mutation> between the lock and mutate.
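
The extra step described above can be pictured with a small sketch. It
is only an illustration of the copy-before-write idea as summarized in
this message, not code from the patch: the struct and function names are
hypothetical, malloc stands in for taking a slot from shared buffers,
and locking and WAL are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192   /* PostgreSQL's default block size */

/* Hypothetical buffer descriptor; field names are illustrative only. */
typedef struct SketchBuffer
{
    const char *mapped;         /* read-only view (would come from mmap) */
    char       *writable;       /* updatable copy, NULL until first update */
    bool        dirty;
} SketchBuffer;

/* The page that readers should look at right now. */
static const char *
sketch_page(const SketchBuffer *buf)
{
    return buf->writable ? buf->writable : buf->mapped;
}

/* The step added between <lock buffer> and <mutate buffer>. */
static char *
sketch_prepare_for_update(SketchBuffer *buf)
{
    assert(buf != NULL && buf->mapped != NULL);

    if (buf->writable == NULL)
    {
        buf->writable = malloc(SKETCH_PAGE_SIZE);
        if (buf->writable == NULL)
            return NULL;
        /* copy the current page content out of the read-only mapping */
        memcpy(buf->writable, buf->mapped, SKETCH_PAGE_SIZE);
    }
    return buf->writable;
}

/* After the page has been written out, repoint at the mapped copy. */
static void
sketch_after_flush(SketchBuffer *buf)
{
    free(buf->writable);
    buf->writable = NULL;
    buf->dirty = false;
}
```

In this picture the mapped page is never modified directly, which is why
the write-out can still be ordered after the WAL record that describes
the change.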

I think it's an interesting experiment and it's good to know how to
solve some of the subproblems. Notably, how do you extend files or
drop them atomically across processes? And how do you deal with
getting the mappings to be the same across all the processes or deal
with them being different? But I don't think it's a great long-term
direction. It just seems clunky to have to copy things from mmapped
buffers to local buffers and back. Perhaps the performance testing
will show that clunkiness is well worth it but we'll need to see that
for a wide variety of workloads to judge that.

--
greg

#13Radosław Smogura
rsmogura@softperience.eu
In reply to: Greg Stark (#12)
Re: MMAP Buffers

Greg Stark <gsstark@mit.edu> Saturday 16 April 2011 13:00:19

On Sat, Apr 16, 2011 at 7:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:

The OP says that this patch maintains the WAL-before-data rule without
any explanation of how it accomplishes that seemingly quite amazing
feat. I assume I'm going to have to read this patch at some point to
refute this assertion, and I think that sucks. I am pretty nearly 100%
confident that this approach is utterly doomed, and I don't want to
spend a lot of time on it unless someone can provide me with a
compelling explanation of why my confidence is misplaced.

Fwiw he did explain how he did that. Or at least I think he did --
it's possible I read what I expected because what he came up with is
something I've recently been thinking about.

What he did, I gather, is treat the mmapped buffers as a read-only
copy of the data. To actually make any modifications he copies it into
shared buffers and treats them like normal. When the buffers get
flushed from memory they get written and then the pointers get
repointed back at the mmapped copy. Effectively this means the shared
buffers get extended to include all of the filesystem cache instead of
having to evict buffers from shared buffers just because you want to
read another one that's already in filesystem cache.

It doesn't save the copying between filesystem cache and shared
buffers for buffers that are actually being written to. But it does
save some amount of other copies on read-only traffic and it can even
save some i/o. It does require a function call before each buffer
modification where the pattern is currently <lock buffer>, <mutate
buffer>, <mark buffer dirty>. From what he describes he needs to add a
<prepare buffer for mutation> between the lock and mutate.

I think it's an interesting experiment and it's good to know how to
solve some of the subproblems. Notably, how do you extend files or
drop them atomically across processes? And how do you deal with
getting the mappings to be the same across all the processes or deal
with them being different? But I don't think it's a great long-term
direction. It just seems clunky to have to copy things from mmapped
buffers to local buffers and back. Perhaps the performance testing
will show that clunkiness is well worth it but we'll need to see that
for a wide variety of workloads to judge that.

In short, I swap, or exchange (clash of terms), VM pages to preserve
pointers (only if needed). I tried pointing directly to the new memory
area, but I saw that some parts of the code really depend on the memory
pointed to by the original pointers, e.g. vacuum uses hint bits set by a
previous scan (it depends on whether the bit is set or not, so for vacuum
it's not only a hint). From this case alone I can't assume there are no
more such places, so the VM page swap handles it for me.

Stand-alone tests show me that this process (with the copy from mmap)
takes 2x-3x longer than the previous one. But as long as nobody updates a
whole table, there is a benefit from the pre-update scan, from index
scans, and from more available memory (you don't eat cache memory to keep
a copy of the cache in ShMem). Everything may be slower when the database
fits in ShMem, and similarly (2nd level buffers may increase performance
slightly).

I reserve memory for a whole segment even if the file is smaller.
Extending is done by writing one byte at the end of the block (this is
where Unified Buffer Caches may come into play, if I remember the name
correctly). With current processors and the current implementation,
database size is limited to about 260TB (no dynamic segment reservation
is performed).
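
A minimal sketch of the reserve-then-extend idea described above,
assuming Linux-like mmap behaviour. The function names are hypothetical
and this is not code from the patch; a real segment would be far larger
than the few blocks a small test would map.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192      /* PostgreSQL's default block size */

/*
 * Open (or create) a segment file and map room for nblocks blocks, even
 * if the file is currently shorter.  Pages past end-of-file must not be
 * touched until the file has grown; doing so raises SIGBUS.
 */
static const char *
sketch_open_segment(const char *path, size_t nblocks, int *fd_out)
{
    void       *addr;
    int         fd;

    assert(nblocks > 0 && fd_out != NULL);

    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;
    addr = mmap(NULL, nblocks * SKETCH_BLCKSZ, PROT_READ, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
    {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return (const char *) addr;
}

/*
 * Grow the file so that block blkno exists, by writing a single zero
 * byte at the last offset of that block.  Only meant for blocks beyond
 * the current end-of-file; on an existing block it would overwrite data.
 */
static int
sketch_extend_to_block(int fd, size_t blkno)
{
    off_t       last = (off_t) ((blkno + 1) * SKETCH_BLCKSZ - 1);
    char        zero = 0;

    return pwrite(fd, &zero, 1, last) == 1 ? 0 : -1;
}
```

After the extension, the new blocks become readable through the mapping
that was created earlier, without remapping.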

Truncation is not implemented.

Each buffer descriptor has a tagVersion for a simple check of whether the
buffer tag has changed. Descriptors are (partially) mirrored in local
memory, and versions are checked. Currently each re-read goes through
smgr/md, but introducing a shared segment id, and assuming each segment
has a constant maximum number of blocks, will make it faster (this will
be something like the current buffer tag); even the version field will
become unnecessary.

I saw problems with vacuum, as it reopens the relation and I got mappings
of the same file twice (a minor problem). Deletion will be the important
case, as pointers must be invalidated in a "good way".

Regards,
Radek.

#14Marko Kreen
markokr@gmail.com
In reply to: Greg Smith (#7)
Re: MMAP Buffers

On Sat, Apr 16, 2011 at 8:48 AM, Greg Smith <greg@2ndquadrant.com> wrote:

Joshua Berkus wrote:

Guys, can we *please* focus on the patch for now, rather than the
formatting, which is fixable with sed?

Never, and that's not true.  Heikki was being nice; I wouldn't have even
slogged through it long enough to ask the questions he did before kicking it
back as unusable.  A badly formatted patch makes it impossible to evaluate
whether the changes from a submission are reasonable or not without the
reviewer fixing it first.  And you can't automate correcting it, it takes a
lot of tedious manual work.  Start doing a patch review every CommitFest
cycle and you very quickly realize it's not an ignorable problem.  And lack
of discipline in minimizing one's diff is always a sign of other code
quality issues.

Potential contributors to PostgreSQL should know that a badly formatted
patch faces an automatic rejection, because no reviewer can work with it
easily.  This fact is not a mystery; in fact it's documented at
http://wiki.postgresql.org/wiki/Submitting_a_Patch :  "The easiest way to
get your patch rejected is to make lots of unrelated changes, like
reformatting lines, correcting comments you felt were poorly worded etc.
Each patch should have the minimum set of changes required to fulfil the
single stated objective."  I think I'll go improve that text next--something
like "Ways to get your patch rejected" should be its own section.

The problem here isn't whether someone used an IDE or not, it's that this
proves they didn't read their own patch before submitting it.  Reading one's
own diff and reflecting on what you've changed is one of the extremely
underappreciated practices of good open-source software development.
 Minimizing the size of that diff is perhaps the most important thing
someone can do in order to make their changes to a piece of software better.
 Not saying something that leads in that direction would be a disservice to
the submitter.

P.S. You know what else I feel should earn an automatic rejection without
any reviewer even looking at the code?  Submitting a patch that claims to
improve performance and not attaching the test case you used, along with
detailed notes about before/after tests on your own hardware.  A hand wave
"it's faster" is never good enough, and it's extremely wasteful of our
limited reviewer resources to try and duplicate what the submitter claimed.
 Going to add something about that to the submission guidelines too.

Give the OP a break - he was not "re-styling", he was clearly trying to make
crappily indented Postgres code readable in his editor. Reading the patch
would not matter because the original code would still be crappily indented.

Yes, such patch is bad, but what should the proper response be in such
situation?

Hint: it can be both polite and short.

--
marko

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#12)
Re: MMAP Buffers

Greg Stark <gsstark@mit.edu> writes:

What he did, I gather, is treat the mmapped buffers as a read-only
copy of the data. To actually make any modifications he copies it into
shared buffers and treats them like normal. When the buffers get
flushed from memory they get written and then the pointers get
repointed back at the mmapped copy.

That seems much too late --- won't other processes still be looking at
the stale mmap'ed version of the page until a write-out happens?

I'm pretty concerned about the memory efficiency of this too, since it
seems like it's making it *guaranteed*, not just somewhat probable,
that there are two copies in RAM of every database page that's been
modified since the last checkpoint (or so).

regards, tom lane

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Radosław Smogura (#1)
Re: MMAP Buffers

On Fri, 2011-04-15 at 12:32 +0200, Radosław Smogura wrote:

I didn't include this inline as a diff because of its ~150kB size (mainly
configure scripts, which are included in version control). Instead, you
may download it from
http://softperience.eu/downloads/pg_mmap_20110415.diff.bz2 (Legal: work
under the PostgreSQL BSD License). The patch is just a Git diff; later I
will try to set up a Git repository.

Btw., about 87% of this patch are diffs against configure and
pg_config.h.in, which are useless. If you strip those out, your patch
will be small enough to submit inline.

#17Joshua Berkus
josh@agliodbs.com
In reply to: Greg Smith (#7)
Re: Formatting Curmudgeons WAS: MMAP Buffers

All,

Never, and that's not true. Heikki was being nice; I wouldn't have even
slogged through it long enough to ask the questions he did before
kicking it back as unusable. A badly formatted patch makes it
impossible to evaluate whether the changes from a submission are
reasonable or not without the reviewer fixing it first.

Then you can say that politely and firmly with direct reference to the problem, rather than making the submitter feel bad.

"Thank you for taking on testing an idea we've talked about on this list for a long time and not had the energy to test. However, I'm having a hard time evaluating your patch for a few reasons ...(give reasons). Would it be possible for you to resolve these and resubmit so that I can give the patch a good evaluation?"

... and once *one* person on this list has made such a comment, there is no need for two other hackers to pile on the reformat-your-patch bandwagon.

Our project has an earned reputation for being rejection-happy curmudgeons. This is something I heard more than once at MySQLConf, including from one student who chose to work on Drizzle instead of PostgreSQL for that reason. I think that we could stand to go out of our way to be helpful to first-time submitters.

That doesn't mean that we have to accept patches mangled by using an IDE designed for Java, and which lack test cases. However, we can be nice about it.

--
Josh Berkus
Niceness Nazi

#18Joshua Berkus
josh@agliodbs.com
In reply to: Radosław Smogura (#6)
Re: MMAP Buffers

Radoslaw,

I think 10% is quite good, as my stand-alone test of mmap vs. read showed
that the speed-up when copying 100MB of data to memory may be from ~20ms
to ~100ms (depending on the destination address). Of course a deeper
system test simulating real usage will say more. In any case, once writes
are handled well, I will speed up reads. I am thinking of bypassing
smgr/md much more and exposing shared ids (1,2,3...) for each file
segment.

Well, given the risks to durability and stability associated with using MMAP, I doubt anyone would even consider it for a 10% throughput improvement. However, I don't think the test you used demonstrates the best case for MMAP as a performance improvement.

Attached are the test scripts I used to fill the data; nothing complex
(left over from 2nd level caches).

The query I used for measurement was SELECT count(substr(content, 1, 1))
FROM testcase1 WHERE multi_id > 50000;

Timings were taken from psql.

I haven't run load tests (I have about 2GB of free space on /home, and
4GB RAM) or stress tests yet (I'm not quite ready to try concurrent
updates of the same page; they may fail, and the place to fix is noted
in the code).

Yes, but this test case doesn't offer much advantage to MMAP. Where I expect it would shine would be cases where the database is almost as big as, or much bigger than RAM ... where the extra data copying by current code is both frequent and wastes buffer space we need to use. As well as concurrent reads from the same rows.

You can write a relatively simple custom script using pgBench to test this; you don't need a big complicated benchmark. Once we get over the patch cleanup issues, I might be able to help with this.

Netbeans is quite good, of course it depends who likes what. Just try
7.0 RC 2.

I don't know if you've followed the formatting discussion, but apparently there's an issue with Netbeans re-indenting lines you didn't even edit. It makes your patch hard to read or apply. I expect that Netbeans has some method to reconfigure indenting, etc.; do you think you could configure it to PostgreSQL standards so that this doesn't get in the way of evaluation of your ideas?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

#19Greg Smith
greg@2ndquadrant.com
In reply to: Joshua Berkus (#17)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Joshua Berkus wrote:

Then you can say that politely and firmly with direct reference to the problem, rather than making the submitter feel bad.

That's exactly what happened. And then you responded that it was
possible to use a patch without fixing the formatting first. That's not
true, and those of us who do patch review are tired of even trying.

Our project has an earned reputation for being rejection-happy curmudgeons. This is something I heard more than once at MySQLConf, including from one student who chose to work on Drizzle instead of PostgreSQL for that reason. I think that we could stand to go out of our way to be helpful to first-time submitters.

I'll trade you anecdotes by pointing out that I heard from half a dozen
business people that the heavy emphasis on quality control and standards
was the reason they were looking into leaving MySQL derived
distributions for PostgreSQL.

I've spent days of time working on documentation to help new submitters
get their patches improved to where they meet this community's
standards. This thread just inspired another round of that. What
doesn't help is ever telling someone they can ignore those and still do
something useful we're interested in.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#20Marko Kreen
markokr@gmail.com
In reply to: Radosław Smogura (#3)
Re: MMAP Buffers

On Fri, Apr 15, 2011 at 3:12 PM, Radosław Smogura
<rsmogura@softperience.eu> wrote:

On Fri, 15 Apr 2011 14:33:37 +0300, Heikki Linnakangas wrote:

The patch is quite hard to read because of random whitespace changes
and other stylistic issues, but I have a couple of high-level
questions on the design:

Yes, but, hmm... in Netbeans I had really long gaps (probably 8 spaces, from
tabs), so deeper "ifs" and comments at the end of variables went off my
screen. I really wanted not to reformat this, but sometimes I needed to.

Seems no one else has mentioned it yet -- Postgres uses non-standard
tab-width of 4, instead of 8. If you see ugly code in Postgres, that's why...

--
marko

#21Radosław Smogura
rsmogura@softperience.eu
In reply to: Tom Lane (#15)
Re: MMAP Buffers

Tom Lane <tgl@sss.pgh.pa.us> Saturday 16 April 2011 17:02:32

Greg Stark <gsstark@mit.edu> writes:

What he did, I gather, is treat the mmapped buffers as a read-only
copy of the data. To actually make any modifications he copies it into
shared buffers and treats them like normal. When the buffers get
flushed from memory they get written and then the pointers get
repointed back at the mmapped copy.

That seems much too late --- won't other processes still be looking at
the stale mmap'ed version of the page until a write-out happens?

No, no, no :) I wanted to do this, but for the above reason I skipped it. I
swap VM pages by remapping: where the shared buffer was I put the mmapped
page, and where the mmapped page was I put the shared page (in certain cases,
which should be optimized, e.g. read for update; for the initial read of a
page in a process I point directly to the shared buffer). You can imagine it
as affecting the TLB. What I call a "VM swap" is a remapping, so I don't
change pointers; I change only where these pointers point in physical memory,
preserving the same pointer in virtual memory.

if 0x1 is start of buffer 1 (at relation 1, block 1)
I have
0x1 - 0x1 + BLCKSZ -> mmaped area
0xfffff1000 - 0xfffff1000 + BLCKSZ -> Shmem

SWAP
0x1 - 0x1 + BLCKSZ -> Shmem
0xfffff1000 - 0xfffff1000 + BLCKSZ -> mmaped area
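
To make the swap above concrete, here is a minimal sketch that is not taken from the patch. It assumes two pages of one temporary file can stand in for the "mmaped area" and "Shmem", and it uses plain mmap() with MAP_FIXED to change what backs each address. The name demo_fixed_swap and the temp file path are made up for illustration.

```c
#define _DEFAULT_SOURCE         /* mkstemp(), ftruncate() under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical demo, not from the patch: two pages of a temp file stand in
 * for "mmaped area" and "Shmem". Returns 0 if, after remapping, the same
 * two virtual addresses show each other's former contents, -1 otherwise.
 */
int demo_fixed_swap(void)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    int prot = PROT_READ | PROT_WRITE;
    char path[] = "/tmp/vmswap_demo_XXXXXX";
    int fd = mkstemp(path);
    char *slot_a, *slot_b;
    int ok;

    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives on until fd is closed */
    if (ftruncate(fd, (off_t) (2 * page)) != 0) {
        close(fd);
        return -1;
    }

    /* Map both file pages at consecutive virtual addresses. */
    slot_a = mmap(NULL, 2 * page, prot, MAP_SHARED, fd, 0);
    if (slot_a == MAP_FAILED) {
        close(fd);
        return -1;
    }
    slot_b = slot_a + page;
    memset(slot_a, 'A', page);  /* file page 0 */
    memset(slot_b, 'B', page);  /* file page 1 */

    /* SWAP: keep both addresses, change which file page backs each one. */
    if (mmap(slot_a, page, prot, MAP_SHARED | MAP_FIXED, fd, (off_t) page) == MAP_FAILED ||
        mmap(slot_b, page, prot, MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
        munmap(slot_a, 2 * page);
        close(fd);
        return -1;
    }

    /* The pointers did not change, but what they point at did. */
    ok = (slot_a[0] == 'B' && slot_b[0] == 'A');
    munmap(slot_a, 2 * page);
    close(fd);
    return ok ? 0 : -1;
}
```

Calling demo_fixed_swap() returns 0 when the two addresses have exchanged contents. Any code holding slot_a or slot_b keeps a valid pointer throughout, which is the property described above.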

That is the reason I put /proc/{pid}/maps in the crash reports. For example,
the maps after a swap look like this (from a crash report):

[...]
#Data mappings
7fe69b7e3000-7fe69b7ef000 r--s 00000000 08:03 3196408 /home/radek/src/postgresql-2nd-level-cache/db/base/12822/12516
7fe69b7ef000-7fe69b7f1000 rw-s 00148000 00:04 8880132 /SYSV0052ea91 (deleted)
7fe69b7f1000-7fe6db7e3000 r--s 0000e000 08:03 3196408 /home/radek/src/postgresql-2nd-level-cache/db/base/12822/12516
[...]
#SysV shmem mappings

7fec60788000-7fec6078c000 rw-s 00144000 00:04 8880132 /SYSV0052ea91 (deleted)
7fec6078c000-7fec6078e000 r--s 0000c000 08:03 3196408 /home/radek/src/postgresql-2nd-level-cache/db/base/12822/12516
7fec6078e000-7fec6079c000 rw-s 0014a000 00:04 8880132 /SYSV0052ea91 (deleted)
[...]

Without the swap, 12516 would be mapped to a single VM region of size
BLCKSZ*BLOCKS_PER_SEGMENT (which is about 1GB).
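
As a quick check of that arithmetic, the sketch below assumes the default build settings of 8192-byte blocks and 131072 blocks per segment (both are build-time options and may differ on a given installation). The DEMO_ names are hypothetical. The helper demo_range_len can be applied to the address ranges in the excerpt above: the three "#Data mappings" ranges together span 7fe69b7e3000 to 7fe6db7e3000, and the SysV range in the middle spans 7fe69b7ef000 to 7fe69b7f1000.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed defaults; both are build-time settings and may differ. */
#define DEMO_BLCKSZ             8192u
#define DEMO_BLOCKS_PER_SEGMENT 131072u

/* Size in bytes of one fully mapped relation segment. */
uint64_t demo_segment_bytes(void)
{
    return (uint64_t) DEMO_BLCKSZ * DEMO_BLOCKS_PER_SEGMENT;
}

/* Length of one /proc/{pid}/maps range, given its start and end address. */
uint64_t demo_range_len(uint64_t start, uint64_t end)
{
    return end - start;
}
```

Under these assumptions the segment is exactly 2^30 bytes, the data mappings in the excerpt cover exactly that much address space, and the swapped-in SysV range is 0x2000 bytes, the size of one block.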

When a process reads a buffer (or after taking a lock), the shared buffer
descriptor is checked to see whether the page was modified (currently:
whether it is dirty). If so, the process does the swap if the page is
currently in use, or uses the SysV shared area directly if the page is just
being pinned by the process.

Regards,
Radek

I'm pretty concerned about the memory efficiency of this too, since it
seems like it's making it *guaranteed*, not just somewhat probable,
that there are two copies in RAM of every database page that's been
modified since the last checkpoint (or so).

regards, tom lane

#22Greg Smith
greg@2ndquadrant.com
In reply to: Radosław Smogura (#3)
Re: MMAP Buffers

Radosław Smogura wrote:

Yes, but, hmm... in Netbeans I had really long gaps (probably 8
spaces, from tabs), so deeper "ifs" and comments at the end of variables
went off my screen. I really wanted not to reformat this, but
sometimes I needed to.

The guide at
http://www.open-source-editor.com/editors/how-to-make-netbeans-use-tabs-for-indention.html
seems to cover how to fix this in Netbeans. You want it to look like
that screen shot: 4 spaces per indent with matching tab size of 4, and
"Expand Tabs to Spaces" unchecked.

Generally, if you look at the diff you've created, and your new code
doesn't line up right with what's already there, that means the
tab/space setup isn't quite right when you were editing. Reading the
diff is useful for catching all sorts of other issues, too, so it's just
generally a good practice. As Peter already mentioned, the big problem
here is that you checked in a modified configure file.

I also note that you use C++ style "//" comments, which aren't allowed
under the coding guidelines--even though they work fine on many common
platforms.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Radosław Smogura (#21)
Re: MMAP Buffers

Radosław Smogura <rsmogura@softperience.eu> writes:

No, no, no :) I wanted to do this, but for the above reason I skipped it. I
swap VM pages by remapping: where the shared buffer was I put the mmapped
page, and where the mmapped page was I put the shared page (in certain cases,
which should be optimized, e.g. read for update; for the initial read of a
page in a process I point directly to the shared buffer). You can imagine it
as affecting the TLB. What I call a "VM swap" is a remapping, so I don't
change pointers; I change only where these pointers point in physical memory,
preserving the same pointer in virtual memory.

... Huh? Are you saying that you ask the kernel to map each individual
shared buffer separately? I can't believe that's going to scale to
realistic applications.

regards, tom lane

#24Christopher Browne
cbbrowne@gmail.com
In reply to: Greg Smith (#19)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Sat, Apr 16, 2011 at 3:19 PM, Greg Smith <greg@2ndquadrant.com> wrote:

Joshua Berkus wrote:

Then you can say that politely and firmly with direct reference to the
problem, rather than making the submitter feel bad.

That's exactly what happened.  And then you responded that it was possible
to use a patch without fixing the formatting first.  That's not true, and
those of us who do patch review are tired of even trying.

It would be worth a lot if we could get it enough easier to use
pgindent, so that that could help *anyone* fix the formatting, as
opposed to being something that Bruce runs once in a long while.

If you can say, "here, run 'tools/frobozz/pg_indent' against each of
your files, then resubmit the patch," and have at least a fighting
chance of that being *nearly* right, that is a much nicer response to
give those folks.

Alternately, it would be nice if you could say, "I ran pgindent
against your files, here's the revised patch, please do that yourself
in future"

When application of formatting policy is near-nondeterministic, that's no fun!
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

#25Robert Haas
robertmhaas@gmail.com
In reply to: Christopher Browne (#24)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Sat, Apr 16, 2011 at 9:12 PM, Christopher Browne <cbbrowne@gmail.com> wrote:

On Sat, Apr 16, 2011 at 3:19 PM, Greg Smith <greg@2ndquadrant.com> wrote:

Joshua Berkus wrote:

Then you can say that politely and firmly with direct reference to the
problem, rather than making the submitter feel bad.

That's exactly what happened.  And then you responded that it was possible
to use a patch without fixing the formatting first.  That's not true, and
those of us who do patch review are tired of even trying.

It would be worth a lot if we could get it enough easier to use
pgindent, so that that could help *anyone* fix the formatting, as
opposed to being something that Bruce runs once in a long while.

If you can say, "here, run 'tools/frobozz/pg_indent' against each of
your files, then resubmit the patch," and have at least a fighting
chance of that being *nearly* right, that is a much nicer response to
give those folks.

Alternately, it would be nice if you could say, "I ran pgindent
against your files, here's the revised patch, please do that yourself
in future"

I agree.

But it turns out that it doesn't really matter. Whitespace or no
whitespace, if you don't read the diff before you hit send, it's
likely to contain some irrelevant cruft, whether whitespace changes or
otherwise.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#26Andrew Dunstan
andrew@dunslane.net
In reply to: Christopher Browne (#24)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/16/2011 09:12 PM, Christopher Browne wrote:

On Sat, Apr 16, 2011 at 3:19 PM, Greg Smith<greg@2ndquadrant.com> wrote:

Joshua Berkus wrote:

Then you can say that politely and firmly with direct reference to the
problem, rather than making the submitter feel bad.

That's exactly what happened. And then you responded that it was possible
to use a patch without fixing the formatting first. That's not true, and
those of us who do patch review are tired of even trying.

It would be worth a lot if we could get it enough easier to use
pgindent, so that that could help *anyone* fix the formatting, as
opposed to being something that Bruce runs once in a long while.

If you can say, "here, run 'tools/frobozz/pg_indent' against each of
your files, then resubmit the patch," and have at least a fighting
chance of that being *nearly* right, that is a much nicer response to
give those folks.

Alternately, it would be nice if you could say, "I ran pgindent
against your files, here's the revised patch, please do that yourself
in future"

When application of formatting policy is near-nondeterministic, that's no fun!

What makes you think this isn't possible to run pgindent? There are no
secret incantations.

But it's probably overkill. emacs' indent-region gives you about a 90%
result or better if you're set up correctly.

cheers

andrew

#27Robert Haas
robertmhaas@gmail.com
In reply to: Joshua Berkus (#18)
Re: MMAP Buffers

On Sat, Apr 16, 2011 at 2:33 PM, Joshua Berkus <josh@agliodbs.com> wrote:

Well, given the risks to durability and stability associated with using MMAP, I doubt anyone would even consider it for a 10% throughput improvement.  However, I don't think the test you used demonstrates the best case for MMAP as a performance improvement.

Actually, I'd walk through fire for a 10% performance improvement if
it meant only a *risk* to stability. The problem is that this is
likely unfixably broken. In particular, I think the first sentence of
Tom's response hit it right on the nose, and mirrors my own thoughts
on the subject. To have any chance of working, you'd need to track
buffer pins and shared/exclusive content locks for the pages that were
being accessed outside of shared buffers; otherwise someone might be
looking at a stale copy of the page.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#28Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#27)
Re: MMAP Buffers

On Sat, Apr 16, 2011 at 9:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Apr 16, 2011 at 2:33 PM, Joshua Berkus <josh@agliodbs.com> wrote:

Well, given the risks to durability and stability associated with using MMAP, I doubt anyone would even consider it for a 10% throughput improvement.  However, I don't think the test you used demonstrates the best case for MMAP as a performance improvement.

Actually, I'd walk through fire for a 10% performance improvement if
it meant only a *risk* to stability.  The problem is that this is
likely unfixably broken.  In particular, I think the first sentence of
Tom's response hit it right on the nose, and mirrors my own thoughts
on the subject.  To have any chance of working, you'd need to track
buffer pins and shared/exclusive content locks for the pages that were
being accessed outside of shared buffers; otherwise someone might be
looking at a stale copy of the page.

Of course, maybe the patch is doing that. Rereading the thread, I
grow increasingly confused about what this is actually supposed to do
and how it's supposed to work and why it's supposedly better than what
we do now. But please, everyone feel free to continue bashing me for
wanting a readable patch with some understandable submission notes.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#29Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Andrew Dunstan (#26)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

What makes you think this isn't possible to run pgindent?

I have to say, I've been rather mystified by the difficulty
attributed to running pgindent. During work on the SSI patch, I ran
it about once every two weeks on files involved in the patch, just so
that it would be easier to review by people used to that format. I
also tried to keep src/tools/pgindent/typedefs.list up to date with
new structures, so that my runs were good. Granted, when the
official run was done there were a few adjustments to typedefs.list,
and some comments which were added after the commit of the main part
of the patch hadn't yet been wrapped to the right line length, but on
the whole I didn't find it a big deal to stay relatively close by
doing periodic runs. Maybe three minutes every two weeks.

When people talk like it's hugely difficult or hard to understand, I
wonder if they have actually made the attempt. When someone is eager
for feedback on a patch, it doesn't seem unreasonable to me to ask
them to read the README for pgindent and try to generate a patch with
conforming results.

Now, the other aspect to this whole discussion is that people often
have code they have developed for academic purposes or for their own
use which they want to offer to the community "FWIW", and I think we
sometimes miss an opportunity to take advantage of someone else's
work because of an assumption that they have some vested interest in
its acceptance. The fact that someone doesn't care enough to try to
work with the community to get their patch accepted doesn't *always*
mean that we're better off for ignoring that patch. Maybe that's
true 90% of the time or better, but it seems to me that sometimes our
community is a bit provincial.

And I can't help but wonder why, in an off-list discussion with
Michael Cahill about the SSI technology he commented that he was
originally intending to implement the technique in PostgreSQL, but
later chose Oracle Berkeley DB and then later InnoDB instead.
*Maybe* he was looking toward being hired by Oracle, and *maybe* it
was because the other databases already had predicate locking and
true serializable transaction isolation levels -- but was part of it
the reputation of the community? I keep wondering.

-Kevin

#30Jeff Janes
jeff.janes@gmail.com
In reply to: Andrew Dunstan (#26)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 4/16/11, Andrew Dunstan <andrew@dunslane.net> wrote:

What makes you think this isn't possible to run pgindent? There are no
secret incantations.

A while ago I spent a few hours trying to run it and gave up. I think
it was something about needing some obscure BSD version of some tool
which conflicted with just about everything else on the system. I can
try again and report back if anyone cares.

#31Magnus Hagander
magnus@hagander.net
In reply to: Jeff Janes (#30)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Apr 17, 2011 8:17 AM, "Jeff Janes" <jeff.janes@gmail.com> wrote:

On 4/16/11, Andrew Dunstan <andrew@dunslane.net> wrote:

What makes you think this isn't possible to run pgindent? There are no
secret incantations.

A while ago I spent a few hours trying to run it and gave up. I think
it was something about needing some obscure BSD version of some tool
which conflicted with just about everything else on the system. I can
try again and report back if anyone cares.

It does rely on BSD indent. For that very reason, we provide the source for
it, since most people have gnu indent. It's trivial to build, though, and
works just fine as a local build, and you can keep gnu indent as the main
one on your system - no conflicts.

It used to be a PITA due to the typedef list, but that has been fixed.
Perhaps we just need to document it a bit more...

/Magnus

#32Greg Smith
greg@2ndquadrant.com
In reply to: Andrew Dunstan (#26)
1 attachment(s)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

What makes you think this isn't possible to run pgindent? There are no
secret incantations.

The first hit newbies find looking for info about pgindent is
http://blog.hagander.net/archives/185-pgindent-vs-dash.html which sure
looks like secret incantations to me. The documentation
src/tools/pgindent/README reads like a magic spell too:

find . -name '*.[ch]' -type f -print | \
egrep -v -f src/tools/pgindent/exclude_file_patterns | \
xargs -n100 pgindent src/tools/pgindent/typedefs.list

And it doesn't actually work as written unless you've installed
pgindent, entab/detab, and the specially patched NetBSD indent into the
system PATH somewhere--unreasonable given that this may be executing on
a source only tree that has never been installed. The fact that the
documentation is only in the README and not with the rest of the code
conventions isn't helping either.

The last time I tried to do this a few years ago I failed miserably and
never came back. I know way more about building software now though,
and just got this to work for the first time. Attached is a WIP wrapper
script for running pgindent that builds all the requirements into
temporary directories, rather than expecting you to install anything
system-wide or into a PostgreSQL destination directory. Drop this into
src/tools/pgindent, make it executable, and run it from that directory.
Should do the right thing on any system that has "make" as an alias for
"gmake" (TODO to be better about that in the file, with some other
nagging things).

When I just ran it against master I got a bunch of modified files, but
most of them look like things that have been touched recently so I think
it did the right thing. A test of my work here from someone who isn't
running this for the first time would be helpful. If this works well
enough, I think it would make a good helper script to include in the
distribution. The loose ends to fix I can take care of easily enough
once basic validation is finished.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

Attachments:

run-pgindent (text/plain)

#33Radosław Smogura
rsmogura@softperience.eu
In reply to: Tom Lane (#23)
Re: MMAP Buffers

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 01:35:45

Radosław Smogura <rsmogura@softperience.eu> writes:

No, no, no :) I wanted to do this, but for the above reason I skipped it. I
swap VM pages by remapping: where the shared buffer was I put the mmapped
page, and where the mmapped page was I put the shared page (in certain cases,
which should be optimized, e.g. read for update; for the initial read of a
page in a process I point directly to the shared buffer). You can imagine it
as affecting the TLB. What I call a "VM swap" is a remapping, so I don't
change pointers; I change only where these pointers point in physical memory,
preserving the same pointer in virtual memory.

... Huh? Are you saying that you ask the kernel to map each individual
shared buffer separately? I can't believe that's going to scale to
realistic applications.

regards, tom lane

No, I do
mremap(mmap_buff_A, MAP_FIXED, tmp);
mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A),
mremap(tmp, MAP_FIXED, mmap_buff_A).

This is the additional overhead, and it may have some disadvantages. All
regions (SysV / POSIX mmap) are mapped beforehand.
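
For readers unfamiliar with the call, here is a minimal Linux-only sketch of a three-step swap through a temporary address. It is not code from the patch and rests on several assumptions: the real mremap() signature is (old_address, old_size, new_size, flags, new_address) with MREMAP_MAYMOVE | MREMAP_FIXED rather than MAP_FIXED; the third step is assumed to target the address vacated by the second buffer, so that the two really trade places; and private anonymous pages stand in for the mmapped and shared buffers. The names demo_move and demo_mremap_swap are made up.

```c
#define _GNU_SOURCE             /* mremap() and MREMAP_FIXED are Linux/glibc extensions */
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Move the mapping at 'from' to the fixed address 'to'; 'from' becomes a hole. */
static void *demo_move(void *from, void *to, size_t len)
{
    return mremap(from, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, to);
}

/* Returns 0 if the two buffers traded backing pages, -1 otherwise. */
int demo_mremap_swap(void)
{
    size_t len = (size_t) sysconf(_SC_PAGESIZE);
    int prot = PROT_READ | PROT_WRITE;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    char *buf_a = mmap(NULL, len, prot, flags, -1, 0);
    char *buf_y = mmap(NULL, len, prot, flags, -1, 0);
    /* Reserved scratch address; its placeholder mapping is replaced in step 1. */
    char *tmp = mmap(NULL, len, PROT_NONE, flags, -1, 0);
    int ok;

    if (buf_a == MAP_FAILED || buf_y == MAP_FAILED || tmp == MAP_FAILED)
        return -1;
    memset(buf_a, 'A', len);
    memset(buf_y, 'Y', len);

    if (demo_move(buf_a, tmp, len) != (void *) tmp ||      /* A -> tmp     */
        demo_move(buf_y, buf_a, len) != (void *) buf_a ||  /* Y -> A's slot */
        demo_move(tmp, buf_y, len) != (void *) buf_y)      /* tmp -> Y's slot */
        return -1;

    ok = (buf_a[0] == 'Y' && buf_y[0] == 'A');
    munmap(buf_a, len);
    munmap(buf_y, len);
    return ok ? 0 : -1;
}
```

Each swap costs three system calls here, which is one concrete form of the overhead mentioned above.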

I couldn't believe it either, but after doing some work on reads I was backed
into a corner, with these options:
1. Create a Read Before Buffer (connected with XLOG) that stores each page
before modification (the page should be flushed and synced to the log).
2. Rewrite the whole db to repoint pointers or similar (I did a few steps
toward this).
3. Or find something different.

I couldn't believe it either, which is why I am still working on it. I saw it
gain speed for a few simple updates. I'm not quite sure why; I can only guess
it came from the "pre update reads". Full checks will come once updates reach
a good state.

Regards,
Radek

#34Greg Smith
greg@2ndquadrant.com
In reply to: Robert Haas (#25)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas wrote:

But it turns out that it doesn't really matter. Whitespace or no
whitespace, if you don't read the diff before you hit send, it's
likely to contain some irrelevant cruft, whether whitespace changes or
otherwise.

Right. Presuming that pgindent will actually solve anything leaps over
two normally incorrect assumptions:

-That the main tree was already formatted with pgindent before you
started, so no stray diffs will result from it touching things the
submitter isn't even involved in.

-There are no larger code formatting or diff issues except for spacing.

This has been a nagging loose end for a while, so I'd like to see
pgindent's rough edges get sorted out so it's easier to use. But
whitespace errors because of bad editors are normally just a likely sign
of a patch with bigger problems, rather than something that can get
fixed and then the submission is good. There is no substitute for the
discipline of reading your own diff before submission. I'll easily
obsess over mine for an hour before I submit something major, and that
time is always well spent.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#35Andrew Dunstan
andrew@dunslane.net
In reply to: Jeff Janes (#30)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 02:16 AM, Jeff Janes wrote:

On 4/16/11, Andrew Dunstan<andrew@dunslane.net> wrote:

What makes you think this isn't possible to run pgindent? There are no
secret incantations.

A while ago I spent a few hours trying to run it and gave up. I think
it was something about needing some obscure BSD version of some tool
which conflicted with just about everything else on the system. I can
try again and report back if anyone cares.

A few hours? Seriously?

Here's what I just did, starting from scratch. It took me a few minutes.

* wget ftp://ftp.postgresql.org/pub/dev/indent.netbsd.patched.tgz
* mkdir bsdindent && cd bsdindent && tar -z -xf .../indent.netbsd.patched.tgz
* make
* mv indent indent_for_pg
* sudo install -s -o bin -g bin indent_for_pg /usr/local/bin
* cd ../pg_head/src/tools/entab
* sudo make install
* cd ../pgindent
* sed -i 's/INDENT=indent/INDENT=indent_for_pg/' pgindent
* sudo install -s -o bin -g bin pgindent /usr/local/bin

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current
directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just seems
totally ill-conceived.

It might also be worth setting it up so that instead of having to pass a
path to a typedefs file on the command line, we default to a file
sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

But it shouldn't take anyone hours to set up.

cheers

andrew

#36Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#32)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 04:08 AM, Greg Smith wrote:

Andrew Dunstan wrote:

What makes you think this isn't possible to run pgindent? There are
no secret incantations.

The first hit newbies find looking for info about pgindent is
http://blog.hagander.net/archives/185-pgindent-vs-dash.html which sure
looks like secret incantations to me. The documentation
src/tools/pgindent/README reads like a magic spell too:

find . -name '*.[ch]' -type f -print | \
egrep -v -f src/tools/pgindent/exclude_file_patterns | \
xargs -n100 pgindent src/tools/pgindent/typedefs.list

That's the incantation for indenting the whole of the source code. But
very few people want to do that. Most people just want to indent a
single file, for which the incantation is "pgindent path_to_typedefs
my_file.c". See in another message my suggestion for defaulting the
typedefs arg, so you'd just be able to say "pgindent my_file.c".

cheers

andrew

#37Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#35)
2 attachment(s)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 04:26 AM, Andrew Dunstan wrote:

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current
directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just
seems totally ill-conceived.

It might also be worth setting it up so that instead of having to pass
a path to a typedefs file on the command line, we default to a file
sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

OK, I have most of these bits.

A new tarball of indent is available at
<http://developer.postgresql.org/~andrew/indent.netbsd.patched.tgz> and
if everyone agrees I'll push it out to the mirrors.

Attached are two patches, one to remove some infelicity in the entab
makefile, and the other to allow skipping specifying the typedefs file
location either by setting it in an environment variable or by putting
it in a hard coded location.

cheers

andrew

Attachments:

entab.patch (text/x-patch)
diff --git a/src/tools/entab/Makefile b/src/tools/entab/Makefile
index de81818..6372971 100644
--- a/src/tools/entab/Makefile
+++ b/src/tools/entab/Makefile
@@ -20,9 +20,7 @@ halt.o	: halt.c
 clean:
 	rm -f *.o $(TARGET) log core
 
-install:
-	make clean
-	make CFLAGS=-O
+install: $(TARGET)
 	install -s $(TARGET) $(BINDIR)
 	rm -f $(BINDIR)/detab
 	ln $(BINDIR)/$(TARGET) $(BINDIR)/detab
pgindent.patch (text/x-patch)
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 05f69ef..08dde0c 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,13 +13,32 @@
 #
 #	void x(struct xxc * a);
 
-if [ "$#" -lt 2 ]
-then	echo "Usage:  $(basename $0) typedefs file [...]" 1>&2
-	exit 1
+# look for typedefs in the first argument, if there is a second argument and
+# the first argument contains the string 'typedef' in its name, or in the
+# environment setting PGTYPEDEFS, or in a hardcoded location, whichever
+# matches first.
+
+
+if [ $# -gt 1 ]
+then
+	case `basenname $0` in
+		*typedef*) 
+			TYPDEFS=$1
+			shift
+			;;
+		*)
+			;;
+	esac
 fi
 
-TYPEDEFS="$1"
-shift
+test -z "$TYPEDEFS" && TYPEDEFS=$PGTYPEDEFS
+test -z "$TYPEDEFS" && TYPEDEFS=/usr/local/etc/pgtypedefs.list
+
+if [ ! -f "$TYPEDEFS" ]
+then
+	echo "Cannot find typedefs file '$TYPEDEFS'"
+	exit 1
+fi 
 
 if [ -z "$INDENT" ]
 then
#38Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#37)
1 attachment(s)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 09:51 AM, Andrew Dunstan wrote:

On 04/17/2011 04:26 AM, Andrew Dunstan wrote:

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current
directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just
seems totally ill-conceived.

It might also be worth setting it up so that instead of having to
pass a path to a typedefs file on the command line, we default to a
file sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

OK, I have most of these bits.

A new tarball of indent is available at
<http://developer.postgresql.org/~andrew/indent.netbsd.patched.tgz>
and if everyone agrees I'll push it out to the mirrors.

Attached are two patches, one to remove some infelicity in the entab
makefile, and the other to allow skipping specifying the typedefs file
location either by setting it in an environment variable or by putting
it in a hard coded location.

... and this one has a typo fixed.

cheers

andrew

Attachments:

pgindent.patch (text/x-patch)
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 05f69ef..02f2b93 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -13,13 +13,32 @@
 #
 #	void x(struct xxc * a);
 
-if [ "$#" -lt 2 ]
-then	echo "Usage:  $(basename $0) typedefs file [...]" 1>&2
-	exit 1
+# look for typedefs in the first argument, if there is a second argument and
+# the first argument contains the string 'typedef' in its name, or in the
+# environment setting PGTYPEDEFS, or in a hardcoded location, whichever
+# matches first.
+
+
+if [ $# -gt 1 ]
+then
+	case `basename $0` in
+		*typedef*) 
+			TYPDEFS=$1
+			shift
+			;;
+		*)
+			;;
+	esac
 fi
 
-TYPEDEFS="$1"
-shift
+test -z "$TYPEDEFS" && TYPEDEFS=$PGTYPEDEFS
+test -z "$TYPEDEFS" && TYPEDEFS=/usr/local/etc/pgtypedefs.list
+
+if [ ! -f "$TYPEDEFS" ]
+then
+	echo "Cannot find typedefs file '$TYPEDEFS'"
+	exit 1
+fi 
 
 if [ -z "$INDENT" ]
 then
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Radosław Smogura (#33)
Re: MMAP Buffers

Radosław Smogura <rsmogura@softperience.eu> writes:

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 01:35:45

... Huh? Are you saying that you ask the kernel to map each individual
shared buffer separately? I can't believe that's going to scale to
realistic applications.

No, I do
mremap(mmap_buff_A, MAP_FIXED, tmp);
mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A),
mremap(tmp, MAP_FIXED, mmap_buff_A).

There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
nor on my quite-up-to-date OS X box. The Linux man page for it says
"This call is Linux-specific, and should not be used in programs
intended to be portable." So if the patch is dependent on that call,
it's dead on arrival from a portability standpoint.

But in any case, you didn't explain how use of mremap() avoids the
problem of the kernel having to maintain a separate page-mapping-table
entry for each individual buffer. (Per process, yet.) If that's what's
happening, it's going to be a significant performance penalty as well as
(I suspect) a serious constraint on how many buffers can be managed.
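
The per-buffer bookkeeping concern can be observed directly on Linux. The sketch below is an illustration under stated assumptions, not a measurement of the patch: it maps a 16-page temporary file as one region, replaces a few isolated pages in place with separate shared anonymous pages, and counts lines in /proc/self/maps before and after. The function names and the temp file path are hypothetical.

```c
#define _DEFAULT_SOURCE         /* MAP_ANONYMOUS, mkstemp() under strict -std modes */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Count lines (one per mapping) in /proc/self/maps; Linux-specific. */
static int demo_count_mappings(void)
{
    char buf[4096];
    int lines = 0;
    ssize_t n, i;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd);
    return lines;
}

/*
 * Replace 'nswaps' isolated pages (1, 3, 5, ...) of a 16-page file mapping
 * with separate shared anonymous pages. Returns how many extra mappings the
 * kernel now tracks, or -1 on error. Valid for 1 <= nswaps <= 7.
 */
int demo_extra_mappings(int nswaps)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    size_t npages = 16;
    char path[] = "/tmp/vma_demo_XXXXXX";
    int fd = mkstemp(path);
    char *base;
    int before, after, k, failed = 0;

    if (fd < 0)
        return -1;
    unlink(path);
    if (nswaps < 1 || nswaps > 7 || ftruncate(fd, (off_t) (npages * page)) != 0) {
        close(fd);
        return -1;
    }
    base = mmap(NULL, npages * page, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED) {
        close(fd);
        return -1;
    }

    before = demo_count_mappings();
    for (k = 0; k < nswaps; k++) {
        char *target = base + (size_t) (2 * k + 1) * page;

        if (mmap(target, page, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0) == MAP_FAILED)
            failed = 1;
    }
    after = demo_count_mappings();

    munmap(base, npages * page);
    close(fd);
    if (failed || before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Each isolated page replaced this way splits the surrounding file mapping in two and adds a mapping of its own, so the count grows by two per swapped page, and it does so separately in every process that performs the remapping.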

regards, tom lane

#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#28)
The big picture for patch submission (was Re: MMAP Buffers)

Robert Haas <robertmhaas@gmail.com> writes:

... But please, everyone feel free to continue bashing me for
wanting a readable patch with some understandable submission notes.

What he said. All this obsessing over whether the mmap patch could or
should have been run through pgindent is missing the big picture.
Namely, that no design documentation or theory-of-operation was offered,
and people are trying to extract that information from the code, and
it's just too messy for that to be feasible. (The patch isn't really
short of comments, but half of the comments seem to be TODOs or author's
questions to himself about whether something will work, and so they just
aren't particularly helpful to someone trying to understand what the
patch does or whether it will work.)

I think that rather than complaining about formatting, we should be
complaining about not following the overall patch submission process
and not providing adequate documentation. Most of the questions that
people are asking right now could have been answered on the strength of
a design sketch, before any code had been written at all. For a patch
as complicated and invasive as this, there should be a design sketch,
which perhaps gets fleshed out into a README file in the final patch.

The Submitting_a_Patch wiki page does touch on the point of getting some
early design feedback before you even try to write a patch, but I think
it could do with more emphasis on the issue.

regards, tom lane

#41Radosław Smogura
rsmogura@softperience.eu
In reply to: Tom Lane (#39)
Re: MMAP Buffers

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 17:48:56

Radosław Smogura <rsmogura@softperience.eu> writes:

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 01:35:45

... Huh? Are you saying that you ask the kernel to map each individual
shared buffer separately? I can't believe that's going to scale to
realistic applications.

No, I do
mremap(mmap_buff_A, MAP_FIXED, tmp);
mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A);
mremap(tmp, MAP_FIXED, mmap_buff_A).

There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
nor on my quite-up-to-date OS X box. The Linux man page for it says
"This call is Linux-specific, and should not be used in programs
intended to be portable." So if the patch is dependent on that call,
it's dead on arrival from a portability standpoint.

Good point. This comes from the initial concept; I actually did it so as not
to leave gaps in the VM into which a library or something else could be
mmapped. Lately I have been thinking about using mmap to replace just one VM
page.
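
As a rough illustration of replacing a single VM page with plain mmap()
instead of mremap(), here is a minimal sketch. It is not taken from the
patch: reserve_region() and replace_one_page() are hypothetical names, and
MAP_ANONYMOUS is a widespread extension rather than part of the Single Unix
Spec. It relies only on mmap() with MAP_FIXED, which replaces whatever was
previously mapped at the target address.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1		/* for MAP_ANONYMOUS on glibc */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: reserve npages of address space, initially backed by
 * zero-filled anonymous memory.  Returns NULL on failure.
 */
void *
reserve_region(size_t npages)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	void	   *p;

	if (pagesz <= 0)
		return NULL;
	p = mmap(NULL, npages * (size_t) pagesz, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Hypothetical helper: replace exactly one page of the region with one page
 * of the file behind fd.  file_offset must be page-aligned.  No gap is ever
 * left in the address space, because MAP_FIXED swaps the mapping in place.
 * Returns 0 on success, -1 on failure.
 */
int
replace_one_page(void *region, size_t page_index, int fd, off_t file_offset)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	char	   *target;
	void	   *p;

	if (pagesz <= 0)
		return -1;
	target = (char *) region + page_index * (size_t) pagesz;
	p = mmap(target, (size_t) pagesz, PROT_READ,
			 MAP_SHARED | MAP_FIXED, fd, file_offset);
	return (p == MAP_FAILED) ? -1 : 0;
}
```

Each call to replace_one_page() is still one kernel call per page, so this
sketch addresses only the portability concern, not the cost concern.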

But in any case, you didn't explain how use of mremap() avoids the
problem of the kernel having to maintain a separate page-mapping-table
entry for each individual buffer. (Per process, yet.) If that's what's
happening, it's going to be a significant performance penalty as well as
(I suspect) a serious constraint on how many buffers can be managed.

regards, tom lane

The kernel merges vm_structs, so mappings are compacted. I'm not a kernel
specialist, but, memory consumption aside, for mappings that are not
compacted the kernel uses btrees when dealing with the TLB, so it should not
matter whether there are 100 vm_structs or 100000 vm_structs.

The swap isn't done everywhere. When a buffer is initially read
(privaterefcount == 1), any access to that buffer points directly to the
latest valid area. If it has a shmem area assigned, then that area is used. I
plan to add "readbuffer for update" to prevent swaps when it's almost certain
that the buffer will be used for an update.

I measured the performance of page modifications (including unpinning, the
full process in a standalone unit test): it takes 2x-3x the time of normal
page reads. This result may not be reliable, though, as I saw that memcpy to
memory above 2GB is slower than memcpy to the first 2GB (so it may be worth
trying to put some shared structs below 2GB).

I know that this patch is a big question mark. Sometimes I'm optimistic, and
sometimes I'm pessimistic about the final result.

Regards,
Radek

#42Andres Freund
andres@anarazel.de
In reply to: Radosław Smogura (#41)
Re: MMAP Buffers

On Sunday 17 April 2011 19:26:31 Radosław Smogura wrote:

The kernel merges vm_structs, so mappings are compacted. I'm not a kernel
specialist, but, memory consumption aside, for mappings that are not
compacted the kernel uses btrees when dealing with the TLB, so it should not
matter whether there are 100 vm_structs or 100000 vm_structs.

But the CPU's TLB has maybe 16/256 (1st level, 2nd level) to 64/512 entries.
That means there will be TLB misses all over.
Additionally, your scheme requires flushing it regularly...

Andres

#43Joshua Berkus
josh@agliodbs.com
In reply to: Robert Haas (#27)
Re: MMAP Buffers

Robert,

Actually, I'd walk through fire for a 10% performance improvement if
it meant only a *risk* to stability.

Depends on the degree of risk. MMAP has the potential to introduce instability into areas of the code which have been completely reliable for years. Adding 20 new coredump cases with data loss for a 10% improvement seems like a poor bargain to me. It doesn't help that the only DB to rely heavily on MMAP (MongoDB) is OSSDB's paragon of data loss.

However, in the case where the database is larger than RAM ... or better, 90% of RAM ... MMAP has the theoretical potential to improve performance quite a bit more than 10% ... try up to 900% on some queries. However, I'd like to prove that in a test before we bother even debating the fundamental obstacles to using MMAP. It's possible that these theoretical performance benefits will not materialize, even without data safeguards.

The problem is that this is
likely unfixably broken. In particular, I think the first sentence of
Tom's response hit it right on the nose, and mirrors my own thoughts
on the subject. To have any chance of working, you'd need to track
buffer pins and shared/exclusive content locks for the pages that were
being accessed outside of shared buffers; otherwise someone might be
looking at a stale copy of the page.

Nothing is unfixable. The question is whether it's worth the cost. Let me see if I can build a tree with Radosław's patch, and do some real performance tests.

I, for one, am glad he did this work. We've discussed MMAP in the code off and on for years, but nobody wanted to do the work to test it. Now someone has, and we can decide whether it's worth pursuing based on the numbers.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

#44Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua Berkus (#43)
Re: MMAP Buffers

Joshua Berkus <josh@agliodbs.com> writes:

I, for one, am glad he did this work. We've discussed MMAP in the code off and on for years, but nobody wanted to do the work to test it. Now someone has, and we can decide whether it's worth pursuing based on the numbers.

Well, the troubling issue is that it's not clear whether this patch is
realistic enough to think that performance measurements based on it
are representative of the whole idea of using mmap. The business of
remapping individual buffers in order to transition them to writable
state seems likely to me to be a huge performance penalty --- first
there's the direct cost of having to incur a kernel call each time we
do that, and second there's the distributed cost of asking the kernel
to manage thousands or millions of tiny mappings.
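
Whether the kernel really ends up tracking many tiny mappings can be checked
empirically. The following is a minimal, Linux-only sketch (it reads
/proc/self/maps, and MAP_ANONYMOUS is an extension); count_mappings() and
map_alternating() are hypothetical names, not anything from the patch. It
assumes that adjacent mappings are merged only when their attributes match,
so it gives every second page a different protection and counts how many
separate mappings result.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1		/* for MAP_ANONYMOUS on glibc */
#endif
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Count the mappings of the calling process by counting the lines of
 * /proc/self/maps (Linux-specific).  Returns -1 if the file is unavailable.
 */
long
count_mappings(void)
{
	FILE	   *f = fopen("/proc/self/maps", "r");
	long		lines = 0;
	int			c;

	if (f == NULL)
		return -1;
	while ((c = fgetc(f)) != EOF)
		if (c == '\n')
			lines++;
	fclose(f);
	return lines;
}

/*
 * Map npages anonymous pages, then make every second page read-only so that
 * neighbouring pages differ in protection and cannot share one mapping.
 * Returns the base address, or NULL on failure.
 */
void *
map_alternating(size_t npages)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	char	   *base;
	size_t		i;

	if (pagesz <= 0)
		return NULL;
	base = mmap(NULL, npages * (size_t) pagesz, PROT_READ | PROT_WRITE,
				MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;
	for (i = 1; i < npages; i += 2)
	{
		if (mprotect(base + i * (size_t) pagesz, (size_t) pagesz,
					 PROT_READ) != 0)
		{
			munmap(base, npages * (size_t) pagesz);
			return NULL;
		}
	}
	return base;
}
```

Comparing count_mappings() before and after map_alternating(n) shows roughly
n additional entries, each of which the kernel has to track per process. A
scheme that gives individual buffers different backing objects would be
expected to fragment mappings in a similar way, though that is an inference
rather than a measurement of the patch.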

IOW, if this patch shows little or no performance improvement (as seems
likely to happen at scale), that doesn't prove that mmap in general
isn't potentially interesting, only that this isn't the right way to
approach it.

Still, if you do some tests and don't find a win, that might save time
compared to actually trying to understand and vet the patch ...

regards, tom lane

#45Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#39)
Re: MMAP Buffers

On Sun, Apr 17, 2011 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Radosław Smogura <rsmogura@softperience.eu> writes:

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 01:35:45

... Huh?  Are you saying that you ask the kernel to map each individual
shared buffer separately?  I can't believe that's going to scale to
realistic applications.

No, I do
mremap(mmap_buff_A, MAP_FIXED, tmp);
mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A);
mremap(tmp, MAP_FIXED, mmap_buff_A).

There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
nor on my quite-up-to-date OS X box.  The Linux man page for it says
"This call is Linux-specific, and should not be used in programs
intended to be portable."  So if the patch is dependent on that call,
it's dead on arrival from a portability standpoint.

But in any case, you didn't explain how use of mremap() avoids the
problem of the kernel having to maintain a separate page-mapping-table
entry for each individual buffer.  (Per process, yet.)  If that's what's
happening, it's going to be a significant performance penalty as well as
(I suspect) a serious constraint on how many buffers can be managed.

I share your suspicions, although no harm in measuring it.

But what I don't understand is how this approach avoids the problem of
different processes seeing different buffer contents. If backend A
has the buffer mmap'd and backend B wants to modify it (and changes
the mapping), backend A is still looking at the old buffer contents,
isn't it? And then things go boom.
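
The stale-view concern can be reproduced in miniature with two processes.
This is a minimal sketch, not based on the patch's code: the function name is
hypothetical, a forked child stands in for "backend B", and MAP_ANONYMOUS is
an extension. The parent and child start out sharing one page; the child then
maps a different page at the same address and writes to it, and the parent
reports what it still sees.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1		/* for MAP_ANONYMOUS on glibc */
#endif
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the first byte the parent ("backend A") sees after the child
 * ("backend B") has changed its own mapping of the buffer and modified it,
 * or -1 on error.
 */
int
parent_view_after_child_remap(void)
{
	long		pagesz = sysconf(_SC_PAGESIZE);
	char	   *buf;
	pid_t		pid;
	int			status;
	int			seen;

	if (pagesz <= 0)
		return -1;

	/* One shared page standing in for a buffer both backends have mapped. */
	buf = mmap(NULL, (size_t) pagesz, PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	buf[0] = 'A';

	pid = fork();
	if (pid < 0)
	{
		munmap(buf, (size_t) pagesz);
		return -1;
	}
	if (pid == 0)
	{
		/* Child: swap a different page in at the same address, then write. */
		char	   *p = mmap(buf, (size_t) pagesz, PROT_READ | PROT_WRITE,
							 MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

		if (p == MAP_FAILED)
			_exit(1);
		p[0] = 'B';
		_exit(0);
	}

	if (waitpid(pid, &status, 0) < 0)
		seen = -1;
	else if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		seen = buf[0];			/* parent's mapping was never changed */
	else
		seen = -1;
	munmap(buf, (size_t) pagesz);
	return seen;
}
```

The parent still reads 'A': changing a mapping in one process does nothing to
the address space of another, so every other backend needs some way to learn
that it must remap before it looks at the buffer again.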

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#46Radosław Smogura
rsmogura@softperience.eu
In reply to: Andres Freund (#42)
Re: MMAP Buffers

Andres Freund <andres@anarazel.de> Sunday 17 April 2011 20:02:11

On Sunday 17 April 2011 19:26:31 Radosław Smogura wrote:

The kernel merges vm_structs, so mappings are compacted. I'm not a kernel
specialist, but, memory consumption aside, for mappings that are not
compacted the kernel uses btrees when dealing with the TLB, so it should not
matter whether there are 100 vm_structs or 100000 vm_structs.

But the CPU's TLB has maybe 16/256 (1st level, 2nd level) to 64/512 entries.
That means there will be TLB misses all over.
Additionally, your scheme requires flushing it regularly...

Andres

I only know that the Phenom has 4096 entries, I think, and this covers 16MB
of memory. But I was talking about the memory usage of struct vm_struct in
the kernel. I also tried huge pages, but I couldn't write a really fast
allocator for them; it was slower than malloc, maybe for different reasons.
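
The TLB figures in this exchange can be checked with simple arithmetic. A
minimal sketch, assuming 4 kB pages (the usual x86 default) and, purely as an
example, 1 GB of shared buffers; both function names are hypothetical.

```c
#include <stdint.h>

/* Bytes of address space covered by a TLB with the given number of entries. */
uint64_t
tlb_reach_bytes(uint64_t entries, uint64_t page_size)
{
	return entries * page_size;
}

/* Number of pages (and so TLB entries) needed to cover a region, rounded up. */
uint64_t
pages_needed(uint64_t bytes, uint64_t page_size)
{
	return (bytes + page_size - 1) / page_size;
}
```

With 4 kB pages, 4096 entries cover 4096 * 4 kB = 16 MB, which matches the
Phenom figure above, while 512 entries cover only 2 MB. Covering 1 GB takes
262144 entries with 4 kB pages but only 512 with 2 MB huge pages.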

Regards,
Radek

#47Andres Freund
andres@anarazel.de
In reply to: Radosław Smogura (#46)
Re: MMAP Buffers

On Sunday 17 April 2011 22:09:24 Radosław Smogura wrote:

I only know that the Phenom has 4096 entries, I think, and this covers 16MB
of memory.

The numbers I cited were for Intel CPUs before and after Core 2.

Andres

#48Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#29)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Sun, Apr 17, 2011 at 12:26 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Now, the other aspect to this whole discussion is that people often
have code they have developed for academic purposes or for their own
use which they want to offer to the community "FWIW", and I think we
sometimes miss an opportunity to take advantage of someone else's
work because of an assumption that they have some vested interest in
its acceptance.  The fact that someone doesn't care enough to try to
work with the community to get their patch accepted doesn't *always*
mean that we're better off for ignoring that patch.  Maybe that's
true 90% of the time or better, but it seems to me that sometimes our
community is a bit provincial.

We are.

On the other hand, cleaning up other people's not-ready-for-prime-time
patches isn't free. If I spend 4 hours cleaning up a patch in
preparation for a commit, then that's 4 hours I don't get to spend on
my own work. And since I *already* spend 3 or 4 times as much energy
on other people's work as I do on my own, I'm not willing to go much
further in that direction; if anything, I think I'd like to roll it
back a bit. On the other hand, I am emphatically in favor of other
people who are not me being willing to do that kind of work; I think
it benefits our whole community, much as the work of people who write
their own patches or review or volunteer in any other way benefits our
whole community.

Because I commit approximately 10 patches per CommitFest, and review
perhaps another 5-10 that I don't end up committing (either because
they get rejected or because someone else commits them), the amount of
time that I can afford to spend on each of those patches is limited.
Generally, if I can't commit a normal-size patch in half an hour of
looking at it, I send back a review and move on. For some patches
that I particularly care about, I have on occasion invested as much as
2-3 days (most recently, a big chunk of my Christmas vacation) to get
them beaten into shape for a commit. I'd be happy to devote more time
per patch, but it ain't gonna happen as long as the number that I have
to handle to get the CommitFest finished on time remains in the
two-digit range.

That having been said, the kind of fixing up that you're talking about
*does* happen, when someone cares enough to make it happen. We have
numerous examples in the archives where person A submits a patch, and
person B reviews it and, in lieu of a review, posts an updated patch,
sometimes when person A has meanwhile totally disappeared, or when
they haven't completely disappeared but don't have time to work on it.
This is actually quite commonplace; it just doesn't happen for every
patch. It tends to happen only for the things someone is really
excited about because, well, fixing up someone else's bad code is not
one of life's great pleasures. It'd be nice if we had even more of it
than we do, but this is an all-volunteer organization.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#49Radosław Smogura
rsmogura@softperience.eu
In reply to: Robert Haas (#45)
Re: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> Sunday 17 April 2011 22:01:55

On Sun, Apr 17, 2011 at 11:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Radosław Smogura <rsmogura@softperience.eu> writes:

Tom Lane <tgl@sss.pgh.pa.us> Sunday 17 April 2011 01:35:45

... Huh? Are you saying that you ask the kernel to map each individual
shared buffer separately? I can't believe that's going to scale to
realistic applications.

No, I do
mremap(mmap_buff_A, MAP_FIXED, tmp);
mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A);
mremap(tmp, MAP_FIXED, mmap_buff_A).

There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
nor on my quite-up-to-date OS X box. The Linux man page for it says
"This call is Linux-specific, and should not be used in programs
intended to be portable." So if the patch is dependent on that call,
it's dead on arrival from a portability standpoint.

But in any case, you didn't explain how use of mremap() avoids the
problem of the kernel having to maintain a separate page-mapping-table
entry for each individual buffer. (Per process, yet.) If that's what's
happening, it's going to be a significant performance penalty as well as
(I suspect) a serious constraint on how many buffers can be managed.

I share your suspicions, although no harm in measuring it.

But what I don't understand is how this approach avoids the problem of
different processes seeing different buffer contents. If backend A
has the buffer mmap'd and backend B wants to modify it (and changes
the mapping), backend A is still looking at the old buffer contents,
isn't it? And then things go boom.

Each process has a simple "mirror" of the shared descriptors.

I "believe" that modifications to buffer content may only be done while
holding an exclusive lock (with some simple exceptions) (+ MVCC). Actually, I
saw only two things that can change already-loaded data and cause damage, the
ones you have described: setting hint bits during a scan, and vacuum. The
first may only cause, I think, two processes to ask for the same transaction
statuses <except vacuum>; the second is impossible, as vacuum requires an
exclusive pin. When the buffer tag is changed, the version of the buffer is
bumped up and checked against the local version - this is about reading a
buffer.
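
The version check described above might look roughly like the following. This
is a minimal, single-process sketch under stated assumptions: SharedDesc,
LocalMirror and both functions are hypothetical names rather than the patch's
actual structures, and a real implementation would need locking or atomic
operations around the shared fields.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared descriptor: lives in shared memory in a real system. */
typedef struct SharedDesc
{
	uint32_t	version;		/* bumped whenever the buffer is re-tagged */
	void	   *data;			/* currently valid area for this buffer */
} SharedDesc;

/* Hypothetical per-process "mirror" of one shared descriptor. */
typedef struct LocalMirror
{
	uint32_t	seen_version;	/* version this process last synchronized to */
	void	   *mapped;			/* area this process currently looks at */
} LocalMirror;

/* Re-tag the buffer: point it at a new area and bump the version. */
void
shared_retag(SharedDesc *d, void *new_data)
{
	d->data = new_data;
	d->version++;
}

/*
 * Compare the local version with the shared one and resynchronize the mirror
 * if it is stale.  Returns true if a refresh (a "swap") was needed.
 */
bool
mirror_refresh_if_stale(LocalMirror *m, const SharedDesc *d)
{
	if (m->seen_version == d->version)
		return false;
	m->mapped = d->data;
	m->seen_version = d->version;
	return true;
}
```

A mirror only learns about a change when mirror_refresh_if_stale() is called,
so the scheme is only as safe as the set of places where that check is made.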

In other cases, after the lock is obtained, a check is done whether the buffer
has an associated updatable buffer and whether the local "mirror" has it too;
then a swap should take place.

The logic for updatable buffers is similar to that for "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed while
someone is using it. But in contrast to "normal buffers", updatable buffers
don't have any support for locking etc. Updatable buffers exist only on the
free list or when associated with a buffer.

In the future, I will change the version to a shared segment id, something
like the relation's oid + block, but the ids will have continuous numbering
1,2,3..., so I will be able to bypass smgr/md during reads, as well as the tag
version check - this looks like a faster solution.

Regards,
Radek

#50Dan Ports
drkp@csail.mit.edu
In reply to: Kevin Grittner (#29)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Sat, Apr 16, 2011 at 11:26:34PM -0500, Kevin Grittner wrote:

I have to say, I've been rather mystified by the difficulty
attributed to running pgindent. During work on the SSI patch, I ran
it about once every two weeks on files involved in the patch

Well, as a counterpoint: during work on the SSI patch, I did *not* run
pgindent. I attempted to, at one point, but was discouraged when I
realized that it required BSD indent and my Linux machine only had GNU
indent. That meant I would need to find, build, and install a new
version of indent, and keep it separate from my existing GNU indent.
Hardly impossible, but it's a lot more of a hassle than simply running a
script, and it left me wondering if I was going to run into other
issues even if I did get the right indent installed.

Andrew's instructions upthread would certainly have been helpful to
have in the pgindent README.

(To be fair, I would probably have made much more of an effort to run
pgindent if I didn't already know Kevin was running it periodically on
the SSI code.)

And I can't help but wonder why, in an off-list discussion with
Michael Cahill about the SSI technology he commented that he was
originally intending to implement the technique in PostgreSQL, but
later chose Oracle Berkeley DB and then later InnoDB instead.
*Maybe* he was looking toward being hired by Oracle, and *maybe* it
was because the other databases already had predicate locking and
true serializable transaction isolation levels -- but was part of it
the reputation of the community? I keep wondering.

I would discount the first explanation (being hired at Oracle)
entirely. I think the second explanation is the correct one: it's
simply much more difficult to implement SSI atop a database that does
not already have predicate locking (as we know!)

But I am aware of other cases in which people in the academic community
have done work that could well be of interest to the Postgres community
but didn't submit their work here. In part, that was because they did
not have the time/motivation to get the work into a polished,
acceptable state, and in part because of the reputation of the
community.

Dan

--
Dan R. K. Ports MIT CSAIL http://drkp.net/

#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dan Ports (#50)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Dan Ports <drkp@csail.mit.edu> writes:

... But I am aware of other cases in which people in the academic community
have done work that could well be of interest to the Postgres community
but didn't submit their work here. In part, that was because they did
not have the time/motivation to get the work into a polished,
acceptable state, and in part because of the reputation of the
community.

Well, if the author isn't interested in getting the work into a
committable state, it's not clear what's the point of submitting it.
It's not like people who are eager to do that kind of work on someone
else's patch are thick on the ground.

But I think the perception that we reject most patches is misplaced.
It's fairly easy to demonstrate that the default assumption around here
is that submitted patches will get committed. Looking at the past five
commitfests (covering a bit more than a year), we committed 201 out of
305 patches, and only 10 were actually marked "rejected". I'm too lazy
to try to determine just which of the 94 returned-with-feedback patches
got committed in later fests, but a quick scan suggests at least 20 did,
and there are more that might get committed in the next fest. That puts
the overall patch acceptance rate at perhaps 75%. At least since the CF
mechanism was instituted, it seems to me that the dynamic has been that
someone who doesn't like a patch has to show cause why it shouldn't get
committed, not the other way around. Robert's recent comment that he
was afraid he'd have to spend time digging into the mmap patch to prove
it was broken reflects exactly that feeling.

regards, tom lane

#52Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#51)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Sun, Apr 17, 2011 at 7:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Dan Ports <drkp@csail.mit.edu> writes:

... But I am aware of other cases in which people in the academic community
have done work that could well be of interest to the Postgres community
but didn't submit their work here. In part, that was because they did
not have the time/motivation to get the work into a polished,
acceptable state, and in part because of the reputation of the
community.

Well, if the author isn't interested in getting the work into a
committable state, it's not clear what's the point of submitting it.
It's not like people who are eager to do that kind of work on someone
else's patch are thick on the ground.

But I think the perception that we reject most patches is misplaced.
It's fairly easy to demonstrate that the default assumption around here
is that submitted patches will get committed.  Looking at the past five
commitfests (covering a bit more than a year), we committed 201 out of
305 patches, and only 10 were actually marked "rejected".  I'm too lazy
to try to determine just which of the 94 returned-with-feedback patches
got committed in later fests, but a quick scan suggests at least 20 did,
and there are more that might get committed in the next fest.  That puts
the overall patch acceptance rate at perhaps 75%.

That somewhat overstates the acceptance rate, because it ignores the
patches that people post and immediately get flamed to a well-done
crisp before adding them to the CF app, but there are not very many of
those any more. (If someone thinks I'm wrong about this, they are
cheerfully invited to provide the evidence. It is certainly possible
that I'm guilty of selective memory; this is just how I remember it.)

At least since the CF
mechanism was instituted, it seems to me that the dynamic has been that
someone who doesn't like a patch has to show cause why it shouldn't get
committed, not the other way around. Robert's recent comment that he
was afraid he'd have to spend time digging into the mmap patch to prove
it was broken reflects exactly that feeling.

Yes, and I think it's also telling that the response to that was not
"oh, gee, if Robert thinks this patch is totally busted, we'd better
take that concern seriously" but rather "stop picking on the guy who
submitted the patch". Maybe someone out there is under the impression
that I get high off of rejecting patches; but the statistics you cite
from the CF app don't exactly support the contention that I'm going
around looking for reasons to reject things, or if I am, I'm doing a
pretty terrible job finding them.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#53Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#52)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 07:41 PM, Robert Haas wrote:

That puts the overall patch acceptance rate at perhaps 75%.

That somewhat overstates the acceptance rate, because it ignores the
patches that people post and immediately get flamed to a well-done
crisp before adding them to the CF app, but there are not very many of
those any more.

I don't believe there were ever terribly many of them.

cheers

andrew

#54Robert Haas
robertmhaas@gmail.com
In reply to: Radosław Smogura (#49)
Re: MMAP Buffers

On Sun, Apr 17, 2011 at 5:32 PM, Radosław Smogura
<rsmogura@softperience.eu> wrote:

Each process has a simple "mirror" of the shared descriptors.

I "believe" that modifications to buffer content may only be done while
holding an exclusive lock (with some simple exceptions) (+ MVCC). Actually, I
saw only two things that can change already-loaded data and cause damage, the
ones you have described: setting hint bits during a scan, and vacuum. The
first may only cause, I think, two processes to ask for the same transaction
statuses <except vacuum>; the second is impossible, as vacuum requires an
exclusive pin. When the buffer tag is changed, the version of the buffer is
bumped up and checked against the local version - this is about reading a
buffer.

Yes, an exclusive lock is required for substantive content changes.
But if vacuum cleaning up the buffer is an issue for your patch, then
it's probably also a problem if someone grabs an exclusive content
lock and deletes the tuple (by setting XMAX) and some other backend
later sees the old buffer contents after having in the meanwhile taken
a new snapshot; or if likewise someone grabs an exclusive-lock, adds a
tuple, and then your backend takes a new snapshot and then sees the
old buffer contents. Basically, any time someone grabs an
exclusive-lock and releases it, it's necessary for all observers to
see the updated contents by the time the exclusive lock is released.

In other cases, after the lock is obtained, a check is done whether the buffer
has an associated updatable buffer and whether the local "mirror" has it too;
then a swap should take place.

I think this check would have to be done every time someone
share-locks the buffer, which seems rather expensive.

The logic for updatable buffers is similar to that for "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed while
someone is using it. But in contrast to "normal buffers", updatable buffers
don't have any support for locking etc. Updatable buffers exist only on the
free list or when associated with a buffer.

I don't see how you're going to get away with removing buffer locks.
They exist for a reason, and adding mmap() to the mix is going to
require MORE locking, not less.

In the future, I will change the version to a shared segment id, something
like the relation's oid + block, but the ids will have continuous numbering
1,2,3..., so I will be able to bypass smgr/md during reads, as well as the tag
version check - this looks like a faster solution.

I don't understand this part at all.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#55Greg Smith
greg@2ndquadrant.com
In reply to: Andrew Dunstan (#35)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current
directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just
seems totally ill-conceived.

I think the script I submitted upthread has most of the additional
slickness needed here. Looks like we both were working on documenting a
reasonable way to do this at the same time the other day. The idea of
any program here relying on being able to write to /usr/local/bin as
your example did makes this harder for people to run; that's why I made
everything in the build tree and just pushed the appropriate directories
into the PATH.

Since I see providing a script to automate this whole thing as the
preferred way to make this easier, re-packaging the indent source
tarball to extract to a directory doesn't seem worth the backwards
compatibility trouble it will introduce. Improving the entab makefile I
don't have an opinion on.

It might also be worth setting it up so that instead of having to pass
a path to a typedefs file on the command line, we default to a file
sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

OK, so I need to update my script to handle either indenting a single
file, or doing all of them.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#56Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#53)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan <andrew@dunslane.net> writes:

On 04/17/2011 07:41 PM, Robert Haas wrote:

That puts the overall patch acceptance rate at perhaps 75%.

That somewhat overstates the acceptance rate, because it ignores the
patches that people post and immediately get flamed to a well-done
crisp before adding them to the CF app, but there are not very many of
those any more.

I don't believe there were ever terribly many of them.

Well, that number also ignores patches that were *committed* without
ever making it to the CF list. There aren't terribly many of those
either I think, but it does happen, particularly for small patches.
If you want to argue about the acceptance rate for out-of-CF-process
patches you'd have to do some serious digging in the archives to say
anything about what it is.

But anyway this is quibbling. The point I was trying to make is that
our patch acceptance rate is fairly far north of 50%, not south of it.
So we might hold people's feet to the fire a bit in the process, but
it's hardly impossible to get a patch committed.

regards, tom lane

#57Radosław Smogura
rsmogura@softperience.eu
In reply to: Robert Haas (#54)
Re: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> Monday 18 April 2011 03:06:17

On Sun, Apr 17, 2011 at 5:32 PM, Radosław Smogura

<rsmogura@softperience.eu> wrote:

Each process has a simple "mirror" of the shared descriptors.

I "believe" that modifications to buffer content may only be done while
holding an exclusive lock (with some simple exceptions) (+ MVCC). Actually, I
saw only two things that can change already-loaded data and cause damage, the
ones you have described: setting hint bits during a scan, and vacuum. The
first may only cause, I think, two processes to ask for the same transaction
statuses <except vacuum>; the second is impossible, as vacuum requires an
exclusive pin. When the buffer tag is changed, the version of the buffer is
bumped up and checked against the local version - this is about reading a
buffer.

Yes, an exclusive lock is required for substantive content changes.
But if vacuum cleaning up the buffer is an issue for your patch, then
it's probably also a problem if someone grabs an exclusive content
lock and deletes the tuple (by setting XMAX) and some other backend
later sees the old buffer contents after having in the meanwhile taken
a new snapshot; or if likewise someone grabs an exclusive-lock, adds a
tuple, and then your backend takes a new snapshot and then sees the
old buffer contents. Basically, any time someone grabs an
exclusive-lock and releases it, it's necessary for all observers to
see the updated contents by the time the exclusive lock is released.

In other cases, after the lock is obtained, a check is done whether the buffer
has an associated updatable buffer and whether the local "mirror" has it too;
then a swap should take place.

I think this check would have to be done every time someone
share-locks the buffer, which seems rather expensive.

I don't treat it as an issue, but it is a disadvantage.

The logic for updatable buffers is similar to that for "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed while
someone is using it. But in contrast to "normal buffers", updatable buffers
don't have any support for locking etc. Updatable buffers exist only on the
free list or when associated with a buffer.

I don't see how you're going to get away with removing buffer locks.
They exist for a reason, and adding mmap() to the mix is going to
require MORE locking, not less.

In the future, I will change the version to a shared segment id, something
like the relation's oid + block, but the ids will have continuous numbering
1,2,3..., so I will be able to bypass smgr/md during reads, as well as the tag
version check - this looks like a faster solution.

I don't understand this part at all.

Versioning comes from an approach where I expected very frequent changes
of mmapped areas, when I mapped only part of a segment. Now the segment
is mmapped with a reservation up to its full possible size, so the
addresses of segments can't change (the only problem is segment deletion).
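
For illustration, one way to get "mmapped with reservation" on a POSIX system is to reserve the address range with PROT_NONE and later map the file over it with MAP_FIXED. This is only a sketch of that general technique, not what the patch does: the Segment struct and the segment_reserve and segment_extend names are hypothetical, and growing the file with ftruncate() is an assumption made to keep the example self-contained.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and ftruncate on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one mapped segment. */
typedef struct Segment {
    void  *base;     /* fixed start address of the reservation */
    size_t reserved; /* size of the reserved address range */
    size_t mapped;   /* bytes currently backed by the file */
} Segment;

/* Reserve address space for the segment's maximum size; nothing is usable yet. */
static int segment_reserve(Segment *s, size_t max_size)
{
    assert(s != NULL);
    void *p = mmap(NULL, max_size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap (reserve)");
        return -1;
    }
    s->base = p;
    s->reserved = max_size;
    s->mapped = 0;
    return 0;
}

/* Grow the file to new_len and map it at the start of the reservation. */
static int segment_extend(Segment *s, int fd, size_t new_len)
{
    assert(s != NULL && s->base != NULL);
    if (new_len > s->reserved)
        return -1; /* MAP_FIXED past the reservation would clobber other mappings */
    if (ftruncate(fd, (off_t) new_len) != 0) {
        perror("ftruncate");
        return -1;
    }
    void *p = mmap(s->base, new_len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap (extend)");
        return -1;
    }
    s->mapped = new_len;
    return 0;
}
```

Because every extension maps at the same base, pointers into the already mapped part stay valid as the segment grows. Unmapping on segment deletion is not covered here.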

Regards,
Radek

#58Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#52)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> writes:

... Maybe someone out there is under the impression
that I get high off of rejecting patches; but the statistics you cite
from the CF app don't exactly support the contention that I'm going
around looking for reasons to reject things, or if I am, I'm doing a
pretty terrible job finding them.

Hm ... there are people out there who think *I* get high off rejecting
patches. I have a t-shirt to prove it. But I seem to be pretty
ineffective at it too, judging from these numbers.

regards, tom lane

#59Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Smith (#32)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Greg Smith <greg@2ndquadrant.com> writes:

The last time I tried to do this a few years ago I failed miserably and
never came back. I know way more about building software now though,
and just got this to work for the first time.

BTW, another thing that should be in the try-try-again category is
seeing how close we could get to pgindent's results with GNU indent.
It seems clear to me that a process based on GNU indent would be a
lot easier for a lot of people. We tried that once before, and couldn't
get close enough to want to consider switching, but maybe it just needs
a more determined effort and/or more recent versions of GNU indent.
(ISTR that we hit some things that seemed to be outright bugs in GNU
indent, but this was quite a few years ago.)

regards, tom lane

#60Radosław Smogura
rsmogura@softperience.eu
In reply to: Robert Haas (#54)
Re: MMAP Buffers

On Sun, 17 Apr 2011 21:06:17 -0400, Robert Haas wrote:

On Sun, Apr 17, 2011 at 5:32 PM, Radosław Smogura
<rsmogura@softperience.eu> wrote:

Each process has simple "mirror" of shared descriptors.

I "believe" that modifications to buffer content may only be done when
holding an exclusive lock (with some simple exceptions) (+ MVCC).
Actually, I saw only two things that can change already loaded data and
cause damage, which you have described: setting hint bits during a scan,
and vacuum. The first may only cause, I think, two processes to ask for
the same transaction statuses (except vacuum); the second is impossible,
as vacuum requires an exclusive pin. When a buffer tag is changed, the
version of the buffer is bumped up and checked against the local
version; this concerns reading a buffer.

Yes, an exclusive lock is required for substantive content changes.
But if vacuum cleaning up the buffer is an issue for your patch, then
it's probably also a problem if someone grabs an exclusive content
lock and deletes the tuple (by setting XMAX) and some other backend
later sees the old buffer contents after having in the meanwhile taken
a new snapshot; or if likewise someone grabs an exclusive-lock, adds a
tuple, and then your backend takes a new snapshot and then sees the
old buffer contents. Basically, any time someone grabs an
exclusive-lock and releases it, it's necessary for all observers to
see the updated contents by the time the exclusive lock is released.

In other cases, after obtaining the lock, a check is done whether the
buffer has an associated updatable buffer and whether the local "mirror"
has it too; then a swap should take place.

I think this check would have to be done every time someone
share-locks the buffer, which seems rather expensive.

Logic about updatable buffers is similar to "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed
if someone uses it. But in contrast to "normal buffers", updatable
buffers don't have any support for locking etc. Updatable buffers exist
only on the free list, or when associated with a buffer.

I don't see how you're going to get away with removing buffer locks.
They exist for a reason, and adding mmap() to the mix is going to
require MORE locking, not less.

In the future, I will change the version to a shared segment id,
something like the relation's oid + block, but the ids will have
continuous numbering (1, 2, 3, ...), so I will be able to bypass smgr/md
during reads, as well as the tag version check; this looks like a faster
solution.

I don't understand this part at all.

To clarify my previous post: "updatable buffers" are implemented in
shared memory, so there is no way that a process has its own copy of
the data.

Regards,
Radek.

#61Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#55)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/18/2011 12:48 AM, Greg Smith wrote:

Andrew Dunstan wrote:

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current
directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just
seems totally ill-conceived.

I think the script I submitted upthread has most of the additional
slickness needed here. Looks like we both were working on documenting
a reasonable way to do this at the same time the other day. The idea
of any program here relying on being able to write to /usr/local/bin
as your example did makes this harder for people to run; that's why I
made everything in the build tree and just pushed the appropriate
directories into the PATH.

Since I see providing a script to automate this whole thing as the
preferred way to make this easier, re-packaging the indent source
tarball to extract to a directory doesn't seem worth the backwards
compatibility trouble it will introduce. Improving the entab makefile
I don't have an opinion on.

Personally, I want pgindent installed in /usr/local/ or similar. That
way I can have multiple trees and it will work in all of them without my
having to build it for each. What I don't want is for the installed
patched BSD indent to conflict with the system's indent, which is why I
renamed it. If you still think that's a barrier to easy use, then I
think we need a way to provide hooks in the makefiles for specifying the
install location, so we can both be satisfied.

Since there's no script I know of other than your prototype, I don't
think repackaging is likely to break anything. That makes it worth doing
*now* rather than later.

But frankly, I'd rather do without an extra script if possible.

It might also be worth setting it up so that instead of having to
pass a path to a typedefs file on the command line, we default to a
file sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

OK, so I need to update my script to handle either indenting a single
file, or doing all of them.

Yes, very much.

cheers

andrew

#62Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#58)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Excerpts from Tom Lane's message of Mon Apr 18 02:50:22 -0300 2011:

Robert Haas <robertmhaas@gmail.com> writes:

... Maybe someone out there is under the impression
that I get high off of rejecting patches; but the statistics you cite
from the CF app don't exactly support the contention that I'm going
around looking for reasons to reject things, or if I am, I'm doing a
pretty terrible job finding them.

Hm ... there are people out there who think *I* get high off rejecting
patches. I have a t-shirt to prove it. But I seem to be pretty
ineffective at it too, judging from these numbers.

Does this mean we need an auction to get Robert a nice $1000 t-shirt?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#63Joshua Berkus
josh@agliodbs.com
In reply to: Tom Lane (#58)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert, Tom,

Hm ... there are people out there who think *I* get high off rejecting
patches. I have a t-shirt to prove it. But I seem to be pretty
ineffective at it too, judging from these numbers.

It's a question of how we reject patches, especially first-time patches. We can reject them in a way which makes the submitter more likely to fix them and/or work on something else, or we can reject them in a way which discourages people from submitting to PostgreSQL at all.

For example, the emails to Radoslaw mentioned nothing about pgindent, documented spacing requirements, accidental inclusion of files he didn't mean to touch, etc. Instead, a couple of people told him he should abandon his chosen development IDE in favor of emacs or vim. Radoslaw happens to be thick-skinned and persistent, but other first-time submitters would have given up at that point and run off to a more welcoming project.

Mind, even better would be to get our "so you're submitting a patch" documentation and tools into shape; that way, all we need to do is send the first-time submitter a link. Will work on that between testing ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

#64Josh Berkus
josh@agliodbs.com
In reply to: Alvaro Herrera (#62)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Does this mean we need an auction to get Robert a nice $1000 t-shirt?

... starts hunting through Robert's emails for a good quote ...

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#65Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#59)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/17/2011 11:07 PM, Tom Lane wrote:

BTW, another thing that should be in the try-try-again category is
seeing how close we could get to pgindent's results with GNU indent.
It seems clear to me that a process based on GNU indent would be a
lot easier for a lot of people. We tried that once before, and couldn't
get close enough to want to consider switching, but maybe it just needs
a more determined effort and/or more recent versions of GNU indent.
(ISTR that we hit some things that seemed to be outright bugs in GNU
indent, but this was quite a few years ago.)

That seems like a definite win possibility there.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
Organizers of the PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

#66Robert Haas
robertmhaas@gmail.com
In reply to: Joshua Berkus (#63)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 12:52 PM, Joshua Berkus <josh@agliodbs.com> wrote:

Robert, Tom,

Hm ... there are people out there who think *I* get high off rejecting
patches. I have a t-shirt to prove it. But I seem to be pretty
ineffective at it too, judging from these numbers.

It's a question of how we reject patches, especially first-time patches. We can reject them in a way which makes the submitter more likely to fix them and/or work on something else, or we can reject them in a way which discourages people from submitting to PostgreSQL at all.

For example, the emails to Radoslaw mentioned nothing about pgindent, documented spacing requirements, accidental inclusion of files he didn't mean to touch, etc. Instead, a couple of people told him he should abandon his chosen development IDE in favor of emacs or vim. Radoslaw happens to be thick-skinned and persistent, but other first-time submitters would have given up at that point and run off to a more welcoming project.

Actually, the first reply was a very polite reply from Heikki pointing
out the problem very gently and asking for a theory of operation.

Radoslaw replied and said that he understood the formatting problem,
but his editor was mangling it:

Yes, but, hmm... in NetBeans I had really long gaps (probably 8 spaces, from tabs), so deeper "ifs" and comments at the end of variables went off my screen. I really wanted not to format this, but sometimes I needed to.

That prompted one - ONE! - person to reply and suggest that the use of
another editor might work better. At which point, we got an
apparently-exasperated note from you suggesting that a 10% performance
improvement wasn't enough (which I disagree with) and that it was
wrong for people to worry about whether they could read the patch well
enough to understand it (which I also disagree with). Conceding that
some of the following discussion may have gotten a little harsh
(though frankly I think that was mostly directed at your remark, not
the OP), what prompted that original note? Here it is:

Guys, can we *please* focus on the patch for now, rather than the formatting, which is fixable with sed?

So first of all, no it's not fixable with sed. But secondly, writing
"*please*" here seems to evince a level of frustration which is
entirely out of proportion to the really rather mild comments which
preceded it. What made you write it that way?

I think that the harshness of the reaction to your statement is a
reflection of some underlying frustration on my part and perhaps also
on the part of other reviewers - to this continual commentary that we
are not nice enough to people, especially newcomers. Well, OK, maybe
we're not. But you know what? We're trying really hard, and getting
accused of being nasty when we actually weren't is kind of a tough
pill to swallow. I would really like to see someone go back and look
at every patch from a newcomer that's been submitted in the last year,
and rate the reaction to that patch on an A-F scale. Then let's have
a discussion about what percentage we did well on, and what percentage
we did poorly on, and how we could have done better. When we actually
start raking someone over the coals, I think it's great and a helpful
service for you to jump in and say - hold on a minute, timeout. But
in this case I think you were too quick off the trigger, and I don't
think that acting as if it's unreasonable to want a patch that
conforms to our submission guidelines is doing anyone any favors.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#67Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#66)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 4/18/11 10:57 AM, Robert Haas wrote:

So first of all, no it's not fixable with sed. But secondly, writing
"*please*" here seems to evince a level of frustration which is
entirely out of proportion to the really rather mild comments which
preceded it. What made you write it that way?

I'll admit that the conversation I'd had at the Drizzle BOF the previous
night strongly influenced me.

to this continual commentary that we
are not nice enough to people, especially newcomers. Well, OK, maybe
we're not. But you know what? We're trying really hard, and getting
accused of being nasty when we actually weren't is kind of a tough
pill to swallow ... But
in this case I think you were too quick off the trigger

Well, my apologies to you. You are probably correct.

In any case, I think the answer to this is constructive; better
documentation and tools to let submitters get their code into good shape
in the first place so that we don't have discussions about formatting.
That way we waste *neither* the reviewers' nor the submitters' time.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#68Andrew Dunstan
andrew@dunslane.net
In reply to: Joshua D. Drake (#65)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/18/2011 01:46 PM, Joshua D. Drake wrote:

On 04/17/2011 11:07 PM, Tom Lane wrote:

BTW, another thing that should be in the try-try-again category is
seeing how close we could get to pgindent's results with GNU indent.
It seems clear to me that a process based on GNU indent would be a
lot easier for a lot of people. We tried that once before, and couldn't
get close enough to want to consider switching, but maybe it just needs
a more determined effort and/or more recent versions of GNU indent.
(ISTR that we hit some things that seemed to be outright bugs in GNU
indent, but this was quite a few years ago.)

That seems like a definite win possibility there.

If you're aware of any changes in GNU indent that would overcome the
previous issues, then by all means spend the time on it. If not, it
seems a bit like the definition of insanity ("repeating an experiment
with the expectation of a different result").

cheers

andrew

#69Alvaro Herrera
alvherre@commandprompt.com
In reply to: Andrew Dunstan (#68)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Excerpts from Andrew Dunstan's message of Mon Apr 18 19:20:30 -0300 2011:

On 04/18/2011 01:46 PM, Joshua D. Drake wrote:

On 04/17/2011 11:07 PM, Tom Lane wrote:

BTW, another thing that should be in the try-try-again category is
seeing how close we could get to pgindent's results with GNU indent.
It seems clear to me that a process based on GNU indent would be a
lot easier for a lot of people. We tried that once before, and couldn't
get close enough to want to consider switching, but maybe it just needs
a more determined effort and/or more recent versions of GNU indent.
(ISTR that we hit some things that seemed to be outright bugs in GNU
indent, but this was quite a few years ago.)

That seems like a definite win possibility there.

If you're aware of any changes in GNU indent that would overcome the
previous issues, then by all means spend the time on it. If not, it
seems a bit like the definition of insanity ("repeating an experiment
with the expectation of a different result").

The source of GNU indent itself is 3x what it was when the experiment
was last reported. (I checked this about a year ago with an eye on
"repeating the experiment" but then I failed to actually do it.) It
seems fair to say that, yes, it has changed a bit in the meantime.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#69)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Alvaro Herrera <alvherre@commandprompt.com> writes:

Excerpts from Andrew Dunstan's message of Mon Apr 18 19:20:30 -0300 2011:

On 04/17/2011 11:07 PM, Tom Lane wrote:

... but maybe it just needs
a more determined effort and/or more recent versions of GNU indent.

If you're aware of any changes in GNU indent that would overcome the
previous issues, then by all means spend the time on it. If not, it
seems a bit like the definition of insanity ("repeating an experiment
with the expectation of a different result").

The source of GNU indent itself is 3x what it was when the experiment
was last reported. (I checked this about a year ago with an eye on
"repeating the experiment" but then I failed to actually do it.) It
seems fair to say that, yes, it has changed a bit in the meantime.

Also, my recollection of the previous go-round is that we gave up rather
quickly. Maybe by now there is somebody willing to put more than
minimal effort into getting the options just so.

regards, tom lane

#71Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#67)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 3:17 PM, Josh Berkus <josh@agliodbs.com> wrote:

In any case, I think the answer to this is constructive; better
documentation and tools to let submitters get their code into good shape
in the first place so that we don't have discussions about formatting.
That way we waste *neither* the reviewers' nor the submitters' time.

Well, I'm all in favor of better documentation, but I think the
biggest thing we need to do is get the word out:

1. We realize we have been too trigger-happy sometimes.
2. But we really want you to participate.
3. And we are trying very hard to do better.
4. And please tell us if we screw up, so we can keep working on it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#72Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#71)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert,

1. We realize we have been too trigger-happy sometimes.
2. But we really want you to participate.
3. And we are trying very hard to do better.
4. And please tell us if we screw up, so we can keep working on it.

I received a private offlist email from someone who didn't feel
comfortable bringing up their issues with this list publicly. Let me
quote from it, because I think it pins part of the issue:

"I believe this is due to the current postgresql "commitfest" process
whereby there is no real way to present new ideas or technologies
without coming to the table with a fully-baked plan and patch. This is
obvious even in the name "commitfest" since the expectation is that
every patch presented is considered ready-to-commit by the patch
presenter. This makes a novice or experimental contribution less likely."

You'll notice that this has been a complaint of veteran contributors as
well; WIP patches either get no review, or get reviewed as if they were
expected to be committable.

The person who e-mailed me suggests some form of PostgreSQL Incubator as
a solution. I'm not sure about that, but it does seem to me that we
need somewhere or some way that people can submit patches, ideas, git
forks, etc., for discussion without that discussion needing to
immediately move to the cleanliness/maintainability/supportable status
of the patch.

I'm concerned though that if these WIP projects don't get to -hackers,
then their creators won't get the feedback they really need.

Thoughts?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#73Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Haas (#71)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/18/2011 06:38 PM, Robert Haas wrote:

On Mon, Apr 18, 2011 at 3:17 PM, Josh Berkus <josh@agliodbs.com> wrote:

In any case, I think the answer to this is constructive; better
documentation and tools to let submitters get their code into good shape
in the first place so that we don't have discussions about formatting.
That way we waste *neither* the reviewers' nor the submitters' time.

Well, I'm all in favor of better documentation, but I think the
biggest thing we need to do is get the word out:

1. We realize we have been too trigger-happy sometimes.
2. But we really want you to participate.
3. And we are trying very hard to do better.
4. And please tell us if we screw up, so we can keep working on it.

I think Robert has hit the nail on the head. As I mentioned at #PgWest,
we are a 1000 person dysfunctional family. David Fetter reminded me
gently (yes really) that as far as 1000 person families go, we're doing
pretty good. We are one of the last true communities left.

We need to find a way to let people know that we are only gruff because
of experience, and that although we can be rough, we welcome
participation and we try really hard. We are engineers (well, I'm
not...) but most of us are.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
Organizers of the PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

#74Alex Hunsaker
badalex@gmail.com
In reply to: Josh Berkus (#72)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 19:50, Josh Berkus <josh@agliodbs.com> wrote:

You'll notice that this has been a complaint of veteran contributors as
well; WIP patches either get no review, or get reviewed as if they were
expected to be committable.

I don't see this changing anytime in the future. We have a hard enough
time getting "finished" patches reviewed.

The person who e-mailed me suggests some form of PostgreSQL Incubator as
a solution. I'm not sure about that, but it does seem to me that we
need somewhere or some way that people can submit patches, ideas, git
forks, etc., for discussion without that discussion needing to
immediately move to the cleanliness/maintainability/supportable status
of the patch.

Reminds me a bit of what linux is doing with the "staging" tree. I
don't see any way for that to work with postgres (lower the bar for
-contrib?).

You can fork fairly easily with github nowadays. For example the replace
GEQ with SA is on one of those git sites. Does that mean it gets any
attention? *shrug*

#75Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#72)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 9:50 PM, Josh Berkus <josh@agliodbs.com> wrote:

Robert,

1. We realize we have been too trigger-happy sometimes.
2. But we really want you to participate.
3. And we are trying very hard to do better.
4. And please tell us if we screw up, so we can keep working on it.

I received a private offlist email from someone who didn't feel
comfortable bringing up their issues with this list publicly. Let me
quote from it, because I think it pins part of the issue:

"I believe this is due to the current postgresql "commitfest" process
whereby there is no real way to present new ideas or technologies
without coming to the table with a fully-baked plan and patch. This is
obvious even in the name "commitfest" since the expectation is that
every patch presented is considered ready-to-commit by the patch
presenter. This makes a novice or experimental contribution less likely."

You'll notice that this has been a complaint of veteran contributors as
well; WIP patches either get no review, or get reviewed as if they were
expected to be committable.

The person who e-mailed me suggests some form of PostgreSQL Incubator as
a solution. I'm not sure about that, but it does seem to me that we
need somewhere or some way that people can submit patches, ideas, git
forks, etc., for discussion without that discussion needing to
immediately move to the cleanliness/maintainability/supportable status
of the patch.

I'm concerned though that if these WIP projects don't get to -hackers,
then their creators won't get the feedback they really need.

Thoughts?

I think the quality of review that WIP patches get depends very much
on how specific the submitter is about what they'd like to get out of
the process. If you submit a patch and say "I have this cool patch
that allows FTL travel, but it's WIP, please review" then you're
basically asking some poor schmuck to reverse engineer what the patch
is doing, and, when they find problems with it, guess which of those
problems were things that you didn't think of and which were things
that you knew about but haven't gotten around to fixing yet because
you're still working on it. This is a pretty thankless task for the
reviewer, and it's not surprising that it doesn't go well. However,
if you say "I have this cool patch that allows FTL travel. It currently
plays havoc with the transporter beams and the dilithium crystals tend
to shatter if you exceed Warp 3, but I'd like to get a check as to
whether the basic design is sound, and if anyone can see why the
Heisenberg compensator is destabilizing, please let me know", your
chances of getting some useful feedback are pretty good. Sometimes it
even provokes a rather competitive spot-the-bug race...

Also, I think the reason why we have a process called CommitFest and
not a process called BrainstormingFest is because, when we didn't have
a CommitFest process, patches fell on the floor. Since we've added
that process, that problem has largely gone away. But it is generally
not difficult to get a review of a "big idea" for which no code has
been written yet - in fact it's often much faster and easier than
getting a patch reviewed. It's true that there have been occasional
times when people have gotten lightly toasted for bringing up big new
ideas in the middle of a CF or beta period, but I think we've gotten
less pedantic about that. Certainly, there is no shortage of ideas
that have been proposed and commented on over the last few weeks, even
as we have been working to get 9.1beta1 out the door. Code is not
really getting reviewed right now, but ideas *are*. I'm not going to
claim that this works perfectly: the way that ideas are presented and
the relative level of interest and/or exhaustion of the people
responding certainly play a role, but it is pretty rare for an email
to -hackers to get no answer at all. Maybe we need some formal
process here just to make people more comfortable, but it's not
necessary from a workflow perspective.

Thinking back over the kinds of things that have led to people
getting jumped on, I think I can identify a pattern: people tend to
get jumped on when they allege that our code sucks, or that they're
smarter than we are. Whether or not they actually meant to imply
those things turns out not to matter - it rubs people the wrong way,
and everyone's a volunteer, so when you rub them the wrong way, they
get annoyed. I make an effort, as I think most of us do, to be aware
that just because someone makes an annoying remark doesn't necessarily
mean that they are an annoying person; it just means they haven't
quite figured it all out yet. But there are still people who get
flamed that way far more than they probably deserve. That's an area
we can improve, but in the meantime, approaching the topic with a bit
of humility goes a long way. I can't remember the last time someone
said "I was thinking about working on ... and I thought I might
approach it by ... Does this seem like a good idea? Is it likely to
be too hard for me to tackle? My skillset is ..." and got flamed for
it. Some people here (myself included) get a bit pricklier than we
probably ought to from time to time, but everyone is well-meaning and
sincerely wants to help. The list of users who have had Tom fix a bug
for them within hours of posting a question is not short, and the list
of people who have spent time and energy helping newcomers get started
with PostgreSQL tuning, hacking, or whatever is very long.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#76Christopher Browne
cbbrowne@gmail.com
In reply to: Alex Hunsaker (#74)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 11:15 PM, Alex Hunsaker <badalex@gmail.com> wrote:

On Mon, Apr 18, 2011 at 19:50, Josh Berkus <josh@agliodbs.com> wrote:

You'll notice that this has been a complaint of veteran contributors as
well; WIP patches either get no review, or get reviewed as if they were
expected to be committable.

I don't see this changing anytime in the future. We have a hard enough
time getting "finished" patches reviewed.

Sadly so.

As much as I think we have gotten a LOT of useful mileage out of the
"commitfest" concept, it does, conceptually, have a strong bias
(including in its very name) towards the assumption that changes are
pretty much ready to commit.

Two items still undergoing work (collations, sync rep) weren't at that
level of readiness, needing some mere "dusting off" to make them
ready. Rather, they needed substantial examination and modification
before they'd be ready. And, while this has doubtless aroused some
ire, it doesn't intrinsically make those items "broken."

The Apache guys may be onto something in having the "incubator"
moniker, for things that aren't "so ready we're calling them
Committable."

There may be merit to separating out "easy to commit" and "tougher to
commit" items, and having different kinds of pickiness for them, the
former being good fodder for "Easy CommitFest" and the latter being
"PG Incubation."

Though I'm not sure the latter makes it any easier to get tough
features like synchronous replication into place.

The person who e-mailed me suggests some form of PostgreSQL Incubator as
a solution. I'm not sure about that, but it does seem to me that we
need somewhere or some way that people can submit patches, ideas, git
forks, etc., for discussion without that discussion needing to
immediately move to the cleanliness/maintainability/supportable status
of the patch.

Reminds me a bit of what linux is doing with the "staging" tree. I
don't see any way for that to work with postgres (lower the bar for
-contrib?).

You can fork fairly easily with GitHub nowadays. For example the replace
GEQ with SA is on one of those git sites. Does that mean it gets any
attention? *shrug*

Well, the project hasn't been on Git for all that spectacularly long a
time, so the comfort level with managing via forks maybe isn't quite
there yet.

Forking isn't as magically delicious as GitHub might make some
imagine; it's fine and useful to have a bunch of forks, and eventually
merge useful ones, when they are remaining pretty close together, and
don't conflict. That's likely to work out happily for features that
are essentially independent. If you and I are hacking on different
contrib modules, that's pretty "essentially independent."

Unfortunately, deeper features are more likely to be more
interdependent, and forks aren't so readily productive in that case.

If we hack around with formatting, that would muck with *everything*
else, as an even worse "for instance."
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

#77Robert Haas
robertmhaas@gmail.com
In reply to: Christopher Browne (#76)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, Apr 18, 2011 at 11:50 PM, Christopher Browne <cbbrowne@gmail.com> wrote:

Two items still undergoing work (collations, sync rep) weren't at that
level of readiness, needing some mere "dusting off" to make them
ready.  Rather, they needed substantial examination and modification
before they'd be ready.  And, while this has doubtless aroused some
ire, it doesn't intrinsically make those items "broken."

I don't think it really aroused that much ire. It's pretty clear that
both of those patches cost us something on the schedule, and I would
have preferred to see them committed sooner and with fewer bugs. But
they are great features. Unfortunately, we have a tendency to leave
things to the last minute, and that's something I think we could
improve. We have gotten a bit better but there is clearly room for
further improvement. With beta having gotten pushed out to the end of
the month, there is a real chance that we are going to end up
releasing in the fall again, and I would have much preferred July 1.
But given how long CF4 lasted and how much surgery was required
afterwards, it was an unfixable problem. It's not going to get any
better unless we get more serious about getting these big features
done early in the cycle, or postponing them to the next release if
they aren't. Anyway, I'm drifting off topic: nothing against the
patches, at least on my part, just want to make the schedule.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#78Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#72)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Josh Berkus <josh@agliodbs.com> writes:

I received a private offlist email from someone who didn't feel
comfortable bringing up their issues with this list publicly. Let me
quote from it, because I think it pins part of the issue:

"I believe this is due to the current postgresql "commitfest" process
whereby there is no real way to present new ideas or technologies
without coming to the table with a fully-baked plan and patch. This is
obvious even in the name "commitfest" since the expectation is that
every patch presented is considered ready-to-commit by the patch
presenter. This makes a novice or experimental contribution less likely."

As Robert noted, the purpose of the commitfest mechanism is mostly to
ensure that patches that *are* committable, or close to it, don't fall
through the cracks. I'm not sure we're doing anybody any favors by
trying to shoehorn reviews of WIP ideas into that same process. At the
very least it seems we'd need a different set of review guidelines for
WIP items, and we don't have one.

I think useful reviewing of WIP stuff has to focus much more on design
concepts and much less on code reading. The reason why the mmap patch
was getting such negative feedback was that there was no way to provide
such a review except by reverse-engineering the design out of some very
messily-presented code. So if we're going to do anything about this,
what we have to do is tell people that the first thing to present for
a WIP review is a design document. If they feel a need to write some
throwaway code to help them clarify their ideas, fine ... but *don't
show us that code*. Write a design document. Get that reviewed.
Then see about coding it, or bringing your first-draft code up to the
point where it can stand the light of day.

I don't know if we need a formal process akin to CFs for reviewing
design documents. I think people are usually plenty willing to discuss
ideas on -hackers, unless maybe you hit them at a particularly bad time
like when they're already burnt out towards the end of a CF.

regards, tom lane

#79Alvaro Herrera
alvherre@commandprompt.com
In reply to: Tom Lane (#78)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Excerpts from Tom Lane's message of Tue Apr 19 03:34:34 -0300 2011:

As Robert noted, the purpose of the commitfest mechanism is mostly to
ensure that patches that *are* committable, or close to it, don't fall
through the cracks. I'm not sure we're doing anybody any favors by
trying to shoehorn reviews of WIP ideas into that same process. At the
very least it seems we'd need a different set of review guidelines for
WIP items, and we don't have one.

I think this is historical revisionism. Commitfests were mostly created
because of pressure due to the lateness of the HOT patch. Probably
there were other factors too but this is likely the single most
important reason. (I think the term "commitfest" was coined later, but
I don't think this invalidates my point.)

And the way we considered things at the time is that we had failed to
review the concepts in the WIP HOT patch in a timely manner. So
we wanted to ensure that we provided good feedback on WIP patches (on
all patches really) to keep this failure from repeating. All patches
*and WIP ideas* were supposed to be reviewed by someone, and if they
were to be rejected, some rationale was to be provided.

Somewhere down the line this seems to have been forgotten and we are now
using commitfests just to track finished patches.

So if we want to stick to the original principles we should have some
sort of "different set of review guidelines". Or perhaps we could just
decide that we don't care much about this problem and toss it aside.

Maybe this is something to discuss at the next developer's meeting.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#80Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#79)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Alvaro Herrera <alvherre@commandprompt.com> writes:

I think this is historical revisionism. ...
Somewhere down the line this seems to have been forgotten and we are now
using commitfests just to track finished patches.

So if we want to stick to the original principles we should have some
sort of "different set of review guidelines". Or perhaps we could just
decide that we don't care much about this problem and toss it aside.

Well, I absolutely think that we need to encourage people to get
feedback at the design and prototype stages. The problem with the
commitfest mechanism for that is that when you are trying to work out a
patch, you don't want to wait around for a couple months for comments.
The time delay that's built into the CF process means that it's
fundamentally not very good for anything except finished patches that
can sit on a shelf for awhile before they get applied.

I think that ideally, WIP reviews would be something that happens
quickly on pgsql-hackers, and probably it would be best if they were
explicitly *not* encouraged while a CF is on. I know that I tend to see
discussions of unfinished patches as something of a distraction when
I'm up to my ears in committing finished ones, and certainly there's
less mental bandwidth available then.

Maybe this is something to discuss at the next developer's meeting.

I'd rather talk about it on-list so we can get comments from a wider
circle of people.

regards, tom lane

#81Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#80)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 5:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Well, I absolutely think that we need to encourage people to get
feedback at the design and prototype stages.  The problem with the
commitfest mechanism for that is that when you are trying to work out a
patch, you don't want to wait around for a couple months for comments.
The time delay that's built into the CF process means that it's
fundamentally not very good for anything except finished patches that
can sit on a shelf for awhile before they get applied.

From my point of view I definitely thought the rationale for
commitfests was as a kind of checkpoint to make sure there weren't any
developers waiting for feedback for a long time. My concurrent index
build patch ended up needing to be reworked and I would have liked to
be involved but it wasn't until feature freeze that you found all
these problems and then it was too late to wait for me to recode
things instead of having you just do it.

I admit though this whole concept of "finished patches" seems foreign
to me. I always have additional stuff I want to do and if the patch
sits on the shelf I'm essentially stuck unable to work on the next
great thing that that patch enables. Developers either have the option
to go off on their own with no feedback and risk having initial
assumptions questioned later and all their work invalidated or go and
work on something unrelated leaving this direction stunted with only
one round of features implemented. I think this is how we ended up
with partitioning that's only halfway useful and selinux that had tons
of code written that needed to be reworked.

Core developers' attention is precious and we can't really dictate that
Tom must respond to every email within a week or anything crazy like
that. The commitfests are a dramatic improvement over waiting until
feature freeze which was what was happening before. They also help
bring in new committers and having Robert and Heikki and Peter and
others giving substantive feedback has also improved things
dramatically.

To use a database analogy I think of the commitfests as a checkpoint
-- that doesn't mean we don't also need bgwriter and don't
occasionally need to flush dirty buffers to enable the database to
make progress in the meantime. But if we didn't have checkpoints at
all things would definitely fall through the cracks and get lost to
bitrot and brainfade.

--
greg

#82Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#80)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 12:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

I think this is historical revisionism. ...
Somewhere down the line this seems to have been forgotten and we are now
using commitfests just to track finished patches.

So if we want to stick to the original principles we should have some
sort of "different set of review guidelines".  Or perhaps we could just
decide that we don't care much about this problem and toss it aside.

Well, I absolutely think that we need to encourage people to get
feedback at the design and prototype stages.  The problem with the
commitfest mechanism for that is that when you are trying to work out a
patch, you don't want to wait around for a couple months for comments.
The time delay that's built into the CF process means that it's
fundamentally not very good for anything except finished patches that
can sit on a shelf for awhile before they get applied.

I think that ideally, WIP reviews would be something that happens
quickly on pgsql-hackers, and probably it would be best if they were
explicitly *not* encouraged while a CF is on.  I know that I tend to see
discussions of unfinished patches as something of a distraction when
I'm up to my ears in committing finished ones, and certainly there's
less mental bandwidth available then.

Ditto.

Unfortunately, my memory of this project only goes back to about
September 2008, which isn't far enough to remember why CommitFests
were created in the first place. So Alvaro may be correct in saying
that things have mutated over time, but that isn't necessarily a bad
thing. Maybe we've settled into something that works reasonably well.
Or maybe we should make some changes; nothing is set in stone.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#83Peter Eisentraut
peter_e@gmx.net
In reply to: Greg Stark (#81)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 17:52 +0100, Greg Stark wrote:

I admit though this whole concept of "finished patches" seems foreign
to me. I always have additional stuff I want to do and if the patch
sits on the shelf I'm essentially stuck unable to work on the next
great thing that that patch enables. Developers either have the option
to go off on their own with no feedback and risk having initial
assumptions questioned later and all their work invalidated or go and
work on something unrelated leaving this direction stunted with only
one round of features implemented. I think this is how we ended up
with partitioning that's only halfway useful and selinux that had tons
of code written that needed to be reworked.

Yeah, there appear to be occasional assumptions that one ought to work
on one major feature per release, and ideally you'd have the plan ready
for the first commit fest and the code mostly ready for the third commit
fest. Whereas I agree with you that it's often rather the case that you
want to work on say three incremental features, and in order for that to
work out under this process, you really have to get the first increment
perfect for the first commit fest already. Which is difficult if no one
pays attention until the commit fest starts.

#84Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#80)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 12:21 -0400, Tom Lane wrote:

Well, I absolutely think that we need to encourage people to get
feedback at the design and prototype stages. The problem with the
commitfest mechanism for that is that when you are trying to work out
a patch, you don't want to wait around for a couple months for
comments. The time delay that's built into the CF process means that
it's fundamentally not very good for anything except finished patches
that can sit on a shelf for awhile before they get applied.

I think that ideally, WIP reviews would be something that happens
quickly on pgsql-hackers, and probably it would be best if they were
explicitly *not* encouraged while a CF is on. I know that I tend to
see discussions of unfinished patches as something of a distraction
when I'm up to my ears in committing finished ones, and certainly
there's less mental bandwidth available then.

Well, the current process certainly places a lot of emphasis on the
"finishing" part. You have commit fests that nominally account for 50%
of development time, and then beta, RC, limbo, backbranch releases -- I
blogged about this a while ago, if you follow all these guidelines and
encouragements, you are left with all of about 20 days per year for
discussion, collaborative planning and coding. Which is obviously
silly, which is why the process breaks down. People do other things as
commit fests fade out, but they subconsciously fear they will get the
stink for it, so public discussion and planning is effectively stifled.

I think we should put less temporal emphasis on the finishing part, but
use the time better. I would imagine one commit fest per month, but
it's only a week long. Then everyone can really concentrate on the
commit fest, people get faster feedback, but there is ultimately more
time to do other things. Something to think about.

#85Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#82)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert,

Unfortunately, my memory of this project only goes back to about
September 2008, which isn't far enough to remember why CommitFests
were created in the first place. So Alvaro may be correct in saying
that things have mutated over time, but that isn't necessarily a bad
thing. Maybe we've settled into something that works reasonably well.
Or maybe we should make some changes; nothing is set in stone.

Review of design concepts and WIP patches has *always* been a problem
for this project. Andrew Sullivan bitched about it at some length back
in 2004 ("Why there is no traffic on pgsql-replicationhooks", but
Andrew's blog is down now unfortunately). And I've gotten complaints
from numerous people: the Drizzle student, the person who e-mailed me,
Afilias, Greenplum, Aster Data, others. It's just a broken process, and
it particularly leads PostgreSQL forks to not contribute back stuff.

We tell people to submit a design concept, but then such submissions are
often ignored. When they're not ignored, they often are subject to
either extreme bikeshedding or a lot of negativity around things the
author hasn't implemented yet ... even if the author warns that they're
not implemented.

(btw, I'm not talking about the MMAP patch here, which has gotten
excellent review at this point. I'm talking about a lot of other patches)

I think that Robert is right and what we need is a completely different
process for WIP patches and design concepts. It's pretty clear that
none of the processes we've tried so far ("just post it to
pgsql-hackers", "get a submission mentor" and "commitfest") have worked
consistently.

So in the spirit of NOT reinventing the wheel: ReviewBoard. Yes,
really. One of the big issues with working through design reviews etc.
on this mailing list is the lack of continuity and timeliness in
comments on the idea/WIP patch. Having an interface which presents all
of the discussion around a specific patch in a threaded and
chronological way would help cut down on bikeshedding and dogpiling, as
well as allowing the idea/patch author to review all commentary in
a coherent way.

Maybe we don't want to use ReviewBoard specifically. Maybe we want to
use Bugzilla or Crucible or Redmine or something more specific for
patch/spec review. But I think it's time to try something else, maybe
several other things.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#86Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#84)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Peter Eisentraut <peter_e@gmx.net> writes:

I think we should put less temporal emphasis on the finishing part, but
use the time better. I would imagine one commit fest per month, but
it's only a week long. Then everyone can really concentrate on the
commit fest, people get faster feedback, but there is ultimately more
time to do other things. Something to think about.

Yeah, maybe. To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes. But that might be a good thing
all around. I find this idea attractive.

regards, tom lane

#87Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#86)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wednesday, April 20, 2011 08:50:04 PM Tom Lane wrote:

I think we should put less temporal emphasis on the finishing part, but
use the time better. I would imagine one commit fest per month, but
it's only a week long. Then everyone can really concentrate on the
commit fest, people get faster feedback, but there is ultimately more
time to do other things. Something to think about.

Yeah, maybe. To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes. But that might be a good thing
all around. I find this idea attractive.

Actually as a patch submitter I would somewhat prefer that as well. It's not
exactly easy to learn what wasn't optimal with your patch at times.

On the other hand for some issues it's pretty hard to fix the more involved
issues without e.g. Tom's involvement.

Andres

#88Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#85)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 2:39 PM, Josh Berkus <josh@agliodbs.com> wrote:

Review of design concepts and WIP patches has *always* been a problem
for this project.  Andrew Sullivan bitched about it at some length back
in 2004 ("Why there is no traffic on pgsql-replicationhooks", but
Andrew's blog is down now unfortunately).  And I've gotten complaints
from numerous people: the Drizzle student, the person who e-mailed me,
Afilias, Greenplum, Aster Data, others.  It's just a broken process, and
it particularly leads PostgreSQL forks to not contribute back stuff.

We tell people to submit a design concept, but then such submissions are
often ignored.

Please provide the evidence that this is a problem that exists now, as
opposed to seven years ago. I leave pgsql-hackers emails marked
unread until they have gotten a response, especially if it's something
important like a design proposal. I have 10 unread threads at the
moment; and I don't think any of them are design proposals except
possibly "Still more REINDEX fun", which was posted 9 minutes ago by
Tom - presumably not the case you are concerned about. I have worked
extremely hard to make sure that we do NOT ignore such submissions,
and I would like to hold your feet to the fire on this one a little
bit: let's hear the list of design ideas that have been proposed in
the last year and been ignored. If the process is as bad as you are
alleging, you should find it easy to come up with numerous, recent
examples. I bet you can't.

When they're not ignored, they often are subject to
either extreme bikeshedding or a lot of negativity around things the
author hasn't implemented yet ... even if the author warns that they're
not implemented.

I concede that this happens, but I don't believe it happens nearly as
often as it used to, and, again, let's have some recent examples. I
don't care what happened three years ago; a lot has changed in the
last three years.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#89Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#88)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 4/20/11 12:00 PM, Robert Haas wrote:

Please provide the evidence that this is a problem that exists now, as
opposed to seven years ago.

Since you've clearly already made up your mind that no problem exists, I
don't have the energy to fight it out with you.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#90Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#87)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 2:53 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday, April 20, 2011 08:50:04 PM Tom Lane wrote:

I think we should put less temporal emphasis on the finishing part, but
use the time better.  I would imagine one commit fest per month, but
it's only a week long.  Then everyone can really concentrate on the
commit fest, people get faster feedback, but there is ultimately more
time to do other things.  Something to think about.

Yeah, maybe.  To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes.  But that might be a good thing
all around.  I find this idea attractive.

Actually as a patch submitter I would somewhat prefer that as well. It's not
exactly easy to learn what wasn't optimal with your patch at times.

On the other hand for some issues it's pretty hard to fix the more involved
issues without e.g. Tom's involvement.

This would amount to reducing the amount of time we spend
in-CommitFest from 50% to slightly less than 25%. That would
certainly be pleasant from my point of view, but for the average patch
to get the same amount of attention, we'd need twice as many
volunteers, or the existing people to volunteer twice as much time, or
everyone to work twice as fast as they already are. That's not
impossible, if the new system inspires more people to contribute, but
2x is a lot, especially when you correct for relative skill levels:
we're not going to find another Tom Lane.

Still, it's an interesting thought.
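
As a rough check of the figures above, here is a minimal sketch of the arithmetic. It assumes twelve one-week CommitFests per year (Peter's proposal) against the nominal 50% in-CommitFest share; the function and variable names are only illustrative.

```python
# Back-of-the-envelope check of the "50% to slightly less than 25%" estimate.
# Assumption: one 7-day CommitFest every calendar month, 365.25 days per year.

DAYS_PER_YEAR = 365.25


def in_commitfest_fraction(fests_per_year: int, days_per_fest: int) -> float:
    """Fraction of the year spent inside a CommitFest."""
    return fests_per_year * days_per_fest / DAYS_PER_YEAR


current = 0.50                             # nominal share under the current process
proposed = in_commitfest_fraction(12, 7)   # 84 days out of 365.25

# How much faster review would have to go for a patch to get the same attention.
speedup_needed = current / proposed

print(f"proposed share: {proposed:.1%}")
print(f"speed-up needed: {speedup_needed:.2f}x")
```

Under these assumptions the proposed share comes out a little under 25%, so the required speed-up is slightly more than 2x, which matches the "twice as many volunteers" estimate.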

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#91Andres Freund
andres@anarazel.de
In reply to: Josh Berkus (#85)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wednesday, April 20, 2011 08:39:47 PM Josh Berkus wrote:

Robert,

Unfortunately, my memory of this project only goes back to about
September 2008, which isn't far enough to remember why CommitFests
were created in the first place. So Alvaro may be correct in saying
that things have mutated over time, but that isn't necessarily a bad
thing. Maybe we've settled into something that works reasonably well.

Or maybe we should make some changes; nothing is set in stone.

Review of design concepts and WIP patches has *always* been a problem
for this project. Andrew Sullivan bitched about it at some length back
in 2004 ("Why there is no traffic on pgsql-replicationhooks", but
Andrew's blog is down now unfortunately). And I've gotten complaints
from numerous people: the Drizzle student, the person who e-mailed me,
Afilias, Greenplum, Aster Data, others. It's just a broken process, and
it particularly leads PostgreSQL forks to not contribute back stuff.

Well, but very few company people contribute back by reviewing stuff from
others. At least in the time I have somewhat regularly

We tell people to submit a design concept, but then such submissions are
often ignored. When they're not ignored, they often are subject to
either extreme bikeshedding or a lot of negativity around things the
author hasn't implemented yet ... even if the author warns that they're
not implemented.

I can see that point.

I think that Robert is right and what we need is a completely different
process for WIP patches and design concepts. It's pretty clear that
none of the processes we've tried so far ("just post it to
pgsql-hackers", "get a submission mentor" and "commitfest") have worked
consistently.

So in the spirit of NOT reinventing the wheel: ReviewBoard. Yes,
really. One of the big issues with working through design reviews etc.
on this mailing list is the lack of continuity and timeliness in
comments on the idea/WIP patch. Having an interface which presents all
of the discussion around a specific patch in a threaded and
chronological way would help cut down on bikeshedding and dogpiling, as
well as allowing the idea/patch author to review all commentary in
a coherent way.

I don't believe for a second that this problem is solved by any tool. In my
opinion there simply are very few people able to do in-depth reviews of complex
patches. And those are also needed to implement complex features or do parts
of features others could not do.

A RRR like process doesn't really help in those cases except catch the most
obvious problems.

Andres

#92Joshua D. Drake
jd@commandprompt.com
In reply to: Josh Berkus (#89)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/20/2011 12:05 PM, Josh Berkus wrote:

On 4/20/11 12:00 PM, Robert Haas wrote:

Please provide the evidence that this is a problem that exists now, as
opposed to seven years ago.

Since you've clearly already made up your mind that no problem exists, I
don't have the energy to fight it out with you.

Well, you aren't fighting alone. We have significant problems in this
area. As you said, we always have. There is also a bizarre, almost
insane objection to using tools that "aren't invented here" to solve
problems. The problems you (Josh) present are real, regardless of
Robert's opinion. The thing that is important for everyone to remember
is PERCEPTION IS REALITY.

If people PERCEIVE there is a problem, THERE IS A PROBLEM.

So Robert, with respect to your "show me the money", the money is at
your feet on the floor. JB and I can list multitudes of hackers and
contributors who have the perception of this problem and that perception
is hurting the project because frankly, Aster Data isn't going to waste
its valuable time (money) fighting our community. We have to make it
damn freaking easy for them or we lose their interest, and thus the
community loses.

From the whales of discontentment society,

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
Organizers of the PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

#93Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#90)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wednesday, April 20, 2011 09:09:48 PM Robert Haas wrote:

On Wed, Apr 20, 2011 at 2:53 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday, April 20, 2011 08:50:04 PM Tom Lane wrote:

Yeah, maybe. To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes. But that might be a good thing
all around. I find this idea attractive.

Actually as a patch submitter I would somewhat prefer that as well. It's
not exactly easy to learn what wasn't optimal with your patch at times.
On the other hand for some issues it's pretty hard to fix the more
involved issues without e.g. Tom's involvement.

This would amount to reducing the amount of time we spend
in-CommitFest from 50% to slightly less than 25%. That would
certainly be pleasant from my point of view, but for the average patch
to get the same amount of attention, we'd need twice as many
volunteers, or the existing people to volunteer twice as much time, or
everyone to work twice as fast as they already are. That's not
impossible, if the new system inspires more people to contribute, but
2x is a lot, especially when you correct for relative skill levels:
we're not going to find another Tom Lane.
Still, it's an interesting thought.

Additional points:
* perhaps it also frees up time if committers balk earlier if a patch doesn't
  meet some requirement
* Patch submitters learn more:
  * so they submit better patches in the future
  * so they can apply the same standards when they review other patches

Andres

#94Robert Haas
robertmhaas@gmail.com
In reply to: Joshua D. Drake (#92)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 3:13 PM, Joshua D. Drake <jd@commandprompt.com> wrote:

On 04/20/2011 12:05 PM, Josh Berkus wrote:

On 4/20/11 12:00 PM, Robert Haas wrote:

Please provide the evidence that this is a problem that exists now, as
opposed to seven years ago.

Since you've clearly already made up your mind that no problem exists, I
don't have the energy to fight it out with you.

Well, you aren't fighting alone. We have significant problems in this area.
As you said, we always have. There is also a bizarre, almost insane
objection to using tools that "aren't invented here" to solve problems. The
problems you (Josh) present are real, regardless of Robert's opinion. The
thing that is important for everyone to remember is PERCEPTION IS REALITY.

If people PERCEIVE there is a problem, THERE IS A PROBLEM.

Absolutely. And I am perfectly well aware that we have screwed this
up from time to time. But I also know that I have spent a very large
amount of time over the last few years trying to improve things. It
would be nice to know whether that has had any impact. If it hasn't,
then half of what I have spent the last two years doing has been a
waste of time.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#95Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#87)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wednesday, April 20, 2011 08:53:34 PM Andres Freund wrote:

On Wednesday, April 20, 2011 08:50:04 PM Tom Lane wrote:

I think we should put less temporal emphasis on the finishing part, but
use the time better. I would imagine one commit fest per month, but
it's only a week long. Then everyone can really concentrate on the
commit fest, people get faster feedback, but there is ultimately more
time to do other things. Something to think about.

Yeah, maybe. To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes. But that might be a good thing
all around. I find this idea attractive.

Actually as a patch submitter I would somewhat prefer that as well. It's not
exactly easy to learn what wasn't optimal with your patch at times.

Perhaps we should adapt something like the kernel's checkpatch.pl for our
needs?
I.e. something that checks that the most obvious style issues are addressed
(tabs, trailing whitespaces, spacing around braces etc).

Andres
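
A rough sketch of what such a checker could look like, in Python rather than Perl. It is not derived from checkpatch.pl: the function name, the three rules, and the assumed conventions (unified diff on input, tabs for indentation, opening braces on their own line) are illustrative assumptions only, and the brace rule is a crude heuristic that would also flag array initializers.

```python
import re

# Hypothetical rules for illustration; a real checker would need many more.
TRAILING_WS = re.compile(r"[ \t]+$")
SPACE_INDENT = re.compile(r"^ +\S")
BRACE_AFTER_CODE = re.compile(r"\S\s*\{\s*$")


def check_added_lines(diff_text):
    """Return (diff line number, message) pairs for lines added by a unified diff."""
    problems = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        body = line[1:]
        if TRAILING_WS.search(body):
            problems.append((number, "trailing whitespace"))
        if SPACE_INDENT.match(body):
            problems.append((number, "indented with spaces, expected tabs"))
        if BRACE_AFTER_CODE.search(body):
            problems.append((number, "opening brace not on its own line"))
    return problems


sample = "\n".join([
    "--- a/foo.c",
    "+++ b/foo.c",
    "@@ -1,2 +1,5 @@",
    " int",
    "+main(void) {",
    "+    return 0; ",
    "+\treturn 1;",
    " }",
])

for number, message in check_added_lines(sample):
    print(f"line {number}: {message}")
```

Checking only the added lines keeps the tool from complaining about existing code the patch author did not touch.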

#96Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#90)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> writes:

On Wed, Apr 20, 2011 at 2:53 PM, Andres Freund <andres@anarazel.de> wrote:

On Wednesday, April 20, 2011 08:50:04 PM Tom Lane wrote:

Yeah, maybe. To do that, we'd have to strongly resist the temptation to
spend a lot of time fixing up submitted patches --- if it's not pretty
darn close to committable, back it goes. But that might be a good thing
all around. I find this idea attractive.

Actually as a patch submitter I would somewhat prefer that as well. It's not
exactly easy to learn what wasn't optimal with your patch at times.
On the other hand for some issues it's pretty hard to fix the more involved
issues without e.g. Tom's involvement.

This would amount to reducing the amount of time we spend
in-CommitFest from 50% to slightly less than 25%. That would
certainly be pleasant from my point of view, but for the average patch
to get the same amount of attention, we'd need twice as many
volunteers, or the existing people to volunteer twice as much time, or
everyone to work twice as fast as they already are.

Well, no, that's not the whole story. To me, what the above idea
implies is shifting more of the burden of fixing up patches away from
the committer and back to the patch author. Instead of spending time
fixing up not-quite-ready patches myself, I'd be much more ready to
tell the patch author "do X, Y, and Z, and come back next month".

From the committers' standpoint, this is a great idea precisely because
it suggests we might get to put only 25% and not 50% of our time into
commitfests. But it also makes the work more distributed, and it forces
patch authors to learn the things that committers might otherwise have
done for them silently, which in the long run will make everything work
better.

The key point is that we do have to have much more frequent commitfests.
It's hard to bounce back a patch when you know it will then be delayed
two months, especially if the patch is already two months old and the
author has probably forgotten half of it himself. For me anyway,
"I'll just take half a day and make this look the way I think it should"
is a continual temptation. A shorter CF cycle would weaken the argument
to do that.

I haven't spent any time in the role of a non-committer reviewer, but
I think that the same dynamic might work for reviewers. Basically
what a short cycle would do is encourage people to hit the high points
and turn the review around quickly, dumping the big issues back into the
patch author's lap for fixing. You wouldn't spend time sweating details
until the patch had gotten into a state that justified it. Of course
we'd need to tweak the review guidelines to encourage this sort of
multiple-iterations review approach --- right now the guidelines are
pretty much one-size-fits-all, and this type of approach cannot work
with that.

regards, tom lane

#97Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#96)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 4/20/11 12:35 PM, Tom Lane wrote:

Well, no, that's not the whole story. To me, what the above idea
implies is shifting more of the burden of fixing up patches away from
the committer and back to the patch author. Instead of spending time
fixing up not-quite-ready patches myself, I'd be much more ready to
tell the patch author "do X, Y, and Z, and come back next month".

Yes, definitely! For that matter, booting a patch which got no review
is less of a problem if we're only booting it for 3 weeks.

The whole purpose of the CFs was not to help submitters -- it was to
help reviewers. If we just wanted to help submitters, we'd do
Continuous Integration, and review all the time. But the reviewers need
"time off".

I think we should try this for 9.2. Given the accumulation between then
and now, I think the first CF should be 2 weeks, and then we can move to
monthly/weeklong CFs after that. So it would look like:

CF1: July 16-31
CF2: August 1-7
CF3: September 1-7
CF4: October 1-7
CF5: November 1-7
CF6: December 1-7
CF7: January 3-10
CF8: February until done

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#98Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#94)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert,

Absolutely. And I am perfectly well aware that we have screwed this
up from time to time. But I also know that I have spent a very large
amount of time over the last few years trying to improve things. It
would be nice to know whether that has had any impact. If it hasn't,
then half of what I have spent the last two years doing has been a
waste of time.

That would take pretty significant research; it's not like we have a
database of idea/WIP submissions. It's all e-mail.

Not that it wouldn't be worth doing, but it would be an entire day of
someone's time.

BTW, I do still believe that step 1 is tremendously expanding the "so
you want to submit a patch" documentation, and linking it in many places
so that newbies read it. On my TODO list.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#99Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#89)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 3:05 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 4/20/11 12:00 PM, Robert Haas wrote:

Please provide the evidence that this is a problem that exists now, as
opposed to seven years ago.

Since you've clearly already made up your mind that no problem exists, I
don't have the energy to fight it out with you.

It is not possible for me to work any harder on anything than I have
worked on this problem. I do not deny the existence of the problem.
But I believe that we have greatly mitigated it in the last few
release cycles, and that much of what remains is a problem of
perception, not reality.

You can disagree, but if no one has the energy to find real examples
and talk about them, then it is hard to see how we will be able to
improve the situation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#100Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#84)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Peter Eisentraut <peter_e@gmx.net> writes:

I would imagine one commit fest per month, but
it's only a week long.

BTW, just as a thought experiment: what about a one-day CF once a week?
"Patch Tuesdays", if you will. Spend all day reviewing/committing,
bounce back whatever is not ready, patch authors try again next week.

Really large patches are not going to fit into that paradigm, probably,
but an awful lot of stuff would --- and it might help encourage more
incremental development of the big ones, too.

regards, tom lane

#101Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#100)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 21:54, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I would imagine one commit fest per month, but
it's only a week long.

BTW, just as a thought experiment: what about a one-day CF once a week?
"Patch Tuesdays", if you will.  Spend all day reviewing/committing,
bounce back whatever is not ready, patch authors try again next week.

I think that would pretty much kill the process for any committer who
is not employed to work full-time on postgresql *development*. Those
who have other dayjobs (which may well be postgresql consulting or
training or whatever) will probably end up dealing with significantly
fewer patches, leaving even more of the burden on those who do have
the dedicated schedule. I know I don't do as much reviewing/committing
as I'd like to do during the commitfests, but with a process like
that, it would probably become more or less zero.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#102Andrew Dunstan
andrew@dunslane.net
In reply to: Magnus Hagander (#101)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/20/2011 04:09 PM, Magnus Hagander wrote:

On Wed, Apr 20, 2011 at 21:54, Tom Lane<tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut<peter_e@gmx.net> writes:

I would imagine one commit fest per month, but
it's only a week long.

BTW, just as a thought experiment: what about a one-day CF once a week?
"Patch Tuesdays", if you will. Spend all day reviewing/committing,
bounce back whatever is not ready, patch authors try again next week.

I think that would pretty much kill the process for any committer who
is not employed to work full-time on postgresql *development*. Those
who have other dayjobs (which may well be postgresql consulting or
training or whatever) will probably end up dealing with significantly
fewer patches, leaving even more of the burden on those who do have
the dedicated schedule. I know I don't do as much reviewing/committing
as I'd like to do during the commitfests, but with a process like
that, it would probably become more or less zero.

Yeah, I can't organize my time that way either.

cheers

andrew

#103Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#102)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan <andrew@dunslane.net> writes:

On 04/20/2011 04:09 PM, Magnus Hagander wrote:

On Wed, Apr 20, 2011 at 21:54, Tom Lane<tgl@sss.pgh.pa.us> wrote:

BTW, just as a thought experiment: what about a one-day CF once a week?
"Patch Tuesdays", if you will. Spend all day reviewing/committing,
bounce back whatever is not ready, patch authors try again next week.

I think that would pretty much kill the process for any committer who
is not employed to work full-time on postgresql *development*.

Yeah, I can't organize my time that way either.

True, and any fixed day of the week would let out X number of people
anyway. But ignoring scheduling difficulties, my point here is that
it seems like the shorter the cycle, the better, for a lot of purposes.
Can we do any better than once-a-month, or is that the limit given that
people need flexible schedules within the fest?

regards, tom lane

#104Peter Eisentraut
peter_e@gmx.net
In reply to: Josh Berkus (#85)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 11:39 -0700, Josh Berkus wrote:

Maybe we don't want to use ReviewBoard specifically. Maybe we want
to use bugzilla or Crucible or Redmine, or something more specific for
patch/spec review. But I think it's time to try something else, maybe
several other things.

I had suggested ideatorrent before. But I agree with you in principle.

#105Josh Berkus
josh@agliodbs.com
In reply to: Tom Lane (#103)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Tom,

True, and any fixed day of the week would let out X number of people
anyway. But ignoring scheduling difficulties, my point here is that
it seems like the shorter the cycle, the better, for a lot of purposes.
Can we do any better than once-a-month, or is that the limit given that
people need flexible schedules within the fest?

Also consider that the PostgreSQL development world represents a lot of
different time zones. For me to have some dialog with Tatsuo about a
patch, for example, takes at least 24 hours for a simple back-and-forth.

If we were a full-time development shop in a single time zone, we could
use scrum and do a *daily* integration. Many of my clients do. But for
a highly distributed volunteer-based organization, I don't think it's
practical.

I also find the one-day-a-week attractive. It would make patch review
much more immediate. However, not only would it raise issues with
people's schedules, it would also require us to adopt new tools or
modify the CF code.

Is there anything between one-week-a-month and one-day-a-week?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#106Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#90)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 15:09 -0400, Robert Haas wrote:

This would amount to reducing the amount of time we spend
in-CommitFest from 50% to slightly less than 25%. That would
certainly be pleasant from my point of view, but for the average patch
to get the same amount of attention, we'd need twice as many
volunteers, or the existing people to volunteer twice as much time, or
everyone to work twice as fast as they already are.

I think in reality people don't spend more than 50% of their time during
commit fests on the commit fest. By making the commit fests shorter and
tighter, we could perhaps increase that number. More "quality time" if
you will.
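
For reference, the "50% to slightly less than 25%" figure can be reproduced with a few lines of Python. The day counts are assumptions chosen to match the schedules discussed in the thread (a one-month CommitFest every two months versus one week in every month); they are not measured project data.

```python
def in_fest_fraction(fest_days, cycle_days):
    """Fraction of each cycle that falls inside a CommitFest."""
    if cycle_days <= 0 or not 0 <= fest_days <= cycle_days:
        raise ValueError("need 0 <= fest_days <= cycle_days and cycle_days > 0")
    return fest_days / cycle_days


# Assumed schedules, in days.
current = in_fest_fraction(30, 60)   # one month on, one month off
proposed = in_fest_fraction(7, 30)   # one week in every month

print(f"current:  {current:.0%}")
print(f"proposed: {proposed:.1%}")
# Factor by which in-fest calendar time shrinks, which is where the
# "twice as many volunteers" estimate comes from.
print(f"ratio:    {current / proposed:.2f}x")
```

Under these assumptions the proposed schedule spends about 23% of calendar time in-fest, so the same review attention per patch needs a little over twice the effort per in-fest day, unless more of each in-fest day actually goes to the fest.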

#107Joshua D. Drake
jd@commandprompt.com
In reply to: Robert Haas (#94)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/20/2011 12:22 PM, Robert Haas wrote:

Well, you aren't fighting alone. We have significant problems in this area.
As you said, we always have. There is also a bizarre, almost insane
objection to using tools that "aren't invented here" to solve problems. The
problems you (Josh) present are real, regardless of Robert's opinion. The
thing that is important for everyone to remember is PERCEPTION IS REALITY.

If people PERCEIVE there is a problem, THERE IS A PROBLEM.

Absolutely. And I am perfectly well aware that we have screwed this
up from time to time. But I also know that I have spent a very large
amount of time over the last few years trying to improve things. It
would be nice to know whether that has had any impact. If it hasn't,
then half of what I have spent the last two years doing has been a
waste of time.

I don't think anyone would argue that your efforts have not improved the
situation. I certainly wouldn't. However, the perception (and reality of
the problem) definitely still applies. I wouldn't suggest that you stop
what you are doing but that doesn't mean the problem or variances of the
problem don't still exist and need to be addressed.

Sincerely,

jD

--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
Organizers of the PostgreSQL Conference - http://www.postgresqlconference.org/
@cmdpromptinc - @postgresconf - 509-416-6579

#108Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#103)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 16:25 -0400, Tom Lane wrote:

But ignoring scheduling difficulties, my point here is that
it seems like the shorter the cycle, the better, for a lot of
purposes. Can we do any better than once-a-month, or is that the
limit given that people need flexible schedules within the fest?

If you want to keep the basic idea of predictable periods of activity
and rest, I think that's as far as you can go.

I'm personally not terribly tied to that; I'm more interested in the
tool support that the CF gives us. I might also like, for example, just
a permanent patch queue with patches sorted by date. Multiple
approaches like that could also very well exist in parallel, even within
the existing commitfest application framework.

#109Alvaro Herrera
alvherre@commandprompt.com
In reply to: Robert Haas (#94)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Excerpts from Robert Haas's message of mié abr 20 16:22:24 -0300 2011:

If people PERCEIVE there is a problem, THERE IS A PROBLEM.

Absolutely. And I am perfectly well aware that we have screwed this
up from time to time. But I also know that I have spent a very large
amount of time over the last few years trying to improve things. It
would be nice to know whether that has had any impact. If it hasn't,
then half of what I have spent the last two years doing has been a
waste of time.

It may very well be fixed, but if the guys doing the submission (or,
more precisely failing to do it) don't know that things have changed,
they will continue to avoid submitting stuff.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#110Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#97)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 3:42 PM, Josh Berkus <josh@agliodbs.com> wrote:

On 4/20/11 12:35 PM, Tom Lane wrote:

Well, no, that's not the whole story.  To me, what the above idea
implies is shifting more of the burden of fixing up patches away from
the committer and back to the patch author.  Instead of spending time
fixing up not-quite-ready patches myself, I'd be much more ready to
tell the patch author "do X, Y, and Z, and come back next month".

Yes, definitely!  For that matter, booting a patch which got no review
is less of a problem if we're only booting it for 3 weeks.

The whole purpose of the CFs was not to help submitters -- it was to
help reviewers.   If we just wanted to help submitters, we'd do
Continuous Integration, and review all the time.  But the reviewers need
"time off".

I think we should try this for 9.2.  Given the accumulation between then
and now, I think the first CF should be 2 weeks, and then we can move to
monthly/weeklong CFs after that.  So it would look like:

CF1: July 16-31
CF2: August 1-7
CF3: September 1-7
CF4: October 1-7
CF5: November 1-7
CF6: December 1-7
CF7: January 3-10
CF8: February until done

I am concerned that this will get us back into the land of the
interminable last CommitFest. I believe that one of the reasons why
things didn't go as smoothly before we had the CommitFest was because
patches didn't get dealt with until the end of the cycle. I think
that if, as proposed, we are faster about pushing patches back on the
submitters when they're not up to snuff, then we will end up having
more stuff bounce along for many CommitFests without actually getting
committed, which will tend to exacerbate the pile-up at the end of the
cycle. The basic underlying problem here is that there is tremendous
reluctance to boot anything when it means pushing it out to the next
release, and I think that's just terrible project management. If we
had punted collations and sync rep to 9.2, we would be on beta2 right
now, instead of still trying to get things squared away for beta1. If
we allow people to submit patches up until supposed feature freeze - 7
days instead of proposed feature freeze - 31 days, that's not going to
help.

Now, maybe if we branched the tree immediately after the last CF of
the release and continued having week-long CFs, we might be able to
make it work. Then, at least if you didn't get your stuff committed
to the right release, you could still get it committed somewhere. But
even then I think we'd have this problem of people being unwilling to
give up on jamming stuff into a release, regardless of the scheduling
impact of doing so. I actually think the problem of getting releases
out on time is a *much* bigger problem for us than how long or short
CommitFests are.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#111Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#109)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 7:00 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:

Excerpts from Robert Haas's message of mié abr 20 16:22:24 -0300 2011:

If people PERCEIVE there is a problem, THERE IS A PROBLEM.

Absolutely.  And I am perfectly well aware that we have screwed this
up from time to time.  But I also know that I have spent a very large
amount of time over the last few years trying to improve things.  It
would be nice to know whether that has had any impact.  If it hasn't,
then half of what I have spent the last two years doing has been a
waste of time.

It may very well be fixed, but if the guys doing the submission (or,
more precisely failing to do it) don't know that things have changed,
they will continue to avoid submitting stuff.

Yep.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#112Peter Eisentraut
peter_e@gmx.net
In reply to: Robert Haas (#110)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, 2011-04-20 at 21:09 -0400, Robert Haas wrote:

But
even then I think we'd have this problem of people being unwilling to
give up on jamming stuff into a release, regardless of the scheduling
impact of doing so. I actually think the problem of getting releases
out on time is a *much* bigger problem for us than how long or short
CommitFests are.

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months. Otherwise, the current 12
to 14 month horizon is just too long psychologically.

#113Noname
tomas@tuxteam.de
In reply to: Josh Berkus (#85)
Re: Formatting Curmudgeons WAS: MMAP Buffers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Apr 20, 2011 at 11:39:47AM -0700, Josh Berkus wrote:

[...]

Review of design concepts and WIP patches has *always* been a problem
for this project [...]

We tell people to submit a design concept, but then such submissions are
often ignored. When they're not ignored, they often are subject to
either extreme bikeshedding or a lot of negativity around things the
author hasn't implemented yet ... even if the author warns that they're
not implemented.

I'm not a committer. So take this data point for what it's worth. But I
have been following this list for quite a while, and I must say: I (very
respectfully!) disagree. Having watched mailing lists for other
projects, the quality of the answers one gets here is outstanding. The
tone might be sometimes a bit tight (but never disrespectful or
flaming), but seriously: what do I get out of a friendly answer if there is
no content?

The same goes for -GENERAL. I've always got answers to my (sometimes, in
hindsight quite stupid) questions which actually *helped* to solve my
problem.

It's OK to strive to improve the process, but I think you all are quite
good.

[...]

So in the spirit of NOT reinventing the wheel: ReviewBoard. Yes,
really [...]
[...] But I think it's time to try something else, maybe
several other things.

Maybe. But I *do* understand the unwillingness to change that. I've
contributed (tiny) patches to more than one project, and it's
frustrating to fight the bug-tracker-du-jour system. This one won't talk
to me unless my browser talks Javascript. That one... (you get the
idea). I strongly appreciate the free-flowing mailing list style here
(maybe it's just an age problem ;-)

Regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFNr9IhBcgs9XrR2kYRAkw+AJoDFJcnpR06VpGNVAzsbx/eZpQcxACfUv//
vFsZsPiYlM78fxsjCLQvbHw=
=A+7H
-----END PGP SIGNATURE-----

#114Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#100)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Wed, Apr 20, 2011 at 8:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

I would imagine one commit fest per month, but
it's only a week long.

BTW, just as a thought experiment: what about a one-day CF once a week?
"Patch Tuesdays", if you will.  Spend all day reviewing/committing,
bounce back whatever is not ready, patch authors try again next week.

Really large patches are not going to fit into that paradigm, probably,
but an awful lot of stuff would --- and it might help encourage more
incremental development of the big ones, too.

I'm responding to this post with mostly general comments, not directed
specifically at Tom.

Speeding up the process means that people with more time get a bigger
say and people with less time get a smaller input than before. I'm
already concerned that the gap between patch submission and patch
commit is so short it effectively means feedback is impossible.

The more frequently we do integration, the greater proportion of our
time is spent doing that.

My concern is there are a relatively low number of people working on
features that lots of people care about. Senior time should not be
wasted on endless integration.

We should be encouraging people to spend more time on more useful
features, not an endless stream of trivial patches, integration and
release processes. None of our users give a flying, err, squirrel,
about our small patch review process. Especially when it's absolutely
brilliant already.

My model of contributing to this project has always been to spend time
with customers, understanding solutions and problems, then bringing
that back to the community. That has brought both the funding to allow
me to contribute and a stream of ideas with a clear focus. I encourage
others to do the same. I don't think we should be working on an
interrupt driven model, we should be planning our contributions and
making sure we make the biggest impact possible with real code, not
just twittering about it constantly. If we spend too much time with
each other we will be exactly like the larger commercial development
groups who never meet users only each other. Even the General list
isn't fully representative of the actual/potential user base.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#115Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Peter Eisentraut (#112)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Peter Eisentraut <peter_e@gmx.net> wrote:

you need to think about shorter release cycles overall, like every
6 months.

With the current time between feature freeze and release, that
wouldn't leave a lot of time for development.

-Kevin

#116Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#114)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, 2011-04-21 at 14:01 +0100, Simon Riggs wrote:

We should be encouraging people to spend more time on more useful
features, not an endless stream of trivial patches, integration and
release processes.

Hence the proposal to cut that time down and make it count better.

Which direction were you thinking?

#117Peter Eisentraut
peter_e@gmx.net
In reply to: Kevin Grittner (#115)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, 2011-04-21 at 08:42 -0500, Kevin Grittner wrote:

you need to think about shorter release cycles overall, like every
6 months.

With the current time between feature freeze and release, that
wouldn't leave a lot of time for development.

Presumably, one would aim to cut all the other things in half as well.

#118Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#112)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

On Wed, 2011-04-20 at 21:09 -0400, Robert Haas wrote:

But
even then I think we'd have this problem of people being unwilling to
give up on jamming stuff into a release, regardless of the scheduling
impact of doing so.  I actually think the problem of getting releases
out on time is a *much* bigger problem for us than how long or short
CommitFests are.

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months.  Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree. I am in favor of a shorter release cycle. But I think that
a shorter release cycle won't work well if there is still a four-month-long
integration period at the end of each series of CommitFests. The
problem is a bit circular here: because release cycles are long,
people really, really want to slip as much as possible in at the end.
But being under time pressure to get things committed results in a
higher bug count, which means more things that have to be fixed after
feature freeze, which translates into a long release cycle.

I think that it's not too bad if the process of a release getting out
the door results in effectively missing one CommitFest. For example,
if we imagine one-month CommitFests starting every two months, and we
had a CommitFest starting on January 15th, it wouldn't be too painful
if we skipped a hypothetical March 15th CommitFest to get the release
done, and then started up the process again on May 15th. However, in
practice, what happens is we miss *two* CommitFests: the expectation
is that the next CommitFest will be on the order of July 15th, which
is just too long. Similarly, if we did shorter CommitFests and
shorter releases - say, five one-week-a-month CommitFests in July,
August, September, October, and November, I'd want to kick a release
out in December and reopen for development in January, not get stuck
with the same six-month feature freeze we have now, or even a
four-month feature freeze. But that isn't going to work if people do
the same sort of throwing everything into the kitchen sink at the last
minute that we have been doing for at least the last couple of
releases.

In fact, I don't believe that the current CF cycle really forces a
huge amount of waiting-for-feedback. It's true that if you submit a
patch at a randomly chosen time, you will have to wait up to two
months for a CommitFest to start, and then you might not get a review
until late in the CommitFest, so it could take you up to three months
to get a review. In practice, patches are not submitted at random
times - in fact, probably 50% of the patches come in during the last
week before the CF starts, and typically perhaps 50% of the patches
get a review in the first week, and maybe 80% within the first two
weeks. Some patches also get an initial review between CommitFests,
which further improves the average. Overall, I bet the average time
between patch submission and first review is <3 weeks. You can
typically get 2 or 3 followup reviews during the same cycle with only
a few days latency for each. Even though it would be nice to do
better, for an all-volunteer project, I think it's respectable. I
can't say the same thing about our process for getting from feature
freeze to release. It's really long, and it's nearly all fixing bugs
in code that was committed in the last CF, and the last CF produces
exponentially more bugs than the earlier ones, and it's often the case
that people don't fix their own bugs and someone else has to jump in
to pick up the slack. Meanwhile, the regular flow of reviewing and
committing patches is completely disrupted; and once in a while
someone gets flamed for so much as bringing up a new feature that
they're interested in working on for the next release (which I think
is totally unwarranted; now is the PERFECT time to begin roughing out
plans for 9.2 work... but I digress).

So while I'm mildly interested in the idea of shifting the CF cycle
around to provide more timely review, I can't really get that excited
about it, especially if there's any risk that we are just shifting
more of the work from the CommitFest cycle to the
end-of-release-interminable-integration-period. However, if there's
some way of avoiding the phenomenon where all hell breaks loose
because people jam four major new features into the tree in as many
weeks, sign me up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#119Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#118)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months. Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree. I am in favor of a shorter release cycle.

I'm not. I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year. It's hard enough
to get people to migrate that often.

Another problem is that if you halve the release interval, you either
double the amount of work spent on maintaining back branches, or halve
the support lifetime of a branch. Neither of those is attractive.
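
The trade-off can be put into rough numbers. The following is only a sketch: it assumes releases arrive at a fixed interval and every branch is supported for the same fixed window, and it uses the 5-year support window mentioned elsewhere in this thread. The function name is hypothetical.

```python
def supported_branches(support_years: float, release_interval_years: float) -> float:
    """Steady-state number of major branches under maintenance at once.

    Assumes one major release per interval and an identical support
    window for every branch (both are simplifying assumptions).
    """
    return support_years / release_interval_years


# One major release per year, 5-year support window.
yearly = supported_branches(5, 1.0)

# Option 1: halve the interval, keep the 5-year window.
half_interval = supported_branches(5, 0.5)

# Option 2: halve the interval, keep the branch count constant.
# The support lifetime then has to shrink to branches * interval.
shortened_lifetime_years = yearly * 0.5

print(yearly, half_interval, shortened_lifetime_years)
```

Under these assumptions the branch count goes from 5 to 10 if the support window is kept, or the window drops from 5 years to 2.5 if the branch count is kept.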

Now, it certainly would be nice to spend less time in beta mode as
opposed to development, and I think most of the points being made here
are really about how to cut that. But reducing the release interval is
not going to reduce the total amount of time we spend in beta mode;
in fact I'd expect it to increase. Halving the amount of development
time per release doesn't mean that you can cut beta time proportionally.
It just takes time to cut a release, and time for testers to try it.

regards, tom lane

#120Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#119)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/21/2011 11:16 AM, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months. Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree. I am in favor of a shorter release cycle.

I'm not. I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year. It's hard enough
to get people to migrate that often.

I agree.

Another problem is that if you halve the release interval, you either
double the amount of work spent on maintaining back branches, or halve
the support lifetime of a branch. Neither of those is attractive.

I *really* *really* agree.

cheers

andrew

#121Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Tom Lane (#119)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months. Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree. I am in favor of a shorter release cycle.

I'm not. I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year. It's hard enough
to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more end
users to start skipping releases. Some already do - it's common not to
upgrade unless there's a feature you really need, but for those who do
stay on the 'current' upgrade path, you'll lose some who can't afford to
spend more than one integration-testing round a year.

Ross
--
Ross Reedstrom, Ph.D. reedstrm@rice.edu
Systems Engineer & Admin, Research Scientist phone: 713-348-6166
Connexions http://cnx.org fax: 713-348-3665
Rice University MS-375, Houston, TX 77005
GPG Key fingerprint = F023 82C8 9B0E 2CC6 0D8E F888 D3AE 810E 88F0 BEDE

#122Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#119)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:16 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months.  Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree.  I am in favor of a shorter release cycle.

I'm not.  I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year.  It's hard enough
to get people to migrate that often.

I agree there's probably little user demand, and back-branch
maintenance is an issue, but I think if it removed the temptation to
cram major new features into the tree at the last minute, it might be
worth it. However, a possibly more likely outcome is that we'd still
have that temptation, just more frequently; and end up with even less
of the year open to new patches than is currently the case.

Another problem is that if you halve the release interval, you either
double the amount of work spent on maintaining back branches, or halve
the support lifetime of a branch.  Neither of those is attractive.

Now, it certainly would be nice to spend less time in beta mode as
opposed to development, and I think most of the points being made here
are really about how to cut that.  But reducing the release interval is
not going to reduce the total amount of time we spend in beta mode;
in fact I'd expect it to increase.  Halving the amount of development
time per release doesn't mean that you can cut beta time proportionally.
It just takes time to cut a release, and time for testers to try it.

I believe that the problem is much more related to the fact that we
commit things at the end of the cycle that aren't really done than it
is to the amount of time beta testers need to try things. If we were
only waiting on testing, we could branch the tree and call the release
du jour beta for another N months, then release, meanwhile continuing
development. In fact, you and I and three or four other people have
spent most of our visible PG time over the last 2 months fixing MANY
bugs, mostly in the six or so major features committed between
February 7th and March 6th. (By way of comparison, notice how few
bugs there have been in the major patches from CF3 - because those
things were actually pretty much working *when they were committed*.)

Now, we're getting to the point where that might actually be a
reasonable way to go. It wouldn't bother me a bit to branch the tree
just after beta1 and start a new cycle of CommitFests on May 15th, and
we could begin integrating some of the big stuff that didn't make it
into 9.1: key locks, range types, additional sync rep modes, snapshot
cloning, parallel pg_dump, etc. It would be great to start working on
that stuff while it's still mildly fresh in people's minds, and at the
*beginning* of the release cycle. We're probably doomed to another
fall release at this point anyway, so it's not clear to me that the
inevitable loss of focus that will ensue is really costing anything.
Had we gotten to beta1 on March 1st, I'd probably be in favor of going
all in to get the release out in June or maybe on July 1, but at this
point that seems unlikely to be realistic.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#123Robert Haas
robertmhaas@gmail.com
In reply to: Ross J. Reedstrom (#121)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:37 AM, Ross J. Reedstrom <reedstrm@rice.edu> wrote:

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 21, 2011 at 2:43 AM, Peter Eisentraut <peter_e@gmx.net> wrote:

I think to really address that problem, you need to think about shorter
release cycles overall, like every 6 months.  Otherwise, the current 12
to 14 month horizon is just too long psychologically.

I agree.  I am in favor of a shorter release cycle.

I'm not.  I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year.  It's hard enough
to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more end
users to start skipping releases. Some already do - it's common not to
upgrade unless there's a feature you really need, but for those who do
stay on the 'current' upgrade path, you'll lose some who can't afford to
spend more than one integration-testing round a year.

Well, that aspect of the problem doesn't bother me, much. I don't
really care whether people upgrade to each new release the moment it
comes out anyway. It would require us to keep any
backward-compatibility hacks around for more releases, but we're
pretty good about that anyway. 8.3 broke the world, but the last few
releases have been pretty smooth for most people, I think.

Not to say that there aren't OTHER problems with the idea...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#124Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#118)
Re: Formatting Curmudgeons WAS: MMAP Buffers

[ another thought on this topic ]

Robert Haas <robertmhaas@gmail.com> writes:

I think that it's not too bad if the process of a release getting out
the door results in effectively missing one CommitFest. ...
But that isn't going to work if people do
the same sort of throwing everything into the kitchen sink at the last
minute that we have been doing for at least the last couple of
releases.

In fact, I don't believe that the current CF cycle really forces a
huge amount of waiting-for-feedback. It's true that if you submit a
patch at a randomly chosen time, you will have to wait up to two
months for a CommitFest to start, and then you might not get a review
until late in the CommitFest, so it could take you up to three months
to get a review. In practice, patches are not submitted at random
times - in fact, probably 50% of the patches come in during the last
week before the CF starts, and typically perhaps 50% of the patches
get a review in the first week, and maybe 80% within the first two
weeks.

But aren't those two sides of the same coin, ie, people's natural
tendency to work to a deadline? If you approve of a lot of patches
showing up just in time for a commitfest, why don't you approve of
big patches showing up just in time for a release? I mean, I've been
heard to complain about that too, but complaining hasn't changed
anyone's behavior and it's foolish to expect that it will in the
future. (See insanity, definition of.)

We need to find a way to work with that behavior, not try to change it.
I don't know what exactly.

One idea that comes to mind is to give up on the linear development-mode-
then-beta-mode management model, ie, allow development of release N+1
to start while beta is still going on for release N. The principal
objection to this in the past has been that the PG development community
is too small to do more than one thing at once, but maybe that's not
true anymore. The thing I'd be most worried about is how we get enough
energy directed at the release-stabilization part of the work, when for
most developers the new-development part is much more interesting/fun.
But we have that problem in some form already --- it's not clear to me
how much of the community really engages in what happens during beta,
rather than quietly working on stuff for the next release.

regards, tom lane

#125Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#123)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thursday, April 21, 2011 05:43:16 PM Robert Haas wrote:

On Thu, Apr 21, 2011 at 11:37 AM, Ross J. Reedstrom <reedstrm@rice.edu> wrote:

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I agree. I am in favor of a shorter release cycle.

I'm not. I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year. It's hard enough
to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more end
users to start skipping releases. Some already do - it's common not to
upgrade unless there's a feature you really need, but for those who do
stay on the 'current' upgrade path, you'll lose some who can't afford to
spend more than one integration-testing round a year.

Well, that aspect of the problem doesn't bother me, much. I don't
really care whether people upgrade to each new release the moment it
comes out anyway.
Not to say that there aren't OTHER problems with the idea...

One could argue that it's causing bad PR for postgres. I have seen several
parties planning to migrate away or not migrate to postgres because of
performance evaluations they made. With 7.4, 8.0 and 8.2. In 2010.

Andres

#126Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#124)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:46 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

But aren't those two sides of the same coin, ie, people's natural
tendency to work to a deadline?  If you approve of a lot of patches
showing up just in time for a commitfest, why don't you approve of
big patches showing up just in time for a release?  I mean, I've been
heard to complain about that too, but complaining hasn't changed
anyone's behavior and it's foolish to expect that it will in the
future.  (See insanity, definition of.)

Well, I guess I approve of the first behavior because it doesn't feel
like having a red-hot iron spike driven through my foot, and I
disapprove of the second one because it does. That may not be
entirely consistent taken in the abstract, but it has some solid
practical roots.

We need to find a way to work with that behavior, not try to change it.
I don't know what exactly.

One idea that comes to mind is to give up on the linear development-mode-
then-beta-mode management model, ie, allow development of release N+1
to start while beta is still going on for release N.  The principal
objection to this in the past has been that the PG development community
is too small to do more than one thing at once, but maybe that's not
true anymore.  The thing I'd be most worried about is how we get enough
energy directed at the release-stabilization part of the work, when for
most developers the new-development part is much more interesting/fun.
But we have that problem in some form already --- it's not clear to me
how much of the community really engages in what happens during beta,
rather than quietly working on stuff for the next release.

I totally agree. In fact, I think that trying to close off that
activity is one of the most self-destructive things we could possibly
do. It makes missing the release far more painful if you're thinking
about not only a 12-month slip on GA but also a 6-month slip on any
meaningful further review. Encouraging people to hold off major
proposals for the next release while we are focusing on beta also
tends to slow them down, which then exacerbates the pile-up at the end
of the release cycle. I would like to blow the doors on that wide
open and encourage people to start submitting design proposals for 9.2
NOW. NOW, NOW, NOW! Not in July! And *really* not next January!
And frankly, the sooner we can realistically start working on
integrating the code that has *already* been written for 9.2, the
better. The patches are going to land on us at some point, and
dealing with them earlier will allow those people to move on to other
things (which is good), reduce the pile-up at the end of the cycle
(even better), or possibly both.

I'm willing to make a serious commitment to being involved in the
release stabilization work and to give it some degree of priority over
new patches, if that's what it takes to make the process work
smoothly. We are fundamentally resource-constrained, and no process
is going to change that unless the process change, of itself, causes
more people to contribute more time. But even if the first CommitFest
involves a slightly higher bounce rate due to lack of
reviewer/committer bandwidth, it's still better than not having one.
There have been maybe half a dozen people who have been principally
responsible for the stabilization that we have done since CF4, and the
community is much larger than that. Everyone else is either doing
nothing (which is bad), or working without on-list discussion (which
is also bad). Even the people who are deeply committed to release
stabilization would probably be happier and more motivated to continue
contributing if they weren't being limited to ONLY that.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#127Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#125)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote:

On Thursday, April 21, 2011 05:43:16 PM Robert Haas wrote:

On Thu, Apr 21, 2011 at 11:37 AM, Ross J. Reedstrom <reedstrm@rice.edu> wrote:

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I agree.  I am in favor of a shorter release cycle.

I'm not.  I don't think there is any demand among *users* (as opposed to
developers) for more than one major PG release a year.  It's hard enough
to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more end
users to start skipping releases. Some already do - it's common not to
upgrade unless there's a feature you really need, but for those who do
stay on the 'current' upgrade path, you'll lose some who can't afford to
spend more than one integration-testing round a year.

Well, that aspect of the problem doesn't bother me, much.  I don't
really care whether people upgrade to each new release the moment it
comes out anyway.
Not to say that there aren't OTHER problems with the idea...

One could argue that it's causing bad PR for postgres. I have seen several
parties planning to migrate away or not migrate to postgres because of
performance evaluations they made. With 7.4, 8.0 and 8.2. In 2010.

That's certainly true. It's clearly insane to benchmark with anything
other than the latest major release - on any product - if you want to
have any pretense of fairness. However, for users who have
applications that work and perform acceptably, I don't think it
benefits us to be too aggressive in trying to get them onto a later
major release. If we wanted to do that, we could maintain
back-branches for two years instead of five, but I don't think that
would be doing anyone any favors.

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small. I guess the brunt of the work would
actually fall on the packagers. It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#128Christopher Browne
cbbrowne@gmail.com
In reply to: Andres Freund (#125)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote:

One could argue that it's causing bad PR for postgres. I have seen several
parties planning to migrate away or not migrate to postgres because of
performance evaluations they made. With 7.4, 8.0 and 8.2. In 2010.

Well, evaluating based on things past that can't be changed in the
absence of time machines doesn't offer us much guidance, as there
isn't anything that can be done in the present to fix such.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

#129Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#127)
EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

[ man, this thread has totally outlived its title, could we change that?
I'll start with this subtopic ]

Robert Haas <robertmhaas@gmail.com> writes:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small. I guess the brunt of the work would
actually fall on the packagers. It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

I agree that the incremental effort would not be so large, but what
makes you think that the situation will change given another year?
My expectation is that'd just mean people will do nothing about
migrating for a year longer.

More generally: it took a lot of argument to establish the current EOL
policy, and bending it the first time anyone feels any actual pain
will pretty much destroy the whole concept.

regards, tom lane

#130Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#127)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thursday, April 21, 2011 06:39:44 PM Robert Haas wrote:

On Thu, Apr 21, 2011 at 11:48 AM, Andres Freund <andres@anarazel.de> wrote:

On Thursday, April 21, 2011 05:43:16 PM Robert Haas wrote:

On Thu, Apr 21, 2011 at 11:37 AM, Ross J. Reedstrom <reedstrm@rice.edu> wrote:

On Thu, Apr 21, 2011 at 11:16:45AM -0400, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I agree. I am in favor of a shorter release cycle.

I'm not. I don't think there is any demand among *users* (as opposed
to developers) for more than one major PG release a year. It's hard
enough to get people to migrate that often.

In fact, I predict that the observed behavior would be for even more
end users to start skipping releases. Some already do - it's common
not to upgrade unless there's a feature you really need, but for
those who do stay on the 'current' upgrade path, you'll lose some who
can't afford to spend more than one integration-testing round a year.

Well, that aspect of the problem doesn't bother me, much. I don't
really care whether people upgrade to each new release the moment it
comes out anyway.
Not to say that there aren't OTHER problems with the idea...

One could argue that it's causing bad PR for postgres. I have seen several
parties planning to migrate away or not migrate to postgres because of
performance evaluations they made. With 7.4, 8.0 and 8.2. In 2010.

That's certainly true. It's clearly insane to benchmark with anything
other than the latest major release - on any product - if you want to
have any pretense of fairness.

The usual argument against that is that $version is the only one available on
$platform in version $version...

And I doubt that a higher number of new pg versions will lead to more
supported releases in distributions...

Andres

#131Dave Page
dpage@pgadmin.org
In reply to: Tom Lane (#129)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, Apr 21, 2011 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ man, this thread has totally outlived its title, could we change that?
 I'll start with this subtopic ]

Robert Haas <robertmhaas@gmail.com> writes:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small.  I guess the brunt of the work would
actually fall on the packagers.  It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

I agree that the incremental effort would not be so large, but what
makes you think that the situation will change given another year?
My expectation is that'd just mean people will do nothing about
migrating for a year longer.

More generally: it took a lot of argument to establish the current EOL
policy, and bending it the first time anyone feels any actual pain
will pretty much destroy the whole concept.

It would also make at least one packager very unhappy as the 8.2
Windows build is by far the hardest and most time consuming to do and
I happen to know he's been counting the days until it goes.

More generally, keeping it for longer means we might end up supporting
6 major releases at once. That may not be so much work on a day to day
basis, but it adds up to a lot at release times, which was one of the
reasons why we agreed on the 5 year support window.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#132Christopher Browne
cbbrowne@gmail.com
In reply to: Robert Haas (#118)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, Apr 21, 2011 at 11:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:

I agree.  I am in favor of a shorter release cycle.  But I think that
a shorter release cycle won't work well if there is still a four-month
integration period at the end of each series of CommitFests.  The
problem is a bit circular here: because release cycles are long,
people really, really want to slip as much as possible in at the end.
But being under time pressure to get things committed results in a
higher bug count, which means more things that have to be fixed after
feature freeze, which translates into a long release cycle.

If we somehow were able to come up with a 6 week release cycle, we'd
still have the problem that there are features that take more than 6
weeks to integrate into a release. (HOT and SyncRep, I'm looking at
you!) Any such larger features would "blow this up," quite forcibly.

I don't think our release cycle is vastly too long; it takes enough
time to plan upgrades for systems that my colleagues at Afilias aren't
keen on using every PG release in production that comes out as it
stands now.

Peter Eisentraut points out that with the way things are, now, "...
you are left with all of about 20 days per year for discussion,
collaborative planning and coding. Which is obviously silly, which is
why the process breaks down."

I think the CommitFests have been a *super* tool for addressing such
problems as:
- patches getting lost
- getting review effort put onto the easier patches

But they aren't the only thing we conceptually need to have. For
tougher features, they're not great. And they're completely useless
at addressing discussions surrounding things we know we want done, but
don't have a strategy for yet. Those things aren't "patches", there's
nothing yet to commit.

My sense is that something else is needed as a process to help with
those "nebulous large changes." I'm not sure quite what it looks
like. Maybe there's some tooling that would be helpful, but we really
need some experimentation to figure out what the process should look
like.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

#133Josh Berkus
josh@agliodbs.com
In reply to: Dave Page (#131)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

All,

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small. I guess the brunt of the work would
actually fall on the packagers. It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

Better that someone should just focus on whipping Robert's (or was it
Greg's?) replace-the-missing-casts package into shape as an extension.

I'm sure some kind of corporate sponsorship would be available for this
if someone wanted to work on it. Enough companies are facing this as
upgrade pain to want to fix it. If someone wants to work on it, let me
know; I'll start a fundraising campaign.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

In reply to: Dave Page (#131)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, Apr 21, 2011 at 06:04:09PM +0100, Dave Page wrote:

On Thu, Apr 21, 2011 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ man, this thread has totally outlived its title, could we change that?
I'll start with this subtopic ]

Robert Haas <robertmhaas@gmail.com> writes:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small. I guess the brunt of the work would
actually fall on the packagers. It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

I agree that the incremental effort would not be so large, but what
makes you think that the situation will change given another year?
My expectation is that'd just mean people will do nothing about
migrating for a year longer.

More generally: it took a lot of argument to establish the current EOL
policy, and bending it the first time anyone feels any actual pain
will pretty much destroy the whole concept.

It would also make at least one packager very unhappy as the 8.2
Windows build is by far the hardest and most time consuming to do and
I happen to know he's been counting the days until it goes.

More generally, keeping it for longer means we might end up supporting
6 major releases at once. That may not be so much work on a day to day
basis, but it adds up to a lot at release times, which was one of the
reasons why we agreed on the 5 year support window.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

+1 for cutting the cord on 8.2. People still using it will need
to use the last release available, upgrade, or consult to have
a back-port/build made.

Regards,
Ken

#135Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#129)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, Apr 21, 2011 at 12:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

[ man, this thread has totally outlived its title, could we change that?
 I'll start with this subtopic ]

Robert Haas <robertmhaas@gmail.com> writes:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small.  I guess the brunt of the work would
actually fall on the packagers.  It looks like we've done 5 point
releases of 8.2.x in the last year, so presumably if we did decide to
extend the EOL date by a year or so that's about how much incremental
effort would be needed.

I agree that the incremental effort would not be so large, but what
makes you think that the situation will change given another year?
My expectation is that'd just mean people will do nothing about
migrating for a year longer.

More generally: it took a lot of argument to establish the current EOL
policy, and bending it the first time anyone feels any actual pain
will pretty much destroy the whole concept.

I don't think that's quite a fair description of the proposal. I
don't think that having a general policy about EOL should preclude us
from making exceptions when there is some particularly compelling
reason to do so, and "it's particularly difficult to upgrade to
release X+1" seems to me to be something that might merit a bit of
consideration in that area. It is hard to imagine that 8.3, 8.4, 9.0,
or 9.1 could justify special treatment on similar grounds, nor did
7.4, 8.0, or 8.1, which we recently retired under this policy.

However, I can see that I'm way, way in the minority on this one, so
never mind! It was just a thought...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#136Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Page (#131)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

Dave Page <dpage@pgadmin.org> writes:

On Thu, Apr 21, 2011 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I agree that the incremental effort would not be so large, but what
makes you think that the situation will change given another year?

It would also make at least one packager very unhappy as the 8.2
Windows build is by far the hardest and most time consuming to do and
I happen to know he's been counting the days until it goes.

Well, if we did extend support for 8.2, we could specifically exclude
Windows. But I'm still unclear on what would really be accomplished
by extending support for it. Sooner or later we have to get people
to migrate up from it, and I see no reason to think that supporting
it for just a year more will change anything.

regards, tom lane

#137Tom Lane
tgl@sss.pgh.pa.us
In reply to: Josh Berkus (#133)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

Josh Berkus <josh@agliodbs.com> writes:

Better that someone should just focus on whipping Robert's (or was it
Greg's?) replace-the-missing-casts package into shape as an extension.

I think Peter originated that, actually. My recollection is that there
didn't seem to be any way to extend it to a complete solution, and
besides which it's really a crutch to avoid fixing bugs in your
application. Still, if someone does want to expend more work on it
I wouldn't object.

regards, tom lane

#138Jaime Casanova
jaime@2ndquadrant.com
In reply to: Tom Lane (#136)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, Apr 21, 2011 at 12:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

But I'm still unclear on what would really be accomplished
by extending support for it.  Sooner or later we have to get people
to migrate up from it, and I see no reason to think that supporting
it for just a year more will change anything.

And people are more likely to migrate if they see some kind of hard
line, especially when migrating means a lot of work.
Actually, someone I know is aiming to migrate before the EOL, just
because the EOL exists.

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training

#139Greg Stark
gsstark@mit.edu
In reply to: Robert Haas (#135)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, Apr 21, 2011 at 6:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:

However, I can see that I'm way, way in the minority on this one, so
never mind!  It was just a thought...

Fwiw I would have agreed with you on the basic question. Just because
we've said that users can count on N years of support doesn't mean
there's anything binding us to *not* support things for N+x years. The
argument that we should refuse to back-patch security fixes and
bug fixes that we could handle without much effort to versions that
users are using just because we think we know better than them and
know they should upgrade is a bad path imho.

However your theory was all predicated on the idea that supporting 8.2
was not much incremental effort and Dave said that's not true so this
is all moot. Doing it Windows-excluded seems not worth the effort ---
unless... what version of Postgres was shipped in the last supported
releases of major distributions? I think it was 8.1 in Ubuntu Hardy
and 8.4 in Ubuntu Lucid so that's irrelevant. What about Redhat and
Debian?

--
greg

#140Tom Lane
tgl@sss.pgh.pa.us
In reply to: Greg Stark (#139)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

Greg Stark <gsstark@mit.edu> writes:

Fwiw I would have agreed with you on the basic question. Just because
we've said that users can count on N years of support doesn't mean
there's anything binding us to *not* support things for N+x years.

Certainly. The question is what's the point --- and perhaps even more
to the point, if we extend 8.2 support, when are we going to stop
extending it?

However your theory was all predicated on the idea that supporting 8.2
was not much incremental effort and Dave said that's not true so this
is all moot. Doing it Windows-excluded seems not worth the effort ---
unless... what version of Postgres was shipped in the last supported
releases of major distributions? I think it was 8.1 in Ubuntu Hardy
and 8.4 in Ubuntu Lucid so that's irrelevant. What about Redhat and
Debian?

So far as Red Hat is concerned, neither 8.2 nor 8.3 are of any interest
whatsoever. I'm still on the hook for 7.4 and 8.1 to some extent, but
only very severe security issues are likely to be considered for those.

regards, tom lane

#141Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Stark (#139)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On 04/21/2011 05:17 PM, Greg Stark wrote:

what version of Postgres was shipped in the last supported
releases of major distributions? I think it was 8.1 in Ubuntu Hardy
and 8.4 in Ubuntu Lucid so that's irrelevant. What about Redhat and
Debian?

IIRC RedHat has a ten year EOL policy, so what they have shipped in old
releases should not really bind us. In any case, our EOL policy only
affects what formal releases we make. We can commit fixes to branches
past their EOL date, and IIRC Tom did this not long ago.

cheers

andrew

#142Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#137)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, 2011-04-21 at 13:39 -0400, Tom Lane wrote:

Josh Berkus <josh@agliodbs.com> writes:

Better that someone should just focus on whipping Robert's (or was it
Greg's?) replace-the-missing-casts package into shape as an extension.

I think Peter originated that, actually. My recollection is that there
didn't seem to be any way to extend it to a complete solution, and
besides which it's really a crutch to avoid fixing bugs in your
application. Still, if someone does want to expend more work on it
I wouldn't object.

http://petereisentraut.blogspot.com/2008/03/readding-implicit-casts-in-postgresql.html

There are some problems if you just add *all* the casts back without
thinking. In particular, the || operator appears to be causing problems.

But other than those few specific cases, that tool kit fixes all known
problems. So anyone who is willing to spend more than zero minutes on
planning and executing a major version upgrade shouldn't really have any
problems with this aspect.

#143Peter Eisentraut
peter_e@gmx.net
In reply to: Greg Stark (#139)
Re: EOL for 8.2 (was Re: Formatting Curmudgeons WAS: MMAP Buffers)

On Thu, 2011-04-21 at 22:17 +0100, Greg Stark wrote:

However your theory was all predicated on the idea that supporting 8.2
was not much incremental effort and Dave said that's not true so this
is all moot. Doing it Windows-excluded seems not worth the effort ---
unless... what version of Postgres was shipped in the last supported
releases of major distributions? I think it was 8.1 in Ubuntu Hardy
and 8.4 in Ubuntu Lucid so that's irrelevant. What about Redhat and
Debian?

Debian: 8.3 in oldstable (<1 year left), 8.4 in stable, probably 9.1 in
next

#144Greg Smith
greg@2ndQuadrant.com
In reply to: Robert Haas (#127)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 04/21/2011 12:39 PM, Robert Haas wrote:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small.

The pending EOL for 8.2 is the only thing that keeps me sane when
speaking with people who refuse to upgrade, yet complain that their 8.2
install is slow. This last month, there seem to be more than the usual "why
does autovacuum suck so much?" complaints that would all go away with an
8.3 upgrade. Extending the EOL is not doing any of these users a
favor. Every day that goes by when someone is on a version of
PostgreSQL that won't ever allow in-place upgrade is just making
the eventual dump and reload they face worse. The time spent porting to
8.3 is a one-time thing; the suffering you get trying to have a 2011
sized database on 2006's 8.2 just keeps adding up the longer you
postpone it.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#145Greg Smith
greg@2ndQuadrant.com
In reply to: Tom Lane (#103)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On the big picture of scheduling issues, I have never seen a major piece
of software ship every 6 months without being incredibly buggy. I'd
lose serious faith in this project if that happens here. I've also
never seen a major operating system ship usefully more than about once
every two years, so I'm not sure who it would be serving anyway. Even
if this project pulled it off, those who would see the benefit because
they're using things like the upgrade-happy Ubuntu/Fedora/Gentoo
treadmills are clearly not optimizing for the sort of things database
users care about anyway. So whacking around the low-level schedule to
aim at that goal boggles my mind.

As for trying to improve things within the existing yearly cycle, there
are several types of patch to be concerned about here. And the useful
interval to respond isn't the same for all of them.

This discussion started with "newbie patch". I'd like to see these get
a review sufficient to say "you're not following the good practices
outlined by our code guidelines and we can't do anything with this"
quickly, with a hand-off to resources to help them with that. Everyone
reading this list surely knows where that documentation is at now after
all this publicity. You might schedule a weekly "answer the newbies"
scan usefully to help with this. But the project doesn't get much out
of that besides being more friendly and encouraging, to help in
growing the community long-term. In the short term, adding more process
here just to help these submitters will, pragmatically, mainly get in
the way of working on more finished patches.

Second is "WIP", where the author knows there are issues but is looking
for feedback. In the cases where these are interesting to people, these
sometimes get immediate feedback too. The ones that don't are because
a) it's hard to review, or b) no one else is interested enough to poke
at it immediately. That means a reviewer will likely need to be either
assigned or found & motivated to look for it. And that's painful enough
that you don't want to do it regularly. The overhead of herding patch
reviewers is seriously underestimated by some of the ideas thrown around
here for reducing the intervals of this process. It's only reasonable
to do in bulk, where you can at least yelp on-list to try and get
volunteers usefully.

[There were complaints upthread about things like how Aster's patch
submissions were treated. Those were WIP patches that half implemented
some useful ideas. But they were presented as completed features, and
they seemed to expect the community would pick those up and commit in
that not quite right state without extended additional work on their
side. Not doing that sort of thing is part of the reason the PostgreSQL
code isn't filled with nothing but the fastest hack to get any given job
done. Anyone who thinks I'm misrepresenting that view of history should
revisit the lengthy feedback provided to them at
https://commitfest.postgresql.org/action/patch_view?id=173 and
https://commitfest.postgresql.org/action/patch_view?id=205 -- it
actually goes back even further than that because the first versions of
these patches were even less suitable for commit.]

Next up is "solid patch that needs technical review". This is mainly
different from the WIP case in that it's unlikely any quick feedback
will help the submitter. So here it's back to needing to find a
reviewer again.

Finally, "big feature patch", likely taking multiple CFs to process.
It's barely possible to get useful feedback on these every two months.
I don't see how dropping the interval is going to solve any of the
problems around these. Two to three people all need to get aligned for
progress on these to happen: the author, a reviewer, and a committer,
with the committer sometimes also doing the initial review. Good luck
making that happen more often than it already does.

I think that anyone who suggests shortening the cycles here, or making
the CommitFests more frequent, should volunteer to run one. That will
beat the idea right out of you. Work on the problem of how to
motivate/create more patch reviewers instead; that's where the actual
bottleneck in this process is at. Part of the problem with how newbies
are handled is that they jump right to writing patches, because that's
cooler to do, rather than starting with doing review. That's
counterproductive--best way to learn how to write a good patch is to
consider the difficulty someone else faces reading one--but you can't
tell that to some people usefully.

That goes double for some of the people complaining in this thread about
dissatisfaction with the current process. If you're not helping review
patches already, you're not participating in the thing that needs the
most help. This is not a problem you make better with fuzzy management
directives to be nicer to people. There are real software engineering
issues about how to ensure good code quality at its core.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#146Greg Smith
greg@2ndquadrant.com
In reply to: Andrew Dunstan (#61)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

Personally, I want pgindent installed in /usr/local/ or similar. That
way I can have multiple trees and it will work in all of them without
my having to build it for each. What I don't want is for the installed
patched BSD indent to conflict with the system's indent, which is why
I renamed it. If you still think that's a barrier to easy use, then I
think we need a way to provide hooks in the makefiles for specifying
the install location, so we can both be satisfied.

I don't think there's a conflict here, because the sort of uses I'm
worried about don't want to install the thing at all; just to run it. I
don't really care what "make install" does because I never intend to run
it; dumping into /usr/local is completely reasonable for the people who do.

Since there's no script I know of other than your prototype, I don't
think repackaging is likely to break anything. That makes it worth
doing *now* rather than later.

But frankly, I'd rather do without an extra script if possible.

Fine, the renaming bit I'm not really opposed to. The odds there's
anyone using this thing that isn't reading this message exchange is
pretty low. There is the documentation backport issue if you make any
serious change to it though. Maybe put the new version in another
location, leave the old one where it is?

There's a fair number of steps to this though. It's possible to
automate them all such that running the program is trivial. I don't
know how we'd ever get that same ease of use without some sort of
scripting for the whole process. Could probably do it in a makefile
instead, but I don't know if that's really any better. The intersection
between people who want to run this and people who don't have bash
available is pretty slim I think. I might re-write in Perl to make it
more portable, but I think that will be at the expense of making it
harder for people to tweak if it doesn't work out of the box. More
code, too.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#147Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#58)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

... Maybe someone out there is under the impression
that I get high off of rejecting patches; but the statistics you cite
from the CF app don't exactly support the contention that I'm going
around looking for reasons to reject things, or if I am, I'm doing a
pretty terrible job finding them.

Hm ... there are people out there who think *I* get high off rejecting
patches. I have a t-shirt to prove it. But I seem to be pretty
ineffective at it too, judging from these numbers.

Late reply, but almost all the things Tom rejects I would have rejected
too.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#148Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#147)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, May 9, 2011 at 10:58 AM, Bruce Momjian <bruce@momjian.us> wrote:

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

... Maybe someone out there is under the impression
that I get high off of rejecting patches; but the statistics you cite
from the CF app don't exactly support the contention that I'm going
around looking for reasons to reject things, or if I am, I'm doing a
pretty terrible job finding them.

Hm ... there are people out there who think *I* get high off rejecting
patches.  I have a t-shirt to prove it.  But I seem to be pretty
ineffective at it too, judging from these numbers.

Late reply, but almost all the things Tom rejects I would have rejected
too.

Well, I think I've been guilty more than once of leaning on Tom to try
to get him to accept patches that he might've been inclined to reject.
I think that my standards for code quality are similar to Tom's
(though sometimes I let through things he would have caught, woops)
but I think I am more inclined to commit feature changes that he might
not find entirely worthwhile. Like Tom, I'm reasonably wary of random
knickknacks that are extremely special-purpose or will slow down
common cases, but on the average I think I'm slightly more
new-feature-positive than he is. Not without some exceptions, of
course.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#149Bruce Momjian
bruce@momjian.us
In reply to: Greg Smith (#144)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Greg Smith wrote:

On 04/21/2011 12:39 PM, Robert Haas wrote:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small.

The pending EOL for 8.2 is the only thing that keeps me sane when
speaking with people who refuse to upgrade, yet complain that their 8.2
install is slow. This last month, there seem to be more than the usual "why
does autovacuum suck so much?" complaints that would all go away with an
8.3 upgrade. Extending the EOL is not doing any of these users a
favor. Every day that goes by when someone is on a version of
PostgreSQL that won't ever allow in-place upgrade is just making
the eventual dump and reload they face worse. The time spent porting to
8.3 is a one-time thing; the suffering you get trying to have a 2011
sized database on 2006's 8.2 just keeps adding up the longer you
postpone it.

Interesting. You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#150Bruce Momjian
bruce@momjian.us
In reply to: Greg Smith (#145)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Greg Smith wrote:

[There were complaints upthread about things like how Aster's patch
submissions were treated. Those were WIP patches that half implemented
some useful ideas. But they were presented as completed features, and
they seemed to expect the community would pick those up and commit in
that not quite right state without extended additional work on their
side. Not doing that sort of thing is part of the reason the PostgreSQL
code isn't filled with nothing but the fastest hack to get any given job
done. Anyone who thinks I'm misrepresenting that view of history should
revisit the lengthy feedback provided to them at
https://commitfest.postgresql.org/action/patch_view?id=173 and
https://commitfest.postgresql.org/action/patch_view?id=205 -- it
actually goes back even further than that because the first versions of
these patches were even less suitable for commit.]

[ Again, sorry for my late reply.]

Greg hits a big item above --- it takes 3-4x more work to get a patch to
merge cleanly into our code ("look like it was always there") than to
write the initial patch. If the author isn't willing to do that 3-4x
work, it is not something the community is going to do on a regular
basis, so it is not surprising the patches are dropped. This is very
often true of academically developed patches too. (I know I rewrite my
patches 4-5 times, and some feel even that is not enough iterations for
me. ;-) )

That goes double for some of the people complaining in this thread about
dissatisfaction with the current process. If you're not helping review
patches already, you're not participating in the thing that needs the
most help. This is not a problem you make better with fuzzy management
directives to be nicer to people. There are real software engineering
issues about how to ensure good code quality at its core.

I agree on this one too. It is good for people outside the patch review
group to make suggestions (external review is good), but when those
external people can't give clear examples of problems, it is impossible
for the patch review group to react or improve, and the complaints do
more harm than good. The complaints did spark discussion to reevaluate
our development process, so something good did come out of it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#151Robert Haas
robertmhaas@gmail.com
In reply to: Bruce Momjian (#149)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian <bruce@momjian.us> wrote:

Greg Smith wrote:

On 04/21/2011 12:39 PM, Robert Haas wrote:

In fact, I've been wondering if we shouldn't consider extending the
support window for 8.2 past the currently-planned December 2011.
There seem to be quite a lot of people running that release precisely
because the casting changes in 8.3 were so painful, and I think the
incremental effort on our part to extend support for another year
would be reasonably small.

The pending EOL for 8.2 is the only thing that keeps me sane when
speaking with people who refuse to upgrade, yet complain that their 8.2
install is slow.  This last month, there seem to be more than the usual "why
does autovacuum suck so much?" complaints that would all go away with an
8.3 upgrade.  Extending the EOL is not doing any of these users a
favor.  Every day that goes by when someone is on a version of
PostgreSQL that won't ever allow in-place upgrade is just making
the eventual dump and reload they face worse.  The time spent porting to
8.3 is a one-time thing; the suffering you get trying to have a 2011
sized database on 2006's 8.2 just keeps adding up the longer you
postpone it.

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

Personally, I think the support window is on the borderline of being
too short already. There are several Linux distributions out there
that offer 5-year support for certain releases. Even assuming they
incorporate the latest version of PostgreSQL at the time they wrap the
final release, it'll already be some months since we released that
version, and that means we'll stop supporting that version of
PostgreSQL before they stop supporting that release. I regularly have
systems that run for 3 or 4 years without needing to be reinstalled,
and they're not necessarily running the bleeding-edge version of
PostgreSQL when first installed. So they, too, are on the trailing
edge of our support. As much as I believe that 9.0 (and, now, 9.1)
are the future and people should move to them, we can't enforce that.
EOL doesn't necessarily drive people to move. If they're just running
"yum update" they're going to get 8.whatever.latest, and if that's out of
support and missing relevant bug fixes, then it is. I haven't run
into much 8.1 recently, but it seems there is still a decent chunk of
8.2 out there.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#152Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#151)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas wrote:

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

Personally, I think the support window is on the borderline of being
too short already. There are several Linux distributions out there
that offer 5-year support for certain releases. Even assuming they
incorporate the latest version of PostgreSQL at the time they wrap the
final release, it'll already be some months since we released that
version, and that means we'll stop supporting that version of
PostgreSQL before they stop supporting that release. I regularly have
systems that run for 3 or 4 years without needing to be reinstalled,
and they're not necessarily running the bleeding-edge version of
PostgreSQL when first installed. So they, too, are on the trailing
edge of our support. As much as I believe that 9.0 (and, now, 9.1)
are the future and people should move to them, we can't enforce that.
EOL doesn't necessarily drive people to move. If they're just running
"yum update" they're going to get 8.whatever.latest, and if that's out of
support and missing relevant bug fixes, then it is. I haven't run
into much 8.1 recently, but it seems there is still a decent chunk of
8.2 out there.

I agree we don't want to shorten the window --- I was just pointing out
that we have more upgrade options than in the past. One big push for
shortening was the Win32 issues on 8.0 and perhaps 8.1 that were
unfixable, which helped push retiring, at least on that platform, and
once you retire on one platform, there is momentum to retire all
platforms for that release.

With Win32 stable on 8.2, we could say we don't need to shorten the
window as much, but pg_upgrade would allow us to keep it the same as now
because upgrades are potentially easier.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#153Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#151)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian <bruce@momjian.us> wrote:

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

Personally, I think the support window is on the borderline of being
too short already. There are several Linux distributions out there
that offer 5-year support for certain releases.

Keep in mind that at least some contributors are paid to do exactly that
long-term support (and if you've not heard, Red Hat is up to seven years
support on RHEL ...). So the work is going to get done, and if it
doesn't get committed to the community SCM, I'm not sure that really
helps anybody.

Although whether we do formal releases is a different question. Maybe
it would be sensible to continue patching an old branch but not bother
wrapping up release tarballs? But the incremental work to do one more
set of release notes and one more tarball build is not that large.

regards, tom lane

#154Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#151)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 05/09/2011 11:43 AM, Robert Haas wrote:

Interesting. You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

Personally, I think the support window is on the borderline of being
too short already. There are several Linux distributions out there
that offer 5-year support for certain releases.

Some (RH?) offer significantly longer periods.

I agree that we should not reduce the support window. The fact that we
can do in place upgrades of the data only addresses one pain point in
upgrading. Large legacy apps require large retesting efforts when
upgrading, often followed by lots more work renovating the code for
backwards incompatibilities. This can be a huge cost for what the suits
see as little apparent gain, and making them do it more frequently in
order to stay current will not win us any friends. I often want to wait
a while after a release for certain customers, while it beds down, and
to get them to start moving towards upgrading well before it's the last
minute. That makes an effective life of four years or less per release
as things are now. That's plenty short enough.

cheers

andrew

#155Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#153)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, May 9, 2011 at 11:25 AM, Bruce Momjian <bruce@momjian.us> wrote:

Interesting.  You could argue that once 8.3 is our earliest supported
release that we could even shrink the support window because the
argument "I can't dump/reload my data" would be gone.

Personally, I think the support window is on the borderline of being
too short already. There are several Linux distributions out there
that offer 5-year support for certain releases.

Keep in mind that at least some contributors are paid to do exactly that
long-term support (and if you've not heard, Red Hat is up to seven years
support on RHEL ...). So the work is going to get done, and if it
doesn't get committed to the community SCM, I'm not sure that really
helps anybody.

Although whether we do formal releases is a different question. Maybe
it would be sensible to continue patching an old branch but not bother
wrapping up release tarballs? But the incremental work to do one more
set of release notes and one more tarball build is not that large.

I think the big reason we trimmed the support window was to push people
off of old releases, not to lighten our workload. Until we stated that
a release was not supported, we didn't give administrators ammunition to
force upgrades within their organizations.

Yeah, that is a lousy reason, but it was the stated case when we shrunk
to five years. You can argue that our more recent releases are not as
"stop using them" bad as previous ones, but Greg Smith's statement about
autovacuum badness reinforces that goal.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#156Joshua Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#154)
Re: Formatting Curmudgeons WAS: MMAP Buffers

All,

I agree that we should not reduce the support window. The fact that we
can do in place upgrades of the data only addresses one pain point in
upgrading. Large legacy apps require large retesting efforts when
upgrading, often followed by lots more work renovating the code for
backwards incompatibilities.

Definitely. Heck, I can't get half our clients to apply *update* releases because they have a required QA process which takes a month. And a lot of companies are just now deploying the 8.4 versions of their products.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
San Francisco

#157Greg Smith
greg@2ndQuadrant.com
In reply to: Andrew Dunstan (#154)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 05/09/2011 12:06 PM, Andrew Dunstan wrote:

The fact that we can do in place upgrades of the data only addresses
one pain point in upgrading. Large legacy apps require large retesting
efforts when upgrading, often followed by lots more work renovating
the code for backwards incompatibilities. This can be a huge cost for
what the suits see as little apparent gain, and making them do it more
frequently in order to stay current will not win us any friends.

I just had a "why a new install on 8.3?" conversation today, and it was
all about the application developer not wanting to do QA all over again
for a later release.

Right now, one of the major drivers for "why upgrade?" has been the
performance improvements in 8.3, relative to any older version. The
main things pushing happy 8.3 sites to 8.4 or 9.0 that I see are either
VACUUM issues (improved with partial vacuum in 8.4) or wanting real-time
replication (9.0). I predict many sites that don't want either are
likely to sit on 8.3 for a really long time. The community won't be
able to offer a compelling reason why smaller sites in particular should
go through the QA an upgrade requires. The fact that the app QA time is
now the main driver--not the dump and reload time--is good, because it
does make it easier for the people with the biggest data sets
to move. They're the ones that need the newer versions the most anyway,
and in that regard having in-place upgrade start showing up as of 8.3
was really just in time.

I think 8.3 is going to be one of those releases like 7.4, where people
just keep running it forever. At least shortening the upgrade path has
made that concern a little bit better.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#158Josh Berkus
josh@agliodbs.com
In reply to: Bruce Momjian (#150)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Greg,

There were complaints upthread about things like how Aster's patch
submissions were treated. Those were WIP patches that half implemented
some useful ideas.

There are two reasons why I think we failed with the Aster patches:

1) I passed Aster along to Bruce, who said he would review the patches
and give them a private response on them before they put them on
-hackers (which response would be "these aren't nearly ready"). Bruce
punted on this instead, passing their submissions straight through to
-hackers without review.

2) Our process for reviewing and approving patches, and what criteria
such patches are required to meet, is *very* opaque to a first-time
submitter (as in no documentation the submitter knows about), and does
not become clearer as they go through the process. Aster, for example,
was completely unable to tell the difference between hackers who were
giving them legit feedback, and random list members who were
bikeshedding. As a result, they were never able to derive a concrete
list of "these are the things we need to fix to make the patch
acceptable," and gave up.

While the first was specific to the Aster submissions, I've seen the
second problem with lots of first-time submissions to this list. Our
feedback to submitters of big patches requires a lot of comprehension of
project personalities and politics to make any sense of. As I don't
think we can change this, I think the best answer is to tell people
"Don't submit a big patch to PostgreSQL until you've done a few small
patches first. You'll regret it".

That goes double for some of the people complaining in this thread about
dissatisfaction with the current process.

The problem is not the process itself, but that there is little
documentation of that process, and that much of that documentation does
not match the de facto process. Obviously, the onus is on me as much as
anyone else to fix this.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#159Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#158)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, May 9, 2011 at 1:53 PM, Josh Berkus <josh@agliodbs.com> wrote:

While the first was specific to the Aster submissions, I've seen the
second problem with lots of first-time submissions to this list.  Our
feedback to submitters of big patches requires a lot of comprehension of
project personalities and politics to make any sense of.

Ah ha! Now we're getting somewhere. As was doubtless obvious from my
previous responses, I don't agree that the process is as broken as I
felt you were suggesting, and I think we've made a lot of
improvements. However, I am in complete agreement with you on this
point. Unfortunately, people often come into our community with
incorrect assumptions about how it works, including:

- someone's in charge
- there's one right answer
- it's our job to fix your problem

Now if you read a few hundred emails (which is not that much calendar
time, if you read them all) it's not too hard to figure out what the
real dynamic is, and I think that real dynamic is increasingly
positive (with some unfortunate exceptions). But if the first thing
you do is post (no doubt about some large or controversial change),
yeah, serious culture shock.

That goes double for some of the people complaining in this thread about
dissatisfaction with the current process.

The problem is not the process itself, but that there is little
documentation of that process, and that much of that documentation does
not match the defacto process.  Obviously, the onus is on me as much as
anyone else to fix this.

I can't disagree with this, either. I'm not sure where it would be
possible for us to document this that people would actually see and
read, and I think it's tough to understand just from reading a wiki
page or a blog post: if you've never been part of a community that
operates this way, then it's kind of strange and it takes a while to
adjust. Of course from the inside it seems to make a fair amount of
sense, but what good is that? Anyhow, whatever we can do to help
people get into the swing of things I'm highly in favor of.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#160Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#159)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert,

I can't disagree with this, either. I'm not sure where it would be
possible for us to document this that people would actually see and
read, and I think it's tough to understand just from reading a wiki
page or a blog post:

Still, if we had a wiki page which was a really comprehensive guide to
submitting patches, then we could send people a link after they submit
their first patch. As well as having it in the header for the
commitfest page.

While it wouldn't do everything, it would help.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#161Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#160)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, May 9, 2011 at 2:41 PM, Josh Berkus <josh@agliodbs.com> wrote:

Robert,

I can't disagree with this, either.  I'm not sure where it would be
possible for us to document this that people would actually see and
read, and I think it's tough to understand just from reading a wiki
page or a blog post:

Still, if we had a wiki page which was a really comprehensive guide to
submitting patches, then we could send people a link after they submit
their first patch.   As well as having it in the header for the
commitfest page.

While it wouldn't do everything, it would help.

I'm all in favor.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#162Greg Stark
gsstark@mit.edu
In reply to: Robert Haas (#159)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Mon, May 9, 2011 at 7:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Ah ha!  Now we're getting somewhere.  As was doubtless obvious from my
previous responses, I don't agree that the process is as broken as I
felt you were suggesting, and I think we've made a lot of
improvements.  However, I am in complete agreement with you on this
point.  Unfortunately, people often come into our community with
incorrect assumptions about how it works, including:

- someone's in charge
- there's one right answer
- it's our job to fix your problem

Now if you read a few hundred emails (which is not that much calendar
time, if you read them all) it's not too hard to figure out what the
real dynamic is, and I think that real dynamic is increasingly
positive (with some unfortunate exceptions).  But if the first thing
you do is post (no doubt about some large or controversial change),
yeah, serious culture shock.

Honestly it's not even that clear. It took me years to realize that
when Tom says "There's problems x, y, z" he doesn't mean "give up now
there are all these fatal flaws" but rather "think about these things
and maybe they're problems and maybe they're not, but we need to
figure that out".

To be fair that's true for everyone on the list depending on the
audience. We have a tendency to state general concerns as pretty
black-and-white statements that would read to a newbie as fatal flaws
that aren't worth investigating.

--
greg

#163Alvaro Herrera
alvherre@commandprompt.com
In reply to: Greg Stark (#162)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Excerpts from Greg Stark's message of lun may 09 19:44:15 -0400 2011:

Honestly it's not even that clear. It took me years to realize that
when Tom says "There's problems x, y, z" he doesn't mean "give up now
there are all these fatal flaws" but rather "think about these things
and maybe they're problems and maybe they're not, but we need to
figure that out".

These things may seem trivial but I think they are worth documenting.
It feels weird to document something that's inherently "social" rather
than technical (to me at least), but if that's what we need to help
others to collaborate, then we should.

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#164Greg Smith
greg@2ndquadrant.com
In reply to: Josh Berkus (#158)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Josh Berkus wrote:

As I don't think we can change this, I think the best answer is to tell people
"Don't submit a big patch to PostgreSQL until you've done a few small
patches first. You'll regret it".

When I last did a talk about getting started writing patches, I had a
few people ask me afterwards if I'd ever run into problems with having
patch submissions rejected. I said I hadn't. When asked what my secret
was, I told them my first serious submission modified exactly one line
of code[1]. And *that* I had to defend in regards to its performance
impact.[2]

Anyway, I think the intro message should be "Don't submit a big patch to
PostgreSQL until you've done a small patch and some patch review"
instead though. It's both a good way to learn what not to do, and it
helps with one of the patch acceptance bottlenecks.

The problem is not the process itself, but that there is little
documentation of that process, and that much of that documentation does
not match the defacto process. Obviously, the onus is on me as much as
anyone else to fix this.

I know the documentation around all this has improved a lot since then.
Unfortunately there's plenty of submissions done by people who never
read it. Sometimes it's because people didn't know about it; in others
I suspect it was seen but some hard parts ignored because it seemed like
too much work.

One of these days I'm going to write the "Formatting Curmudgeon Guide to
Patch Submission", to give people an idea just how much diff reading and
revision a patch should go through in order to keep common issues like
spurious whitespace diffs out of it. Submitters can either spend a
block of time sweating those details out themselves, or force the job
onto a reviewer/committer; they're always there. And every minute it's
sitting in someone else's hands who is doing that job instead of reading
the real code, the odds of the patch being kicked back go up.

[1]: http://archives.postgresql.org/pgsql-patches/2007-03/msg00553.php
[2]: http://archives.postgresql.org/pgsql-patches/2007-02/msg00222.php

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#165Andrew Dunstan
andrew@dunslane.net
In reply to: Greg Smith (#164)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 05/09/2011 09:43 PM, Greg Smith wrote:

When I last did a talk about getting started writing patches, I had a
few people ask me afterwards if I'd ever run into problems with having
patch submissions rejected. I said I hadn't.

Part of the trouble is in the question. Having a patch rejected is not
really a problem; it's something you should learn from. I know it can be
annoying. I get annoyed when it happens to me too. But I try to get over
it as quickly as possible, and either fix the patch, or find another
(and better) way to do the same thing, or move on. Everybody here is
acting in good faith, and nobody's on a power trip. That's one of the
good things about working on Postgres. If it were otherwise I would have
moved on to something else long ago.

cheers

andrew

#166Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Greg Smith (#164)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On 10.05.2011 04:43, Greg Smith wrote:

Josh Berkus wrote:

As I don't think we can change this, I think the best answer is to
tell people
"Don't submit a big patch to PostgreSQL until you've done a few small
patches first. You'll regret it".

When I last did a talk about getting started writing patches, I had a
few people ask me afterwards if I'd ever run into problems with having
patch submissions rejected. I said I hadn't. When asked what my secret
was, I told them my first serious submission modified exactly one line
of code[1]. And *that* I had to defend in regards to its performance
impact.[2]

Anyway, I think the intro message should be "Don't submit a big patch to
PostgreSQL until you've done a small patch and some patch review"
instead though.

Well, my first patch was two-phase commit. And I had never even used
PostgreSQL before I dived into the source tree and started to work on
that. I did, however, lurk on the pgsql-hackers mailing list for a few
months before posting, so I knew the social dynamics. I basically did
exactly what Robert described elsewhere in this thread, and successfully
avoided the culture shock.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#167Pavan Deolasee
pavan.deolasee@gmail.com
In reply to: Heikki Linnakangas (#166)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Tue, May 10, 2011 at 1:46 PM, Heikki Linnakangas <
heikki.linnakangas@enterprisedb.com> wrote:

On 10.05.2011 04:43, Greg Smith wrote:

Josh Berkus wrote:

As I don't think we can change this, I think the best answer is to
tell people
"Don't submit a big patch to PostgreSQL until you've done a few small
patches first. You'll regret it".

When I last did a talk about getting started writing patches, I had a
few people ask me afterwards if I'd ever run into problems with having
patch submissions rejected. I said I hadn't. When asked what my secret
was, I told them my first serious submission modified exactly one line
of code[1]. And *that* I had to defend in regards to its performance
impact.[2]

Anyway, I think the intro message should be "Don't submit a big patch to
PostgreSQL until you've done a small patch and some patch review"
instead though.

Well, my first patch was two-phase commit. And I had never even used
PostgreSQL before I dived into the source tree and started to work on that.
I did, however, lurk on the pgsql-hackers mailing list for a few months
before posting, so I knew the social dynamics. I basically did exactly what
Robert described elsewhere in this thread, and successfully avoided the
culture shock.

Yeah, probably same for me, though I got a lot of support from existing
hackers during my first submission. But it was a tiring experience for sure.
I would submit a patch and then wait anxiously for any comments. I used to
get a lot of interesting and valuable comments, but would know that unless
one of the very few (Tom ?) members says something, good or bad, it won't go
anywhere and those comments did not come in the early days/months. I was an
unknown name and what I was trying to do was very invasive. So when I look
back now, I can understand the reluctance of other members to get excited
about the work. Most often they would see something in the design or the
patch which is completely stupid and they would lose all interest at the
very moment.

Since I had backing of EnterpriseDB and it was my paid job, it was much
easier to keep the enthusiasm, but I wouldn't be surprised if few others
would have turned their back to the project forever.

Fortunately, things have changed for better now. I think the entire commit
fest business is good. Also, we now have a lot more hackers with expertise
in different areas and with influential opinions. It's very likely that if
you submit an idea or a patch, you would get some
comment/suggestion/criticism very early.

Since HOT is mentioned often in these discussions, I thought I should share
my experience.

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

#168Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Heikki Linnakangas (#166)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

Anyway, I think the intro message should be "Don't submit a big patch to
PostgreSQL until you've done a small patch and some patch review"
instead though.

Well, my first patch was two-phase commit. And I had never even used
PostgreSQL before I dived into the source tree and started to work on
that. I did, however, lurk on the pgsql-hackers mailing list for a few
months before posting, so I knew the social dynamics. I basically did
exactly what Robert described elsewhere in this thread, and successfully
avoided the culture shock.

I tend to share the experience, my first patch (not counting
documentation patch) has been extensions. The fact that everybody here
knew me before (from side projects, events, reviews in commit fests,
design reviews on list…) certainly helped, but the real deal has been
that the design was agreed on by everybody before I started — that took
*lots of* time, but really paid off (good ideas all around, real buy in,
some good beers shared, etc).

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#169Greg Smith
greg@2ndquadrant.com
In reply to: Heikki Linnakangas (#166)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Heikki Linnakangas wrote:

Well, my first patch was two-phase commit. And I had never even used
PostgreSQL before I dived into the source tree and started to work on that

Well, everyone knows you're awesome. A small percentage of the people
who write patches end up having the combination of background skills,
mindset, and approach to pull something like that off. But there are at
least a dozen submissions that start review with "I don't think this
will ever work, but I can't even read your malformed patch to be sure"
for every one that went like 2PC. If every submitter was a budding
Heikki we wouldn't need patch submission guidelines at all.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

#170Josh Berkus
josh@agliodbs.com
In reply to: Andrew Dunstan (#165)
Re: Formatting Curmudgeons WAS: MMAP Buffers

All,

Part of the trouble is in the question. Having a patch rejected is not
really a problem; it's something you should learn from. I know it can be
annoying. I get annoyed when it happens to me too. But I try to get over
it as quickly as possible, and either fix the patch, or find another
(and better) way to do the same thing, or move on. Everybody here is
acting in good faith, and nobody's on a power trip. That's one of the
good things about working on Postgres. If it were otherwise I would have
moved on to something else long ago.

The problem is not that patches get rejected. It's *how* they get
rejected, and how the submitter experiences the process of them getting
rejected. Did they learn something from it and understand the reasons
for the rejection? Or did they experience the process as arbitrary,
frustrating, and incomprehensible?

Ideally, we want a submitter whose patch has been rejected to walk away
with either "my proposal was rejected, and I understand why it's a bad
idea even if I don't agree", or "my proposal was rejected, and I know
what needs to be done to fix it."

Of course, there are always idiots who won't learn anything no matter
how good our process is. But if the whole submission process is
perceived to be fair and understandable, those will be a tiny minority.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#171Greg Stark
gsstark@mit.edu
In reply to: Josh Berkus (#170)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Tue, May 10, 2011 at 6:54 PM, Josh Berkus <josh@agliodbs.com> wrote:

Of course, there are always idiots who won't learn anything no matter
how good our process is.  But if the whole submission process is
perceived to be fair and understandable, those will be a tiny minority.

The thing is, I think things are much better now than they were three
or four years ago. At the time the project had grown much faster than
the existing stable of developers and the rate at which patches were
being submitted was much greater than they could review.

It's not perfect, Tom still spends more of his time reviewing patches
when he would probably enjoy writing fresh code -- and it's a shame if
you think about the possibilities we might be missing out on if he
were let loose. And patches still don't get a detailed HOWTO on what
needs to happen before commit. But it's better.

We need to be careful about looking at the current situation and
deciding it's not perfect so a wholesale change is needed when the
only reason it's not worse is because the current system was adopted.

--
greg

#172Josh Berkus
josh@agliodbs.com
In reply to: Greg Stark (#171)
Re: Formatting Curmudgeons WAS: MMAP Buffers

The thing is, I think things are much better now than they were three
or four years ago.

Oh, no question.

If you read above in this thread, I'm not really proposing a change in
the current process, just documenting the current process. Right now
there's a gap between how submitters expect things to work, and how they
actually do work.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

#173Robert Haas
robertmhaas@gmail.com
In reply to: Greg Stark (#171)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Tue, May 10, 2011 at 3:09 PM, Greg Stark <gsstark@mit.edu> wrote:

The thing is, I think things are much better now than they were three
or four years ago. At the time the project had grown much faster than
the existing stable of developers and the rate at which patches were
being submitted was much greater than they could review.

Just in the last 2.5 years since I've been around, there have, AFAICT,
been major improvements both in the timeliness and quality of the
feedback we provide, and the quality of the patches we receive. When
I first started reviewing, it was very common to blow through the
CommitFest application and bounce half the patches back for failure to
apply, failure to pass the regression tests, or other blindingly
obvious breakage. That's gone down almost to nothing. It's also
become much more common for patches to include adequate documentation
and regression tests - or at least *an effort* at documentation and
regression tests - than was formerly the case. We still bounce things
for those reasons from time to time - generally from recurring
contributors who think for some reason that it's someone else's job to
worry about cleaning up their patch - but it's less common than it
used to be.

We still have some rough edges around the incorporation of large
patches. But it could be so much worse. We committed something like
six major features in a month: collations, sync rep, SSI, SQL/MED,
extensions, writeable CTEs, and a major overhaul of PL/python. While
that's likely to delay the release a bit (and already has), and has
already produced quite a few bug reports and will produce many more
before we're done, it's still an impressive accomplishment. I'm not
sure we could have done that even a year ago.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#174Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#159)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Robert Haas <robertmhaas@gmail.com> wrote:

Unfortunately, people often come into our community with incorrect
assumptions about how it works, including:

- someone's in charge
- there's one right answer
- it's our job to fix your problem

Would it make sense to dispel such notions explicitly in the
Developer FAQ?

-Kevin

#175Robert Haas
robertmhaas@gmail.com
In reply to: Kevin Grittner (#174)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On Thu, May 12, 2011 at 11:55 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

Robert Haas <robertmhaas@gmail.com> wrote:

Unfortunately, people often come into our community with incorrect
assumptions about how it works, including:

- someone's in charge
- there's one right answer
- it's our job to fix your problem

Would it make sense to dispel such notions explicitly in the
Developer FAQ?

Can't hurt, though these principles also apply (perhaps even more
strongly) to bug reports and people wanting support.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#176Kevin Barnard
kevinbarnard@mac.com
In reply to: Josh Berkus (#158)
Re: Formatting Curmudgeons WAS: MMAP Buffers

On May 9, 2011, at 12:53 PM, Josh Berkus wrote:

2) Our process for reviewing and approving patches, and what criteria
such patches are required to meet, is *very* opaque to a first-time
submitter (as in no documentation the submitter knows about), and does
not become clearer as they go through the process. Aster, for example,
was completely unable to tell the difference between hackers who were
giving them legit feedback, and random list members who were
bikeshedding. As a result, they were never able to derive a concrete
list of "these are the things we need to fix to make the patch
acceptable," and gave up.

I know I'm not in the position to talk work flow as it affects others and not myself, but I thought I would at least throw out an idea. Feel free to completely shoot it down and I won't take any offense.

A ticketing system with work flow could help with transparency. If it's set up correctly the work flow could help document where the item is in the review process. As another idea maybe have a status to indicate that the patch has been reviewed for formatting. It might make things easier to deal with because a ticket identified as WIP is obviously not ready for a CF etc etc. Hell you may even be able to find somebody to take care of reviewing formatting and dealing with those issues before it gets sent to a committer.

I know the core group is used to the mailing lists so a system that can be integrated into the mailing list would be a must I think. That shouldn't be too hard to set up. I don't think this is a cure all but transparency to status in the process is surely going to give first timers more of a warm and fuzzy.

--
Kevin Barnard
kevinbarnard@mac.com

#177Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Barnard (#176)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Kevin Barnard <kevinbarnard@mac.com> wrote:

A ticketing system with work flow could help with transparency.
If it's set up correctly the work flow could help document where
the item is in the review process. As another idea maybe have a
status to indicate that the patch has been reviewed for
formatting. It might make things easier to deal with because a
ticket identified as WIP is obviously not ready for a CF etc etc.
Hell you may even be able to find somebody to take care of
reviewing formatting and dealing with those issues before it gets
sent to a committer.

What you describe, and more, is the intent of the CommitFest process
and its related web application. If you review these links and have
suggestions on how to improve the process, or how to make it more
obvious to newcomers, we'd be happy to hear about them.

http://wiki.postgresql.org/wiki/CommitFest

http://wiki.postgresql.org/wiki/Submitting_a_Patch

http://wiki.postgresql.org/wiki/Reviewing_a_Patch

http://wiki.postgresql.org/wiki/RRReviewers

https://commitfest.postgresql.org/action/commitfest_view/open

This process has, in my opinion, been a very big improvement on the
vagueness that came before.

-Kevin

#178Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#37)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

Attached are two patches, one to remove some infelicity in the entab
makefile, and the other to allow skipping specifying the typedefs file

I have applied the 'entab' Makefile fix.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#179Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#35)
Re: Formatting Curmudgeons WAS: MMAP Buffers

Andrew Dunstan wrote:

Now we could certainly make this quite a bit slicker. Apart from
anything else, we should change the indent source code tarball so it
unpacks into its own directory. Having it unpack into the current

Yes, done.

directory is ugly and unfriendly. And we should get rid of the "make
clean" line in the install target of entab's makefile, which just seems
totally ill-conceived.

Yes, fixed.

It might also be worth setting it up so that instead of having to pass a
path to a typedefs file on the command line, we default to a file
sitting in, say, /usr/local/etc. Then you'd just be able to say
"pgindent my_file.c".

Yes, also done.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +