run pgindent on a regular basis / scripted manner

Started by Andres Freundover 5 years ago308 messages
#1Andres Freund
andres@anarazel.de

Hi,

When developing patches I find it fairly painful that I cannot re-indent
patches with pgindent without also seeing a lot of indentation changes
in unmodified parts of files. It is easy enough ([1]./src/tools/pgindent/pgindent $(git diff-tree --no-commit-id --name-only -r upstream/master..HEAD|grep -v src/test|grep -v READ ME|grep -v typedefs.list)) to only re-indent
files that I have modified, but there's often a lot of independent
indentation changes in the files that I did modified.

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

Currently the yearly pgindent runs are somewhat painful for patches that
didn't make it into the release, but if we were to reindent on a more
regular basis, that should be much less the case. It'd also help newer
developers who we sometimes tell to use pgindent - which doesn't really
work.

Greetings,

Andres Freund

[1]: ./src/tools/pgindent/pgindent $(git diff-tree --no-commit-id --name-only -r upstream/master..HEAD|grep -v src/test|grep -v READ ME|grep -v typedefs.list)

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: run pgindent on a regular basis / scripted manner

On 2020-Aug-12, Andres Freund wrote:

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

Seconded.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Jesse Zhang
sbjesse@gmail.com
In reply to: Andres Freund (#1)
Re: run pgindent on a regular basis / scripted manner

Hi Andres,

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:

Hi,

When developing patches I find it fairly painful that I cannot re-indent
patches with pgindent without also seeing a lot of indentation changes
in unmodified parts of files. It is easy enough ([1]) to only re-indent
files that I have modified, but there's often a lot of independent
indentation changes in the files that I did modified.

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.

How bad was it right after branching 13? I wonder if we have any
empirical measure of badness over time -- assuming there was a point in
the recent past where everything was good, and the bad just crept in.

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

You know what's better than weekly? Every check-in. I for one would love
it if we can just format the entire codebase, and ensure that new
check-ins are also formatted. We _do_ need some form of continuous
integration to catch us when we have fallen short (again, once HEAD
reaches a "known good" state, it's conceivably cheap to keep it in the
good state.

Cheers,
Jesse

#4Andres Freund
andres@anarazel.de
In reply to: Jesse Zhang (#3)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2020-08-12 16:08:50 -0700, Jesse Zhang wrote:

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:

Hi,

When developing patches I find it fairly painful that I cannot re-indent
patches with pgindent without also seeing a lot of indentation changes
in unmodified parts of files. It is easy enough ([1]) to only re-indent
files that I have modified, but there's often a lot of independent
indentation changes in the files that I did modified.

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.

How bad was it right after branching 13? I wonder if we have any
empirical measure of badness over time -- assuming there was a point in
the recent past where everything was good, and the bad just crept in.

Well, just after branching it was perfect, because pgindent was
customarily is run just before branching. After that it incrementally
gets worse.

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

You know what's better than weekly? Every check-in. I for one would love
it if we can just format the entire codebase, and ensure that new
check-ins are also formatted. We _do_ need some form of continuous
integration to catch us when we have fallen short (again, once HEAD
reaches a "known good" state, it's conceivably cheap to keep it in the
good state.

Unfortunately that is, with the current tooling, not entirely trivial to
do so completely. The way we generate the list of known typedefs
unfortunately depends on the platform a build is run on. Therefore the
buildfarm collects a number of the generated list of typedefs from
different platforms, and then we use that combined list to run pgindent.

We surely can improve further, but I think having any automation around
this already would be a huge step.

Greetings,

Andres Freund

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jesse Zhang (#3)
Re: run pgindent on a regular basis / scripted manner

Jesse Zhang <sbjesse@gmail.com> writes:

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

You know what's better than weekly? Every check-in.

I'm not in favor of unsupervised pgindent runs, really. It can do a lot
of damage to code that was written without thinking about it --- in
particular, it'll make a hash of comment blocks that were manually
formatted and not protected with dashes.

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

I do like the idea of more frequent, smaller pgindent runs instead of
doing just one giant run per year. With the giant runs it's necessary
to invest a fair amount of time eyeballing all the changes; if we did it
every couple weeks then the pain would be a lot less.

Another idea would be to have a bot that runs pgindent *without*
committing the results, and emails the people who have made commits
into files that changed, saying "if you don't like these diffs then
you'd better clean up your commit before it happens for real". With
some warning like that, it might be okay to do automatic reindenting
a little bit later. Plus, nagging committers who habitually commit
improperly-indented code might persuade them to clean up their acts ;-)

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#4)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

Unfortunately that is, with the current tooling, not entirely trivial to
do so completely. The way we generate the list of known typedefs
unfortunately depends on the platform a build is run on. Therefore the
buildfarm collects a number of the generated list of typedefs from
different platforms, and then we use that combined list to run pgindent.

Yeah, it's hard to see how to avoid that given that the set of typedefs
provided by system headers is not fixed. (Windows vs not Windows is the
worst case of course, but Unixen aren't all alike either.)

Another gotcha that we have to keep our eyes on is that sometimes the
process finds spurious names that we don't want to treat as typedefs
because it results in messing up too much code. There's a reject list
in pgindent that takes care of those, but it has to be maintained
manually. So I'm not sure how that could work in conjunction with
unsupervised reindents --- by the time you notice a problem, the git
history will already be cluttered with bogus reindentations.

Maybe the secret is to not allow automated adoption of new typedefs.list
entries, but to require somebody to add entries to that file by hand,
even if they're basing it on the buildfarm results. (This would
encourage the habit some people have adopted of updating typedefs.list
along with commits that add typedefs. I've never done that, but would
be willing to change if there's good motivation to.)

regards, tom lane

#7Michael Paquier
michael@paquier.xyz
In reply to: Alvaro Herrera (#2)
Re: run pgindent on a regular basis / scripted manner

On Wed, Aug 12, 2020 at 06:53:25PM -0400, Alvaro Herrera wrote:

On 2020-Aug-12, Andres Freund wrote:

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

Seconded.

Thirded.
--
Michael

#8Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#5)
Re: run pgindent on a regular basis / scripted manner

On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:

Jesse Zhang <sbjesse@gmail.com> writes:

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:

Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

You know what's better than weekly? Every check-in.

I'm not in favor of unsupervised pgindent runs, really. It can do a lot
of damage to code that was written without thinking about it --- in
particular, it'll make a hash of comment blocks that were manually
formatted and not protected with dashes.

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at every
check-in:

1. After each push, an automated followup commit appears, restoring
pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what the
committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
pgindent-cleanliness.

Were you thinking of (2)? (1) doesn't have the lack-of-history problem, but
it does have the unexpected-damage problem, and it makes gitweb noisier. (3)
has neither problem, and I'd prefer it over (1), (2), or $SUBJECT.

Regarding typedefs.list, I would use the in-tree one, like you discussed here:

Show quoted text

On Wed, Aug 12, 2020 at 07:57:29PM -0400, Tom Lane wrote:

Maybe the secret is to not allow automated adoption of new typedefs.list
entries, but to require somebody to add entries to that file by hand,
even if they're basing it on the buildfarm results. (This would
encourage the habit some people have adopted of updating typedefs.list
along with commits that add typedefs. I've never done that, but would
be willing to change if there's good motivation to.)

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#8)
Re: run pgindent on a regular basis / scripted manner

Noah Misch <noah@leadboat.com> writes:

On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at every
check-in:

1. After each push, an automated followup commit appears, restoring
pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what the
committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
pgindent-cleanliness.

Were you thinking of (2)?

I was objecting to (2). (1) would perhaps work. (3) could be pretty
darn annoying, especially if it blocks a time-critical security patch.

Regarding typedefs.list, I would use the in-tree one, like you discussed here:

Yeah, after thinking about that more, it seems like automated
typedefs.list updates would be far riskier than automated reindent
based on the existing typedefs.list. The latter could at least be
expected not to change code unrelated to the immediate commit.
typedefs.list updates need some amount of adult supervision.

(I'd still vote for nag-mail to the committer whose work got reindented,
in case the bot made things a lot worse.)

I hadn't thought about the angle of HEAD versus back-branch patches,
but that does seem like something to ponder. The back branches don't
have the same pgindent rules necessarily, plus the patch versions
might be different in more than just whitespace. My own habit when
back-patching has been to indent the HEAD patch per-current-rules and
then preserve that layout as much as possible in the back branches,
but I doubt we could get a tool to do that with any reliability.

Of course, there's also the possibility of forcibly reindenting
all the active back branches to current rules. But I think we've
rejected that idea already, because it would cause so much pain
for forks that are following a back branch.

regards, tom lane

#10Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#9)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 12:08:36AM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at every
check-in:

1. After each push, an automated followup commit appears, restoring
pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what the
committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
pgindent-cleanliness.

Were you thinking of (2)?

I was objecting to (2). (1) would perhaps work. (3) could be pretty
darn annoying,

Right. I think of three use cases here:

a) I'm a committer who wants to push clean code. I want (3).
b) I'm a committer who wants to ignore pgindent. I get some email complaints
under (1), which I ignore. Under (3), I'm forced to become (a).
c) I'm reading the history. I want (3).

I hadn't thought about the angle of HEAD versus back-branch patches,
but that does seem like something to ponder. The back branches don't
have the same pgindent rules necessarily, plus the patch versions
might be different in more than just whitespace. My own habit when
back-patching has been to indent the HEAD patch per-current-rules and
then preserve that layout as much as possible in the back branches,
but I doubt we could get a tool to do that with any reliability.

Similar habit here. Another advantage of master-only is a guarantee against
disrupting time-critical patches. (It would be ugly to push back branches and
sort out the master push later, but it doesn't obstruct the mission.)

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#10)
Re: run pgindent on a regular basis / scripted manner

Noah Misch <noah@leadboat.com> writes:

... Another advantage of master-only is a guarantee against
disrupting time-critical patches. (It would be ugly to push back branches and
sort out the master push later, but it doesn't obstruct the mission.)

Hm, doesn't it? I had the idea that "git push" is atomic --- either all
the per-branch commits succeed, or they all fail. I might be wrong.

regards, tom lane

#12Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#11)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 01:14:33AM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

... Another advantage of master-only is a guarantee against
disrupting time-critical patches. (It would be ugly to push back branches and
sort out the master push later, but it doesn't obstruct the mission.)

Hm, doesn't it? I had the idea that "git push" is atomic --- either all
the per-branch commits succeed, or they all fail. I might be wrong.

Atomicity is good. I just meant that you could issue something like "git push
origin $(cd .git/refs/heads && ls REL*)" to defer the complaint about master.

#13Jesse Zhang
sbjesse@gmail.com
In reply to: Tom Lane (#11)
Re: run pgindent on a regular basis / scripted manner

On Wed, Aug 12, 2020 at 10:14 PM Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

... Another advantage of master-only is a guarantee against
disrupting time-critical patches. (It would be ugly to push back branches and
sort out the master push later, but it doesn't obstruct the mission.)

Hm, doesn't it? I had the idea that "git push" is atomic --- either all
the per-branch commits succeed, or they all fail. I might be wrong.

As of Git 2.28, atomic pushes are not turned on by default. That means
"git push my-remote foo bar" _can_ result in partial success. I'm that
paranoid who does "git push --atomic my-remote foo bar fizz".

Cheers,
Jesse

#14Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#9)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 6:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at

every

check-in:

1. After each push, an automated followup commit appears, restoring
pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what

the

committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
pgindent-cleanliness.

There's another option here as well, that is a bit "softer", is to use a
pre-commit hook.

That is, it's a hook that runs on the committers machine prior to the
commit. This hook can then yell "hey, you need to run pgindent before
committing this", but it gives the committer the ability to do --no-verify
and commit anyway (thus won't block things that are urgent).

Since it allows a simple bypass, and very much relies on the committer to
remember to install the hook in their local repository, this is not a
guarantee in any way. So it might need to be done together with something
else in the background doing like a daily job or so, but it might make that
background work be smaller and fewer changes.

This obviously only works in the case where we can rely on the committers
to remember to install such a hook, but given the few committers that we do
have, I think we can certainly get that up to an "acceptable rate" fairly
easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few
other repositories, to ensure python styleguides are followed.

Were you thinking of (2)?

I was objecting to (2). (1) would perhaps work. (3) could be pretty
darn annoying, especially if it blocks a time-critical security patch.

FWIW, I agree that (2) seems like a really bad option. In that suddenly a
committer has committed something they were not aware of.

Regarding typedefs.list, I would use the in-tree one, like you discussed

here:

Yeah, after thinking about that more, it seems like automated
typedefs.list updates would be far riskier than automated reindent
based on the existing typedefs.list. The latter could at least be
expected not to change code unrelated to the immediate commit.
typedefs.list updates need some amount of adult supervision.

(I'd still vote for nag-mail to the committer whose work got reindented,
in case the bot made things a lot worse.)

Yeah, I'm definitely not a big fan of automated commits, regardless of if
it's just indent or both indent+typedef. It's happened at least once, and I
think more than once, that we've had to basically hard reset the upstream
repository and clean things up after automated commits have gone bonkers
(hi, Bruce!). Having an automated system do the whole flow of
change->commit->push definitely invites this type of problem.

There are many solutions that do such things but that do it in the "github
workflow" way, which is they do change -> commit -> create pull request,
and then somebody eyeballs that pullrequest and approves it. We don't have
PRs, but we could either have a script that simply sends out a patch to a
mailinglist, or we could have a script that maintains a separate branch
which is auto-pgindented, and then have a committer squash-merge that
branch after having reviewed it.

Maybe something like that in combination with a pre-commit hook per above.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#15Stephen Frost
sfrost@snowman.net
In reply to: Magnus Hagander (#14)
Re: run pgindent on a regular basis / scripted manner

Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:

There's another option here as well, that is a bit "softer", is to use a
pre-commit hook.

Yeah, +1 on a pre-commit idea to help address this. I certainly agree
with Andres that it's quite annoying to deal with commits coming in that
aren't indented properly but are in some file that I'm working on.

This obviously only works in the case where we can rely on the committers
to remember to install such a hook, but given the few committers that we do
have, I think we can certainly get that up to an "acceptable rate" fairly
easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few
other repositories, to ensure python styleguides are followed.

Yeah, no guarantee, but definitely seems like it'd be a good
improvement.

I was objecting to (2). (1) would perhaps work. (3) could be pretty
darn annoying, especially if it blocks a time-critical security patch.

FWIW, I agree that (2) seems like a really bad option. In that suddenly a
committer has committed something they were not aware of.

Yeah, I dislike (2) a lot too.

Yeah, I'm definitely not a big fan of automated commits, regardless of if
it's just indent or both indent+typedef. It's happened at least once, and I
think more than once, that we've had to basically hard reset the upstream
repository and clean things up after automated commits have gone bonkers
(hi, Bruce!). Having an automated system do the whole flow of
change->commit->push definitely invites this type of problem.

Agreed, automated commits seems terribly risky.

There are many solutions that do such things but that do it in the "github
workflow" way, which is they do change -> commit -> create pull request,
and then somebody eyeballs that pullrequest and approves it. We don't have
PRs, but we could either have a script that simply sends out a patch to a
mailinglist, or we could have a script that maintains a separate branch
which is auto-pgindented, and then have a committer squash-merge that
branch after having reviewed it.

Maybe something like that in combination with a pre-commit hook per above.

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Thanks,

Stephen

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#15)
Re: run pgindent on a regular basis / scripted manner

Stephen Frost <sfrost@snowman.net> writes:

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Meh. Asking all submitters to install pgindent is a bit of a burden.
Moreover, sometimes it's better to provide a patch that deliberately
hasn't reindented existing code, for ease of review (say, when you're
adding if() { ... } around some big hunk of code). I think getting
committers to do this as part of commit is a better workflow.

(Admittedly, since I've been doing that for a long time, I don't
find it to be a burden. I suppose some committers do.)

regards, tom lane

#17Magnus Hagander
magnus@hagander.net
In reply to: Stephen Frost (#15)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 6:30 PM Stephen Frost <sfrost@snowman.net> wrote:

* Magnus Hagander (magnus@hagander.net) wrote:

There are many solutions that do such things but that do it in the

"github

workflow" way, which is they do change -> commit -> create pull request,
and then somebody eyeballs that pullrequest and approves it. We don't

have

PRs, but we could either have a script that simply sends out a patch to a
mailinglist, or we could have a script that maintains a separate branch
which is auto-pgindented, and then have a committer squash-merge that
branch after having reviewed it.

Maybe something like that in combination with a pre-commit hook per

above.

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Werll, that's one thing, but what I was thinking of here was more of an
automated branch maintained for the committers, not for the individual
patch owners.

That is:
1. Whenever a patch is pushed on master on the main repo a process kicked
off (or maybe wait 5 minutes to coalesce multiple patches if there are)
2. This process checks out master, and runs pgindent on it
3. When done, this gets committed to a new branch (or just overwrites an
existing branch of course, we don't need to maintain history here) like
"master-indented". This branch can be in a different repo, but one that
starts out as a clone of the main one
4. A committer (any committer) can then on regular basis examine the
differences by fetch + diff. If they're happy with it, cherry pick it in.
If not, figure out what needs to be done to adjust it.

Step 4 can be done at whatever interval we prefer, and we can have
something to nag us if head has been "off-indent" for too long.

This would be the backup for things that aren't indented during patch
commit, not other things.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#18Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#5)
Re: run pgindent on a regular basis / scripted manner

On Wed, Aug 12, 2020 at 7:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'm not in favor of unsupervised pgindent runs, really. It can do a lot
of damage to code that was written without thinking about it --- in
particular, it'll make a hash of comment blocks that were manually
formatted and not protected with dashes.

No committer should be committing code without thinking about
pgindent. If some are, they need to up their game.

I am not sure whether weekly or after-every-commit pgindent runs is a
good idea, but I think we should try to do it once a month or so. It's
too annoying otherwise. I could go either way on the question of
automation.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#19Robert Haas
robertmhaas@gmail.com
In reply to: Stephen Frost (#15)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:

So, in our world, wouldn't this translate to 'make cfbot complain'?

This seems like it would be useful, but we'd have to figure out what
to do about typedefs.list. If the patch is indented with the current
one (which is auto-generated by the entire build farm, remember) it's
likely to mess up a patch that's otherwise properly formatted. We'd
either need to insist that people include updates to typedefs.list in
the patch, or else have the cfbot take a stab at doing those updates
itself.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#18)
Re: run pgindent on a regular basis / scripted manner

Robert Haas <robertmhaas@gmail.com> writes:

I am not sure whether weekly or after-every-commit pgindent runs is a
good idea, but I think we should try to do it once a month or so. It's
too annoying otherwise. I could go either way on the question of
automation.

The traditional reason for not doing pgindent too often has been that
it'd cause more work for people who have to rebase their patches over
pgindent's results. If we want to do it more often, then in order to
respond to that concern, I think we need to do it really often ---
not necessarily quite continuously, but often enough that pgindent
is only changing recently-committed code. In this way, it'd be likely
that anyone with a patch touching that same code would only need to
rebase once not twice. The approaches involving an automated run
give a guarantee of that, otherwise we don't have a guarantee; but
as long as it's not many days delay I think it wouldn't be bad.

Intervals on the order of a month seem likely to be the worst of
both worlds from this standpoint --- too long for people to wait
before rebasing their patch, yet short enough that they'd have
to do so repeatedly.

regards, tom lane

#21Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#20)
Re: run pgindent on a regular basis / scripted manner

On Thu, Aug 13, 2020 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

The traditional reason for not doing pgindent too often has been that
it'd cause more work for people who have to rebase their patches over
pgindent's results. If we want to do it more often, then in order to
respond to that concern, I think we need to do it really often ---
not necessarily quite continuously, but often enough that pgindent
is only changing recently-committed code. In this way, it'd be likely
that anyone with a patch touching that same code would only need to
rebase once not twice. The approaches involving an automated run
give a guarantee of that, otherwise we don't have a guarantee; but
as long as it's not many days delay I think it wouldn't be bad.

Intervals on the order of a month seem likely to be the worst of
both worlds from this standpoint --- too long for people to wait
before rebasing their patch, yet short enough that they'd have
to do so repeatedly.

Yeah, I get the point. It depends somewhat on how often you think
people will rebase. The main thing against more frequent pgindent runs
is that it clutters the history. If done manually, it's also a lot of
work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#22Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#16)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2020-08-13 12:50:16 -0400, Tom Lane wrote:

Stephen Frost <sfrost@snowman.net> writes:

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Meh. Asking all submitters to install pgindent is a bit of a burden.

+1. We could improve on that by slurping it into src/tools though. If
there were a 'make patchprep' target, it'd be a lot more realistic.

But even so, it'd robably further inrease the rate of needing to
constantly rebase lingering patches if cfbot considered indentation
issues failures. E.g. because of typedefs.list updates etc. So I'm
against that, even if we had a patchprep target that didn't need
external tools.

Greetings,

Andres Freund

#23Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#19)
Re: run pgindent on a regular basis / scripted manner

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:

On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:

So, in our world, wouldn't this translate to 'make cfbot complain'?

This seems like it would be useful, but we'd have to figure out what
to do about typedefs.list. If the patch is indented with the current
one (which is auto-generated by the entire build farm, remember) it's
likely to mess up a patch that's otherwise properly formatted. We'd
either need to insist that people include updates to typedefs.list in
the patch, or else have the cfbot take a stab at doing those updates
itself.

For my 2c, anyway, I like the idea of having folks update the typedefs
themselves when they've got a patch that needs a new typedef to be
indented correctly. Having cfbot try to do that seems unlikely to work
well.

I also didn't mean to imply that we'd push back and ask for a rebase due
to indentation changes, but at the same time, I question if it's really
that realistic a concern- either whomever posted the patch ran pgindent
on it, or they didn't, and I doubt cfbot's check of that would change
without there being a conflict between the patch and something that got
committed anyway.

I also disagree that it's that much of a burden to ask people who are
already hacking on PG to install pgindent.

All that said, seems that others feel differently and while I still
think it's a pretty reasonable idea to have cfbot check, if no one
agrees with me, that's fine too. Having the pre-commit hook would help
with the downstream issue of pgindent pain from unrelated incorrect
indentation, so at least dealing with the patch author not properly
indenting to start with would be just on the bits the patch is already
modifying, which is a lot better.

Thanks,

Stephen

#24Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Stephen Frost (#23)
Re: run pgindent on a regular basis / scripted manner

On 2020-Aug-13, Stephen Frost wrote:

For my 2c, anyway, I like the idea of having folks update the typedefs
themselves when they've got a patch that needs a new typedef to be
indented correctly.

Well, let's for starters encourage committers to update typedefs.
Personally I've stayed away from it for each commit just because we've
historically not done it, but I can easily change that. Plus, it cannot
harm.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Magnus Hagander (#17)
Re: run pgindent on a regular basis / scripted manner

On 2020-Aug-13, Magnus Hagander wrote:

That is:
1. Whenever a patch is pushed on master on the main repo a process kicked
off (or maybe wait 5 minutes to coalesce multiple patches if there are)
2. This process checks out master, and runs pgindent on it
3. When done, this gets committed to a new branch (or just overwrites an
existing branch of course, we don't need to maintain history here) like
"master-indented". This branch can be in a different repo, but one that
starts out as a clone of the main one
4. A committer (any committer) can then on regular basis examine the
differences by fetch + diff. If they're happy with it, cherry pick it in.
If not, figure out what needs to be done to adjust it.

Sounds good -- for branch master.

Yesterday I tried to indent some patch across all branches, only to
discover that I'm lacking the pg_bsd_indent necessary for the older
ones. I already have two, but apparently I'd need *four* different
versions with current branches (1.3, 2.0, 2.1, 2.1.1)

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#26Amit Kapila
amit.kapila16@gmail.com
In reply to: Alvaro Herrera (#25)
Re: run pgindent on a regular basis / scripted manner

On Sat, Aug 15, 2020 at 1:57 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2020-Aug-13, Magnus Hagander wrote:

That is:
1. Whenever a patch is pushed on master on the main repo a process kicked
off (or maybe wait 5 minutes to coalesce multiple patches if there are)
2. This process checks out master, and runs pgindent on it
3. When done, this gets committed to a new branch (or just overwrites an
existing branch of course, we don't need to maintain history here) like
"master-indented". This branch can be in a different repo, but one that
starts out as a clone of the main one
4. A committer (any committer) can then on regular basis examine the
differences by fetch + diff. If they're happy with it, cherry pick it in.
If not, figure out what needs to be done to adjust it.

Sounds good -- for branch master.

Yesterday I tried to indent some patch across all branches, only to
discover that I'm lacking the pg_bsd_indent necessary for the older
ones. I already have two, but apparently I'd need *four* different
versions with current branches (1.3, 2.0, 2.1, 2.1.1)

FWIW, for back-branches, I just do similar to what Tom said above [1]/messages/by-id/397020.1597291716@sss.pgh.pa.us
("My own habit when back-patching has been to indent the HEAD patch
per-current-rules and then preserve that layout as much as possible in
the back branches"). If we want we can maintain all the required
versions of pg_bsd_indent but as of now, I am not doing so and thought
that following some approximation rule (do it for HEAD and try my best
to maintain the layout for back-branches) is good enough.

[1]: /messages/by-id/397020.1597291716@sss.pgh.pa.us

--
With Regards,
Amit Kapila.

#27Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Andres Freund (#1)
Re: run pgindent on a regular basis / scripted manner

On 2020-08-13 00:34, Andres Freund wrote:

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.

Do we have a sense of why poorly-indented code gets committed? I think
some of the indentation rules are hard to follow manually. (pgperltidy
is worse.)

Also, since pgindent gets run eventually anyway, it's not really that
important to get the indentation right the first time. I suspect the
goal of most authors and committers is to write readable code rather
than to divine the exact pgindent output.

I think as a start, we could just issue a guidelines that all committed
code should follow pgindent. That has never really been a guideline, so
it's not surprising that it's not followed.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#28Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#27)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2020-08-15 13:47:41 +0200, Peter Eisentraut wrote:

On 2020-08-13 00:34, Andres Freund wrote:

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.

Do we have a sense of why poorly-indented code gets committed? I think some
of the indentation rules are hard to follow manually. (pgperltidy is
worse.)

Also, since pgindent gets run eventually anyway, it's not really that
important to get the indentation right the first time. I suspect the goal
of most authors and committers is to write readable code rather than to
divine the exact pgindent output.

One thing is that some here are actively against manually adding entries
to typedefs.list. Which then means that code gets oddly indented if you
use pgindent. I personally try to make the predictable updates to
typedefs.list, which then at least allows halfway sensibly indenting my
own changes.

I think as a start, we could just issue a guidelines that all committed code
should follow pgindent. That has never really been a guideline, so it's not
surprising that it's not followed.

Without a properly indented baseline that's hard to do, because it'll
cause damage all over. So I don't think we easily can start just there -
we'd first need to re-indent everything.

Greetings,

Andres Freund

#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#28)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

One thing is that some here are actively against manually adding entries
to typedefs.list.

I've been of the opinion that it's pointless to do so under the current
regime where (a) only a few people do that and (b) we only officially
re-indent once a year anyway. When I want to manually run pgindent,
I always pull down a fresh typedefs.list from the buildfarm, which is
reasonably up-to-date regardless of what people added or didn't add,
and then add any new typedefs from my current patch to that out-of-tree
copy.

Now, if we switch to a regime where we're trying to keep the tree in
more nearly correctly-indented shape all the time, it would make sense
to revisit that. I'm not saying that it's unreasonable to want to have
the in-tree typedefs.list track reality more closely --- only that doing
so in a half-baked way won't be very helpful.

I think as a start, we could just issue a guidelines that all committed code
should follow pgindent. That has never really been a guideline, so it's not
surprising that it's not followed.

Without a properly indented baseline that's hard to do, because it'll
cause damage all over. So I don't think we easily can start just there -
we'd first need to re-indent everything.

Well, we can certainly do a tree-wide re-indent anytime we're ready.
I doubt it would be very painful right now, with so little new work
since the last run.

regards, tom lane

#30Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#29)
Re: run pgindent on a regular basis / scripted manner

On Sat, Aug 15, 2020 at 01:44:34PM -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Without a properly indented baseline that's hard to do, because it'll
cause damage all over. So I don't think we easily can start just there -
we'd first need to re-indent everything.

Well, we can certainly do a tree-wide re-indent anytime we're ready.
I doubt it would be very painful right now, with so little new work
since the last run.

Uh, I thought Tom was saying we need to reindent all branches, which
would be a big change for those tracking forks.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

#31Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#30)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2020-08-17 13:54:15 -0400, Bruce Momjian wrote:

On Sat, Aug 15, 2020 at 01:44:34PM -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Without a properly indented baseline that's hard to do, because it'll
cause damage all over. So I don't think we easily can start just there -
we'd first need to re-indent everything.

Well, we can certainly do a tree-wide re-indent anytime we're ready.
I doubt it would be very painful right now, with so little new work
since the last run.

Uh, I thought Tom was saying we need to reindent all branches, which
would be a big change for those tracking forks.

I don't think he said that? He said *if* we were to reindent all
branches, forks would probably have an issue. We're already reindenting
HEAD on a regular basis (just very infrequent), so it can't be a
fundamental issue.

Greetings,

Andres Freund

#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#30)
Re: run pgindent on a regular basis / scripted manner

Bruce Momjian <bruce@momjian.us> writes:

On Sat, Aug 15, 2020 at 01:44:34PM -0400, Tom Lane wrote:

Well, we can certainly do a tree-wide re-indent anytime we're ready.
I doubt it would be very painful right now, with so little new work
since the last run.

Uh, I thought Tom was saying we need to reindent all branches, which
would be a big change for those tracking forks.

No, I'm not for reindenting the back branches in general. There was
some discussion about whether to try to indent back-patched patches
to meet the conventions of the specific back branch, but I doubt
that that's worth troubling over. I think we really only care
about whether HEAD is fully consistently indented.

regards, tom lane

#33Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#29)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2020-08-15 13:44:34 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

One thing is that some here are actively against manually adding entries
to typedefs.list.

I've been of the opinion that it's pointless to do so under the current
regime where (a) only a few people do that and (b) we only officially
re-indent once a year anyway. When I want to manually run pgindent,
I always pull down a fresh typedefs.list from the buildfarm, which is
reasonably up-to-date regardless of what people added or didn't add,
and then add any new typedefs from my current patch to that out-of-tree
copy.

Well, properly indenting new code still is worthwhile. And once you go
through the trouble of adding the typedefs locally, I don't really see
the reason not to also include them in the commit. Sure it'll not help
much with tree-wide re-indents, but with individual files it can still
make life less painful.

Greetings,

Andres Freund

#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#33)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2020-08-15 13:44:34 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

One thing is that some here are actively against manually adding entries
to typedefs.list.

I've been of the opinion that it's pointless to do so under the current
regime where (a) only a few people do that and (b) we only officially
re-indent once a year anyway. When I want to manually run pgindent,
I always pull down a fresh typedefs.list from the buildfarm, which is
reasonably up-to-date regardless of what people added or didn't add,
and then add any new typedefs from my current patch to that out-of-tree
copy.

Well, properly indenting new code still is worthwhile. And once you go
through the trouble of adding the typedefs locally, I don't really see
the reason not to also include them in the commit.

Yeah, I'm quite religious about making sure my commits have been through
pgindent already (mainly to reduce subsequent "git blame" noise).
However, relying on manual updates to the in-tree typedefs.list only
works if every committer is equally religious about it. They're not;
else we'd not be having this discussion. The workflow I describe above
is not dependent on how careful everybody else is, which is why I
prefer it.

I think that the main new idea that's come out of this thread so far
is that very frequent reindents would be as good, maybe better, as
once-a-year reindents in terms of the amount of rebasing pain they
cause for not-yet-committed patches. If we can fix it so that any
mis-indented commit is quickly corrected, then rebasing would only be
needed in places that were changed anyway. So it seems like that
would be OK as a new project policy if we can make it happen.

However, I don't see any way to make it happen like that without
more-or-less automated reindents and typedefs.list updates,
and that remains a bit scary.

I did just have an idea that seems to ameliorate the scariness
a good bit: allow the reindent bot to auto-commit its results
only if the only lines it's changing are ones that were touched
by commits since the last auto-reindent. Otherwise punt and ask
a human to review the results. Not sure how hard that is to
implement, though.

Another good safety check would be to not proceed unless the latest
typedefs list available from the buildfarm is newer than the last
commit --- then we won't mess up recent commits whose typedefs are
not in the list yet.

regards, tom lane

#35Magnus Hagander
magnus@hagander.net
In reply to: Alvaro Herrera (#25)
Re: run pgindent on a regular basis / scripted manner

On Fri, Aug 14, 2020 at 10:26 PM Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

On 2020-Aug-13, Magnus Hagander wrote:

That is:
1. Whenever a patch is pushed on master on the main repo a process kicked
off (or maybe wait 5 minutes to coalesce multiple patches if there are)
2. This process checks out master, and runs pgindent on it
3. When done, this gets committed to a new branch (or just overwrites an
existing branch of course, we don't need to maintain history here) like
"master-indented". This branch can be in a different repo, but one that
starts out as a clone of the main one
4. A committer (any committer) can then on regular basis examine the
differences by fetch + diff. If they're happy with it, cherry pick it in.
If not, figure out what needs to be done to adjust it.

Sounds good -- for branch master.

So mostly for testing, I've set up a job that does this.

Basically it runs every 15 minutes and if there is a new commit on master
it will rebase onto the latest master and run pgindent on it. This then
gets pushed up to a separate repo (postgresql-pgindent.git on
git.postgresql.org), and can be viewed there.

To see the current state of pgindent, view:
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=commitdiff;h=master-pgindent

(or decorate as wanted to see for example a raw patch format)

If a committer wants to use it directly, just "git remote add" the
postgresql-pgindent.git and then cherry-pick the branch tip into your own
repository, and push. Well, actually that will right now fail because the
commit is made by "Automatic pgindent" which is not an approved committer,
but if we want to do this as a more regular thing, we can certainly fix
that.

Note that the only thing the job does is run pgindent. It does not attempt
to do anything with the typedef list at this point.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#36Jelte Fennema
postgres@jeltef.nl
In reply to: Magnus Hagander (#35)
Re: run pgindent on a regular basis / scripted manner

As a not very frequent submitter, on all the patches that I submit I
keep running into this problem. I have two open patchsets for
libpq[1]https://commitfest.postgresql.org/41/3511/[2]https://commitfest.postgresql.org/41/3679/ both of which currently include the same initial "run
pgindent" patch in addition to the actual patch, just so I can
actually run it on my own patch because a9e9a9f caused formatting to
totally be off in the libpq directory. And there's lots of other
changes that pgindent wants to make, which are visible on the job that
Magnus has set up.

To me a master branch that pgindent never complains about sounds
amazing! And I personally think rejection of unindented pushes and
cfbot complaining about unindented patches would be a very good thing,
because that seems to be the only solution that could achieve that.

Having cfbot complain also doesn't sound like a crazy burden for
submitters. Many open-source projects have CI complaining if code
formatting does not pass automatic formatting tools. As long as there
is good documentation on how to install and run pgindent I don't think
it should be a big problem. A link to those docs could even be
included in the failing CI job its error message. A pre-commit hook
that submitters/committers could install would be super useful too.
Since right now I sometimes forget to run pgindent, especially since
there's no editor integration (that I know of) for pgindent.

Side-question: What's the reason why pgindent is used instead of some
more "modern" code formatter that doesn't require keeping
typedefs.list up to date for good looking output? (e.g. uncrustify or
clang-format) Because that would also allow for easy editor
integration.

[1]: https://commitfest.postgresql.org/41/3511/
[2]: https://commitfest.postgresql.org/41/3679/

#37Bruce Momjian
bruce@momjian.us
In reply to: Jelte Fennema (#36)
Re: run pgindent on a regular basis / scripted manner

On Fri, Jan 20, 2023 at 10:43:50AM +0100, Jelte Fennema wrote:

Side-question: What's the reason why pgindent is used instead of some
more "modern" code formatter that doesn't require keeping
typedefs.list up to date for good looking output? (e.g. uncrustify or
clang-format) Because that would also allow for easy editor
integration.

Good question. Our last big pgindent dicussion was in 2017, where I
said:

/messages/by-id/20170612213525.GA4074@momjian.us

Understood. You would think that with the number of open-source
programs written in C that there would be more interest in C formatting
tools. Is the Postgres community the only ones with specific
requirements, or is it just that we settled on an older tool and can't
easily change? I have reviewed the C formatting options a few times
over the years and every time the other options were worse than what we
had.

We also discussed it in 2011, and this email was key for me:

/messages/by-id/201106220218.p5M2InB08144@momjian.us

I am excited Andrew has done this. It has been on my TODO list for a
while --- I was hoping someday we could switch to GNU indent but gave up
after the GNU indent report from Greg Stark that exactly matched my
experience years ago:

http://archives.postgresql.org/pgsql-hackers/2011-04/msg01436.php

Basically, GNU indent has new bugs, but bugs that are harder to work
around than the BSD indent bugs.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#36)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

To me a master branch that pgindent never complains about sounds
amazing! And I personally think rejection of unindented pushes and
cfbot complaining about unindented patches would be a very good thing,
because that seems to be the only solution that could achieve that.

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.
Maybe that's not a huge lift, but it'd be YA obstacle to new contributors,
and we don't need any more of those.

Yeah, if we switched to some other tool maybe we could reduce the size
of that problem. But as Bruce replied, we've not found another one that
(a) can be coaxed to make output comparable to what we're accustomed to
and (b) seems decently well maintained.

regards, tom lane

#39Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#38)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

Jelte Fennema <postgres@jeltef.nl> writes:

To me a master branch that pgindent never complains about sounds
amazing! And I personally think rejection of unindented pushes and
cfbot complaining about unindented patches would be a very good thing,
because that seems to be the only solution that could achieve that.

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Maybe that's not a huge lift, but it'd be YA obstacle to new contributors,
and we don't need any more of those.

Yeah, if we switched to some other tool maybe we could reduce the size
of that problem. But as Bruce replied, we've not found another one that
(a) can be coaxed to make output comparable to what we're accustomed to
and (b) seems decently well maintained.

One question around this is how much change we'd accept. clang-format for
example is well maintained and can get somewhat close to our style - but
there are things that can't easily be approximated.

Greetings,

Andres Freund

#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#39)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

Yeah, if we switched to some other tool maybe we could reduce the size
of that problem. But as Bruce replied, we've not found another one that
(a) can be coaxed to make output comparable to what we're accustomed to
and (b) seems decently well maintained.

One question around this is how much change we'd accept. clang-format for
example is well maintained and can get somewhat close to our style - but
there are things that can't easily be approximated.

If somebody wants to invest the effort in seeing how close we can get
and what the remaining delta would look like, we could have a discussion
about whether it's an acceptable change. I don't think anyone has
tried with clang-format.

regards, tom lane

#41Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#40)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-20 Fr 13:19, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

I'll take a look at fleshing this out, for my own if no-one else's use.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#42Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#41)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:

On 2023-01-20 Fr 13:19, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

I'll take a look at fleshing this out, for my own if no-one else's use.

Here's a quick patch for this. I have it in mind to use like this in a
pre-commit hook:

# only do this on master
test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0

src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \

  { echo "Need a pgindent run" >&2 ; exit 1; }

The committer could then run

src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`

to see what changes it thinks are needed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#43Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#42)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-21 Sa 10:00, Andrew Dunstan wrote:

On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:

On 2023-01-20 Fr 13:19, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

I'll take a look at fleshing this out, for my own if no-one else's use.

Here's a quick patch for this. I have it in mind to use like this in a
pre-commit hook:

# only do this on master
test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0

src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \

  { echo "Need a pgindent run" >&2 ; exit 1; }

The committer could then run

src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`

to see what changes it thinks are needed.

This time with patch.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pgindent_silent_plus_showdiff.patchtext/x-patch; charset=UTF-8; name=pgindent_silent_plus_showdiff.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 741b0ccb58..2834f5c9fc 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -21,7 +21,8 @@ my $indent_opts =
 
 my $devnull = File::Spec->devnull;
 
-my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build);
+my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build,
+	$show_diff, $silent_diff);
 
 my %options = (
 	"typedefs=s"         => \$typedefs_file,
@@ -29,9 +30,14 @@ my %options = (
 	"code-base=s"        => \$code_base,
 	"excludes=s"         => \$excludes,
 	"indent=s"           => \$indent,
-	"build"              => \$build,);
+	"build"              => \$build,
+    "show-diff"          => \$show_diff,
+    "silent_diff"        => \$silent_diff,);
 GetOptions(%options) || die "bad command line argument\n";
 
+die "Cannot have both --silent-diff and --show-diff"
+  if $silent_diff && $show_diff;
+
 run_build($code_base) if ($build);
 
 # command line option wins, then first non-option arg,
@@ -283,30 +289,19 @@ sub run_indent
 
 }
 
-
-# for development diagnostics
-sub diff
+sub show_diff
 {
-	my $pre   = shift;
-	my $post  = shift;
-	my $flags = shift || "";
+	my $indented   = shift;
+	my $source_filename  = shift;
 
-	print STDERR "running diff\n";
-
-	my $pre_fh  = new File::Temp(TEMPLATE => "pgdiffbXXXXX");
 	my $post_fh = new File::Temp(TEMPLATE => "pgdiffaXXXXX");
 
-	print $pre_fh $pre;
-	print $post_fh $post;
+	print $post_fh $indented;
 
-	$pre_fh->close();
 	$post_fh->close();
 
-	system( "diff $flags "
-		  . $pre_fh->filename . " "
-		  . $post_fh->filename
-		  . " >&2");
-	return;
+	my $diff = `diff -u $source_filename $post_fh->filename  2>&1`);
+	return $diff;
 }
 
 
@@ -404,6 +399,8 @@ push(@files, @ARGV);
 
 foreach my $source_filename (@files)
 {
+	# ignore anything that's not a .c or .h file
+	next unless $source_filename =~/\.[ch]$/;
 
 	# Automatically ignore .c and .h files that correspond to a .y or .l
 	# file.  indent tends to get badly confused by Bison/flex output,
@@ -429,7 +426,24 @@ foreach my $source_filename (@files)
 
 	$source = post_indent($source, $source_filename);
 
-	write_source($source, $source_filename) if $source ne $orig_source;
+	if ($source ne $orig_source)
+	{
+		if ($silent_diff)
+		{
+			exit 2;
+		}
+		elsif ($show_diff)
+		{
+			print show_diff($source, $source_filename);
+		}
+		else
+		{
+			write_source($source, $source_filename);
+		}
+	}
+
 }
 
 build_clean($code_base) if $build;
+
+exit 0;
#44Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#43)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-21 Sa 10:02, Andrew Dunstan wrote:

On 2023-01-21 Sa 10:00, Andrew Dunstan wrote:

On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:

On 2023-01-20 Fr 13:19, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

I'll take a look at fleshing this out, for my own if no-one else's use.

Here's a quick patch for this. I have it in mind to use like this in a
pre-commit hook:

# only do this on master
test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0

src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \

  { echo "Need a pgindent run" >&2 ; exit 1; }

The committer could then run

src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`

to see what changes it thinks are needed.

This time with patch.

... with typo fixed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pgindent_silent_plus_showdiff-2.patchtext/x-patch; charset=UTF-8; name=pgindent_silent_plus_showdiff-2.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 741b0ccb58..af0ddd1944 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -21,7 +21,8 @@ my $indent_opts =
 
 my $devnull = File::Spec->devnull;
 
-my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build);
+my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build,
+	$show_diff, $silent_diff);
 
 my %options = (
 	"typedefs=s"         => \$typedefs_file,
@@ -29,9 +30,14 @@ my %options = (
 	"code-base=s"        => \$code_base,
 	"excludes=s"         => \$excludes,
 	"indent=s"           => \$indent,
-	"build"              => \$build,);
+	"build"              => \$build,
+    "show-diff"          => \$show_diff,
+    "silent_diff"        => \$silent_diff,);
 GetOptions(%options) || die "bad command line argument\n";
 
+die "Cannot have both --silent-diff and --show-diff"
+  if $silent_diff && $show_diff;
+
 run_build($code_base) if ($build);
 
 # command line option wins, then first non-option arg,
@@ -283,30 +289,19 @@ sub run_indent
 
 }
 
-
-# for development diagnostics
-sub diff
+sub show_diff
 {
-	my $pre   = shift;
-	my $post  = shift;
-	my $flags = shift || "";
+	my $indented   = shift;
+	my $source_filename  = shift;
 
-	print STDERR "running diff\n";
-
-	my $pre_fh  = new File::Temp(TEMPLATE => "pgdiffbXXXXX");
 	my $post_fh = new File::Temp(TEMPLATE => "pgdiffaXXXXX");
 
-	print $pre_fh $pre;
-	print $post_fh $post;
+	print $post_fh $indented;
 
-	$pre_fh->close();
 	$post_fh->close();
 
-	system( "diff $flags "
-		  . $pre_fh->filename . " "
-		  . $post_fh->filename
-		  . " >&2");
-	return;
+	my $diff = `diff -u $source_filename $post_fh->filename  2>&1`;
+	return $diff;
 }
 
 
@@ -404,6 +399,8 @@ push(@files, @ARGV);
 
 foreach my $source_filename (@files)
 {
+	# ignore anything that's not a .c or .h file
+	next unless $source_filename =~/\.[ch]$/;
 
 	# Automatically ignore .c and .h files that correspond to a .y or .l
 	# file.  indent tends to get badly confused by Bison/flex output,
@@ -429,7 +426,24 @@ foreach my $source_filename (@files)
 
 	$source = post_indent($source, $source_filename);
 
-	write_source($source, $source_filename) if $source ne $orig_source;
+	if ($source ne $orig_source)
+	{
+		if ($silent_diff)
+		{
+			exit 2;
+		}
+		elsif ($show_diff)
+		{
+			print show_diff($source, $source_filename);
+		}
+		else
+		{
+			write_source($source, $source_filename);
+		}
+	}
+
 }
 
 build_clean($code_base) if $build;
+
+exit 0;
#45Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#44)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

That sounds like a good idea from here. I do not think we want a
mandatory commit filter, but if individual committers care to work
this into their process in some optional way, great! I can think
of ways I'd use it during patch development, too.

(One reason not to want a mandatory filter is that you might wish
to apply pgindent as a separate commit, so that you can then
put that commit into .git-blame-ignore-revs. This could be handy
for example when a patch needs to change the nesting level of a lot
of pre-existing code, without making changes in it otherwise.)

This time with patch.

... with typo fixed.

Looks reasonable, but you should also update
src/tools/pgindent/pgindent.man, which AFAICT is our only
documentation for pgindent switches. (Is it time for a
--help option in pgindent?)

regards, tom lane

#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#45)
Re: run pgindent on a regular basis / scripted manner

... btw, can we get away with making the diff run be "diff -upd"
not just "diff -u"? I find diff output for C files noticeably
more useful with those options, but I'm unsure about their
portability.

regards, tom lane

#47Bruce Momjian
bruce@momjian.us
In reply to: Jelte Fennema (#36)
Re: run pgindent on a regular basis / scripted manner

On Fri, Jan 20, 2023 at 10:43:50AM +0100, Jelte Fennema wrote:

Side-question: What's the reason why pgindent is used instead of some
more "modern" code formatter that doesn't require keeping
typedefs.list up to date for good looking output? (e.g. uncrustify or
clang-format) Because that would also allow for easy editor
integration.

One reason the typedef list is required is a quirk of the C syntax.
Most languages have a lexer/scanner, which tokenizes, and a parser,
which parses. The communication is usually one-way, lexer to parser.
For C, typedefs require the parser to feed new typedefs back into the
lexer:

http://calculist.blogspot.com/2009/02/c-typedef-parsing-problem.html

BSD indent doesn't have that feedback mechanism, probably because it
doesn't fully parse the C file. Therefore, we have to supply typedefs
manually, and for Postgres we pull them from debug-enabled binaries in
our buildfarm. The problem with that is you often import typedefs from
system headers, and the typedefs apply to all C files, not just the ones
were the typdefs are visible.

I don't see uncrustify or clang-format supporting typedef lists so maybe
they implemented this feedback loop. It would be good to see if we can
get either of these tools to match our formatting.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

In reply to: Bruce Momjian (#47)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 9:30 AM Bruce Momjian <bruce@momjian.us> wrote:

I don't see uncrustify or clang-format supporting typedef lists so maybe
they implemented this feedback loop. It would be good to see if we can
get either of these tools to match our formatting.

I personally use clang-format for Postgres development work, since it
integrates nicely with my text editor, and can be configured to
produce approximately the same result as pgindent (certainly good
enough when working on a patch that's still far from a committable
state). I'm fairly sure that clang-format has access to a full AST
from the clang compiler, which is the ideal approach - at least in
theory.

In practice this approach tends to run into problems when the relevant
AST isn't available. For example, if there's code that only builds on
Windows, maybe it won't work at all (at least on my Linux system).
This doesn't really bother me currently, since I only rely on
clang-format as a first pass sort of thing. Maybe I could figure out a
better way to deal with such issues, but right now I don't have much
incentive to.

Another advantage of clang-format is that it's a known quantity. For
example there is direct support for it built into meson, with bells
and whistles such as CI support:

https://mesonbuild.com/Code-formatting.html

My guess is that moving to clang-format would require giving up some
flexibility, but getting much better integration with text editors and
tools like meson in return. It would probably make it practical to
have much stronger rules about how committed code must be indented --
rules that are practical, and can actually be enforced. That trade-off
seems likely to be worth it in my view, though it's not something that
I feel too strongly about.

--
Peter Geoghegan

#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#48)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

In practice this approach tends to run into problems when the relevant
AST isn't available. For example, if there's code that only builds on
Windows, maybe it won't work at all (at least on my Linux system).

Hmm, that could be a deal-breaker. It's not going to be acceptable
to have to pgindent different parts of the system on different platforms
... at least not unless we can segregate them on the file level, and
even that would have a large PITA factor.

Still, we won't know unless someone makes a serious experiment with it.

regards, tom lane

In reply to: Tom Lane (#49)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 12:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm, that could be a deal-breaker. It's not going to be acceptable
to have to pgindent different parts of the system on different platforms
... at least not unless we can segregate them on the file level, and
even that would have a large PITA factor.

It's probably something that could be worked around. My remarks are
based on some dim memories of dealing with the tool before I arrived
at a configuration that works well enough for me. Importantly,
clang-format doesn't require you to futz around with Makefiles or
objdump or anything like that -- that's a huge plus. It doesn't seem
to impose any requirements on how I build Postgres at all (I generally
use gcc, not clang).

Even if these kinds of issues proved to be a problem for the person
tasked with running clang-format against the whole tree periodically,
they still likely wouldn't affect most of us. It's quite convenient to
use clang-format from an editor -- it can be invoked very
incrementally, against a small range of lines at a time. It's pretty
much something that I can treat like the built-in indent for my
editor. It's vastly different to the typical pgindent workflow.

Still, we won't know unless someone makes a serious experiment with it.

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Of course the way that I want to do things is (almost by definition)
the pgindent way, at least right now -- it's not necessarily about my
fixed preferences (though it can be hard to tell!). It's really not
surprising that clang-format cannot quite perfectly simulate pgindent.
How flexible can we be about stuff like that? Obviously there is no
clear answer right now.

--
Peter Geoghegan

#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#50)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

Of course the way that I want to do things is (almost by definition)
the pgindent way, at least right now -- it's not necessarily about my
fixed preferences (though it can be hard to tell!). It's really not
surprising that clang-format cannot quite perfectly simulate pgindent.
How flexible can we be about stuff like that? Obviously there is no
clear answer right now.

I don't feel wedded to every last detail of what pgindent does (and
especially not the bugs). But I think if the new tool is not a pretty
close match we'll be in for years of back-patching pain. We have made
changes in pgindent itself in the past, and the patching consequences
weren't *too* awful, but the changes weren't very big either.

As I said upthread, this is really impossible to answer without a
concrete proposal of how to configure clang-format and a survey of
what diffs we'd wind up with.

regards, tom lane

#52Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#50)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-21 14:05:41 -0800, Peter Geoghegan wrote:

On Sat, Jan 21, 2023 at 12:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm, that could be a deal-breaker. It's not going to be acceptable
to have to pgindent different parts of the system on different platforms
... at least not unless we can segregate them on the file level, and
even that would have a large PITA factor.

Unless I miss something, I don't think clang-format actually does that level
of C parsing - you can't pass include paths etc, so it really can't.

It's probably something that could be worked around. My remarks are
based on some dim memories of dealing with the tool before I arrived
at a configuration that works well enough for me.

Could you share your .clang-format?

Still, we won't know unless someone makes a serious experiment with it.

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Of course the way that I want to do things is (almost by definition)
the pgindent way, at least right now -- it's not necessarily about my
fixed preferences (though it can be hard to tell!). It's really not
surprising that clang-format cannot quite perfectly simulate pgindent.
How flexible can we be about stuff like that? Obviously there is no
clear answer right now.

I personally find the current indentation of variables assignment deeply
unhelpful - but changing it would be a very noisy change.

Greetings,

Andres Freund

#53Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#51)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-21 17:20:35 -0500, Tom Lane wrote:

I don't feel wedded to every last detail of what pgindent does (and
especially not the bugs). But I think if the new tool is not a pretty
close match we'll be in for years of back-patching pain. We have made
changes in pgindent itself in the past, and the patching consequences
weren't *too* awful, but the changes weren't very big either.

Perhaps we could backpatch formatting changes in a way that doesn't
inconvenience forks too much. One way to deal with such changes is to

1) revert the re-indent commits in $backbranch
2) merge $backbranch-with-revert into $forkbranch
3) re-indent $forkbranch

After that future changes should be mergable again.

Certainly doesn't do away with the pain entirely, but it does make it perhaps
bearable

Greetings,

Andres Freund

In reply to: Andres Freund (#52)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 2:43 PM Andres Freund <andres@anarazel.de> wrote:

Unless I miss something, I don't think clang-format actually does that level
of C parsing - you can't pass include paths etc, so it really can't.

It's hard to keep track of, since I also use clangd, which is
influenced by .clang-format for certain completions. It clearly does
plenty of stuff that requires an AST, since it requires a
compile_commands.json. You're the LLVM committer, not me.

Attached is my .clang-format, since you asked for it. It was
originally based on stuff that both you and Peter E posted several
years back, I believe. Plus the timescaledb one in one or two places.
I worked a couple of things out through trial and error. It's
relatively hard to follow the documentation, and there have been
features added to newer LLVM versions.

--
Peter Geoghegan

Attachments:

clang-formatapplication/octet-stream; name=clang-formatDownload
#55Jelte Fennema
postgres@jeltef.nl
In reply to: Andres Freund (#53)
2 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Attached are a .clang-format file and an uncrustify.cfg file that I
whipped up to match the current style at least somewhat.

I was unable to get either of them to produce the weird alignment of
declarations that pgindent outputs. Maybe that's just because I don't
understand what this alignment is supposed to do. Because to me the
current alignment seems completely random most of the time (and like
Andres said also not very unhelpful). For clang-format you should use
at least clang-format 15, otherwise it has some bugs in the alignment
logic.

One thing that clang-format really really wants to change in the
codebase, is making it consistent on what to do when
arguments/parameters don't fit on a single line: You can either choose
to minimally pack everything on the minimum number of lines, or to put
all arguments on their own separate line. Uncrustify is a lot less
strict about that and will leave most things as they currently are.

Overall it seems that uncrustify is a lot more customizable, whether
that's a good thing is debatable. Personally I think the fewer
possible debates I have about codestyle the better, so I usually like
the tools with fewer options better. But if the customizability allows
for closer matching of existing codestyle then it might be worth the
extra debates and effort in customization in this case.

1) revert the re-indent commits in $backbranch
2) merge $backbranch-with-revert into $forkbranch
3) re-indent $forkbranch

The git commands to achieve this are something like the following.
I've used such git commands in the past to make big automatic
refactors much easier to get in and it has worked quite well so far.

git checkout forkbranch
git rebase {commit-before-re-indent}
# now, get many merge conflicts
git rebase {re-indent-commit}
# keep your own changes (its --theirs instead of --ours because rebase
flips it around)
git checkout --theirs .
run-new-reindent
git add .
git rebase --continue

Show quoted text

On Sat, 21 Jan 2023 at 23:47, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2023-01-21 17:20:35 -0500, Tom Lane wrote:

I don't feel wedded to every last detail of what pgindent does (and
especially not the bugs). But I think if the new tool is not a pretty
close match we'll be in for years of back-patching pain. We have made
changes in pgindent itself in the past, and the patching consequences
weren't *too* awful, but the changes weren't very big either.

Perhaps we could backpatch formatting changes in a way that doesn't
inconvenience forks too much. One way to deal with such changes is to

1) revert the re-indent commits in $backbranch
2) merge $backbranch-with-revert into $forkbranch
3) re-indent $forkbranch

After that future changes should be mergable again.

Certainly doesn't do away with the pain entirely, but it does make it perhaps
bearable

Greetings,

Andres Freund

Attachments:

uncrustify.cfgapplication/octet-stream; name=uncrustify.cfgDownload
.clang-formatapplication/octet-stream; name=.clang-formatDownload
In reply to: Jelte Fennema (#55)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 3:39 PM Jelte Fennema <postgres@jeltef.nl> wrote:

I was unable to get either of them to produce the weird alignment of
declarations that pgindent outputs. Maybe that's just because I don't
understand what this alignment is supposed to do. Because to me the
current alignment seems completely random most of the time (and like
Andres said also not very unhelpful). For clang-format you should use
at least clang-format 15, otherwise it has some bugs in the alignment
logic.

Really? I have been using 14, which is quite recent. Did you just
figure this out recently? If this is true, then it's certainly
discouraging.

I don't have a problem with the current pgindent alignment of function
parameters, so not sure what you mean about that. It *was* terrible
prior to commit e3860ffa, but that was back in 2017 (pg_bsd_indent 2.0
fixed that problem).

--
Peter Geoghegan

In reply to: Peter Geoghegan (#50)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Looks like I'm not the only one that doesn't like this behavior:

https://github.com/llvm/llvm-project/issues/55605

--
Peter Geoghegan

#58Bruce Momjian
bruce@momjian.us
In reply to: Peter Geoghegan (#57)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 21, 2023 at 08:08:34PM -0800, Peter Geoghegan wrote:

On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Looks like I'm not the only one that doesn't like this behavior:

https://github.com/llvm/llvm-project/issues/55605

Wow, that is very weird. When I work with other open source projects, I
am regularly surprised by their low quality requirements.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#59Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#58)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-21 23:21:22 -0500, Bruce Momjian wrote:

On Sat, Jan 21, 2023 at 08:08:34PM -0800, Peter Geoghegan wrote:

On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Looks like I'm not the only one that doesn't like this behavior:

https://github.com/llvm/llvm-project/issues/55605

Wow, that is very weird. When I work with other open source projects, I
am regularly surprised by their low quality requirements.

I don't think not fulfulling precisely our needs is the same as low quality
requirements.

Greetings,

Andres Freund

#60Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#54)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-21 15:32:45 -0800, Peter Geoghegan wrote:

Attached is my .clang-format, since you asked for it. It was
originally based on stuff that both you and Peter E posted several
years back, I believe. Plus the timescaledb one in one or two places.
I worked a couple of things out through trial and error. It's
relatively hard to follow the documentation, and there have been
features added to newer LLVM versions.

Reformatting with your clang-format end up with something like:

Peter's:
2234 files changed, 334753 insertions(+), 289772 deletions(-)

Jelte's:
2236 files changed, 357500 insertions(+), 306815 deletions(-)

Mine (modified to reduce this):
2226 files changed, 261538 insertions(+), 256039 deletions(-)

Which is all at least an order of magnitude too much.

Jelte's uncrustify:
1773 files changed, 121722 insertions(+), 125369 deletions(-)

better, but still not great. Also had to prevent a file files it choked on
from getting reindented.

I think the main issue with either is that our variable definition indentation
just can't be emulated by the tools as-is.

Some tools can indent variable definitions so that the variable name starts on
the same column. Some can limit that for too long type names. But so far I
haven't seen one that cn make that column be column +12. They all look to
other surrounding types.

I hate that variable name indentation with a fiery passion. But switching away
from that intermixed with a lot of other changes isn't going to be fun.

Greetings,

Andres Freund

#61Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#45)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-21 Sa 11:10, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

That sounds like a good idea from here. I do not think we want a
mandatory commit filter, but if individual committers care to work
this into their process in some optional way, great! I can think
of ways I'd use it during patch development, too.

Yes, it's intended for use at committers' discretion. We have no way of
forcing use of a git hook on committers, although we could reject pushes
that offend against certain rules. For the reasons you give below that's
not a good idea. A pre-commit hook can be avoided by using `git commit
-n` and there's are similar option/hook for `git merge`.

(One reason not to want a mandatory filter is that you might wish
to apply pgindent as a separate commit, so that you can then
put that commit into .git-blame-ignore-revs. This could be handy
for example when a patch needs to change the nesting level of a lot
of pre-existing code, without making changes in it otherwise.)

Agreed.

Looks reasonable, but you should also update
src/tools/pgindent/pgindent.man, which AFAICT is our only
documentation for pgindent switches. (Is it time for a
--help option in pgindent?)

Yes, see revised patch.

... btw, can we get away with making the diff run be "diff -upd"
not just "diff -u"? I find diff output for C files noticeably
more useful with those options, but I'm unsure about their
portability.

I think they are available on Linux, MacOS and FBSD, and on Windows (if
anyone's actually using it for this) it's likely to be Gnu diff. So I
think that's probably enough coverage.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pgindent_silent_plus_showdiff-3.patchtext/x-patch; charset=UTF-8; name=pgindent_silent_plus_showdiff-3.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 741b0ccb58..bad2a17582 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -21,16 +21,27 @@ my $indent_opts =
 
 my $devnull = File::Spec->devnull;
 
-my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build);
+my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build,
+	$show_diff, $silent_diff, $help);
+
+$help = 0;
 
 my %options = (
+	"help"               => \$help,
 	"typedefs=s"         => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
 	"code-base=s"        => \$code_base,
 	"excludes=s"         => \$excludes,
 	"indent=s"           => \$indent,
-	"build"              => \$build,);
-GetOptions(%options) || die "bad command line argument\n";
+	"build"              => \$build,
+    "show-diff"          => \$show_diff,
+    "silent_diff"        => \$silent_diff,);
+GetOptions(%options) || usage ("bad command line argument");
+
+usage() if $help;
+
+usage ("Cannot have both --silent-diff and --show-diff")
+  if $silent_diff && $show_diff;
 
 run_build($code_base) if ($build);
 
@@ -230,7 +241,6 @@ sub pre_indent
 sub post_indent
 {
 	my $source          = shift;
-	my $source_filename = shift;
 
 	# Restore CATALOG lines
 	$source =~ s!^/\*(CATALOG\(.*)\*/$!$1!gm;
@@ -280,33 +290,21 @@ sub run_indent
 	close($src_out);
 
 	return $source;
-
 }
 
-
-# for development diagnostics
-sub diff
+sub show_diff
 {
-	my $pre   = shift;
-	my $post  = shift;
-	my $flags = shift || "";
+	my $indented   = shift;
+	my $source_filename  = shift;
 
-	print STDERR "running diff\n";
+	my $post_fh = new File::Temp(TEMPLATE => "pgdiffXXXXX");
 
-	my $pre_fh  = new File::Temp(TEMPLATE => "pgdiffbXXXXX");
-	my $post_fh = new File::Temp(TEMPLATE => "pgdiffaXXXXX");
+	print $post_fh $indented;
 
-	print $pre_fh $pre;
-	print $post_fh $post;
-
-	$pre_fh->close();
 	$post_fh->close();
 
-	system( "diff $flags "
-		  . $pre_fh->filename . " "
-		  . $post_fh->filename
-		  . " >&2");
-	return;
+	my $diff = `diff -upd $source_filename $post_fh->filename  2>&1`;
+	return $diff;
 }
 
 
@@ -377,6 +375,33 @@ sub build_clean
 	return;
 }
 
+sub usage
+{
+	my $message = shift;
+	my $helptext = <<'EOF';
+pgindent [OPTION]... [FILE]...
+Options:
+	--help                  show this message and quit
+	--typedefs=FILE         file containing a list of typedefs
+	--list-of-typedefs=STR  string containing typedefs, space separated
+	--code-base=DIR         path to the base of PostgreSQL source code
+	--excludes=PATH         file containing list of filename patterns to ignore
+	--indent=PATH           path to pg_bsd_indent program
+	--build                 build the pg_bsd_indent program
+    --show-diff             show the changes that would be made
+    --silent_diff           exit with status 2 if any changes would be made
+EOF
+	if ($help)
+	{
+		print $helptext;
+		exit 0;
+	}
+	else
+	{
+		print STDERR "$message\n", $helptext;
+		exit 1;
+	}
+}
 
 # main
 
@@ -404,6 +429,8 @@ push(@files, @ARGV);
 
 foreach my $source_filename (@files)
 {
+	# ignore anything that's not a .c or .h file
+	next unless $source_filename =~/\.[ch]$/;
 
 	# Automatically ignore .c and .h files that correspond to a .y or .l
 	# file.  indent tends to get badly confused by Bison/flex output,
@@ -427,9 +454,26 @@ foreach my $source_filename (@files)
 		next;
 	}
 
-	$source = post_indent($source, $source_filename);
+	$source = post_indent($source);
+
+	if ($source ne $orig_source)
+	{
+		if ($silent_diff)
+		{
+			exit 2;
+		}
+		elsif ($show_diff)
+		{
+			print show_diff($source, $source_filename);
+		}
+		else
+		{
+			write_source($source, $source_filename);
+		}
+	}
 
-	write_source($source, $source_filename) if $source ne $orig_source;
 }
 
 build_clean($code_base) if $build;
+
+exit 0;
diff --git a/src/tools/pgindent/pgindent.man b/src/tools/pgindent/pgindent.man
index 1c5aedda35..06bd748be1 100644
--- a/src/tools/pgindent/pgindent.man
+++ b/src/tools/pgindent/pgindent.man
@@ -33,6 +33,13 @@ file can be specified using the --excludes command line option. If indenting
 a PostgreSQL source tree, this option isn't necessary, as it will find the file
 src/tools/pgindent/exclude_file_patterns.
 
+There are also two non-destructive modes of pgindent. If given the --show-diffs
+option pgindent will show the changes it would make, but doesn't actually make
+them. If given instead the --silent-diff option, pgindent will exit with a
+status of 2 if it finds any indent changes are required, but will not
+make the changes or give any other information. This mode is intended for
+possible use in a git pre-commit hook.
+
 Any non-option arguments are taken as the names of files to be indented. In this
 case only these files will be changed, and nothing else will be touched. If the
 first non-option argument is not a .c or .h file, it is treated as the name
#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#61)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

... btw, can we get away with making the diff run be "diff -upd"
not just "diff -u"? I find diff output for C files noticeably
more useful with those options, but I'm unsure about their
portability.

I think they are available on Linux, MacOS and FBSD, and on Windows (if
anyone's actually using it for this) it's likely to be Gnu diff. So I
think that's probably enough coverage.

I checked NetBSD as well, and it has all three too.

Patch looks good to me.

regards, tom lane

#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#56)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

On Sat, Jan 21, 2023 at 3:39 PM Jelte Fennema <postgres@jeltef.nl> wrote:

... For clang-format you should use
at least clang-format 15, otherwise it has some bugs in the alignment
logic.

Really? I have been using 14, which is quite recent. Did you just
figure this out recently? If this is true, then it's certainly
discouraging.

Indeed. What that points to is a future where different contributors
get different results depending on what clang version they have
installed --- and it's not going to be practical to insist that
everybody have the same version, because AFAICT clang-format is tied
to clang itself. So that sounds a bit unappetizing.

One of the few advantages of the current tool situation is that at any
time there's just one agreed-on version of pgindent and pgperltidy.
I've not heard push-back about our policy that you should use
perltidy version 20170521, because that's not especially connected
to any other part of one's system. Maybe the same would hold for
uncrustify, but it's never going to work for pieces of the clang
ecosystem.

regards, tom lane

#64Jelte Fennema
postgres@jeltef.nl
In reply to: Andres Freund (#60)
Re: run pgindent on a regular basis / scripted manner

But so far I haven't seen one that can make that
column be column +12.

Thanks for clarifying what the current variable declaration indention
rule is. Indeed neither uncrustify or clang-format seem to support
that. Getting uncrustify to support it might not be too difficult, but
the question remains if we even want that.

But switching away from that intermixed with a lot of other changes isn't going to be fun.

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

I don't have a problem with the current pgindent alignment of function
parameters, so not sure what you mean about that.

Function parameter alignment is fine with pgindent imho, but this +12
column variable declaration thing I personally think is quite weird.

Really? I have been using 14, which is quite recent. Did you just
figure this out recently? If this is true, then it's certainly
discouraging.

It seems this was due to my Ubuntu 22.04 install having clang-format
14.0.0. After
updating it to 14.0.6 by using the official llvm provided packages, I
don't have this
issue on clang-format-14 anymore. To be clear this was an issue in alignment of
variable declarations not function parameters.

But I agree with Tom Lane that this makes clear that whatever tool we
pick we'll need
to pick a specific version, just like we do now with perltidy. And
indeed I'm not sure
how easy that is with clang. Installing a specific uncrustify version
is pretty easy btw,
the compilation from source is quite quick.

#65Jelte Fennema
postgres@jeltef.nl
In reply to: Jelte Fennema (#64)
Re: run pgindent on a regular basis / scripted manner

But I do think this discussion about other formatting tools
is distracting from the main pain point I wanted to discuss:
our current formatting tool is not run consistently enough.
The only thing that another tool will change in this
regard is that there is no need to update typedefs.list.
It doesn't seem like that's such a significant difference
that it would change the solution to the first problem.

When reading the emails in this discussion from 2 years ago
it seems like the respondents wouldn't mind updating the
typedefs.list manually. And proposed approach number 3
seemed to have support overall, i.e. fail a push to master
when pgindent was not run on the new commit. Would
it make sense to simply try that approach and see if
there's any big issues with it?

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

My guess would be that the main reason is simply
because committers forget it sometimes because
there's no automation complaining about it.

Show quoted text

On Sun, 22 Jan 2023 at 18:20, Jelte Fennema <postgres@jeltef.nl> wrote:

But so far I haven't seen one that can make that
column be column +12.

Thanks for clarifying what the current variable declaration indention
rule is. Indeed neither uncrustify or clang-format seem to support
that. Getting uncrustify to support it might not be too difficult, but
the question remains if we even want that.

But switching away from that intermixed with a lot of other changes isn't going to be fun.

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

I don't have a problem with the current pgindent alignment of function
parameters, so not sure what you mean about that.

Function parameter alignment is fine with pgindent imho, but this +12
column variable declaration thing I personally think is quite weird.

Really? I have been using 14, which is quite recent. Did you just
figure this out recently? If this is true, then it's certainly
discouraging.

It seems this was due to my Ubuntu 22.04 install having clang-format
14.0.0. After
updating it to 14.0.6 by using the official llvm provided packages, I
don't have this
issue on clang-format-14 anymore. To be clear this was an issue in alignment of
variable declarations not function parameters.

But I agree with Tom Lane that this makes clear that whatever tool we
pick we'll need
to pick a specific version, just like we do now with perltidy. And
indeed I'm not sure
how easy that is with clang. Installing a specific uncrustify version
is pretty easy btw,
the compilation from source is quite quick.

#66Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#65)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

When reading the emails in this discussion from 2 years ago
it seems like the respondents wouldn't mind updating the
typedefs.list manually. And proposed approach number 3
seemed to have support overall, i.e. fail a push to master
when pgindent was not run on the new commit. Would
it make sense to simply try that approach and see if
there's any big issues with it?

I will absolutely not accept putting in something that fails pushes
on this basis. There are too many cases where pgindent purity is
not an overriding issue. I mentioned a counterexample just upthread:
even if you are as anal as you could be about indentation, you might
prefer to separate a logic-changing patch from the ensuing mechanical
reindentation.

regards, tom lane

#67Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#66)
Re: run pgindent on a regular basis / scripted manner

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time. Apart
from that "git diff -w" will hide any whitespace changes so I'm
not I personally wouldn't consider it important to split such
changes across commits.

#68Andres Freund
andres@anarazel.de
In reply to: Jelte Fennema (#67)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-22 22:19:10 +0100, Jelte Fennema wrote:

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time.

Right.

Apart from that "git diff -w" will hide any whitespace changes so I'm not I
personally wouldn't consider it important to split such changes across
commits.

I do think it's important. For one, the changes made by pgindent et al aren't
just whitespace ones. But I think it's also important to be able to see the
actual changes made in a patch precisely - lots of spurious whitespace changes
could indicate a problem.

Greetings,

Andres Freund

#69Andres Freund
andres@anarazel.de
In reply to: Jelte Fennema (#64)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:

But switching away from that intermixed with a lot of other changes isn't going to be fun.

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

It's practically not possible to review a 300k line change. And perhaps I'm
paranoid, but I would have a problem with a commit in the history that's
practically not reviewable.

I don't have a problem with the current pgindent alignment of function
parameters, so not sure what you mean about that.

Function parameter alignment is fine with pgindent imho, but this +12
column variable declaration thing I personally think is quite weird.

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Personally I think the cost of trying to align local variable names is way way
higher than the benefit.

Greetings,

Andres Freund

#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#69)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Yeah. That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up. That's just awful. You can either
run pgperltidy on new code before committing, and accept that the feature
patch will touch a lot of lines it's not making real changes to (thereby
dirtying the "git blame" history) or not do so and thereby commit code
that's not passing tidiness checks. Let's *not* adopt any style that
causes similar things to start happening in our C code.

regards, tom lane

#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#67)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time.

I don't know that much about git commit hooks, but do they really
only check the final state of a series of commits?

In any case, I'm still down on the idea of checking this in a
commit hook because of the complexity and lack of transparency
of such a check. If you think your commit is correctly indented,
but the hook (running on somebody else's machine) disagrees,
how are you going to debug that? I don't want to get into such
a situation, especially since Murphy's law guarantees that it
would mainly bite people under time pressure, like when pushing
a security fix.

regards, tom lane

#72Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#69)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

It's practically not possible to review a 300k line change. And perhaps I'm
paranoid, but I would have a problem with a commit in the history that's
practically not reviewable.

As far as that goes, if you had concern then you could run the indentation
tool locally and confirm you got matching results. But this does point up
that the processes Jelte suggested all depend critically on indentation
results being 100% reproducible by anybody.

So the more I think about it the less excited I am about depending on
clang-format, because version skew in peoples' clang installations seems
inevitable, and there's good reason to fear that that would show up
as varying indentation results.

regards, tom lane

In reply to: Tom Lane (#72)
Re: run pgindent on a regular basis / scripted manner

On Sun, Jan 22, 2023 at 3:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

So the more I think about it the less excited I am about depending on
clang-format, because version skew in peoples' clang installations seems
inevitable, and there's good reason to fear that that would show up
as varying indentation results.

I have to admit that the way that I was thinking about this was
colored by the way that I use clang-format today. I only now realize
how different my requirements are to the requirements that we'd have
for any tool that gets run against the tree in bulk. In particular, I
didn't realize how annoying the non-additive nature of certain
variable alignment rules might be until you pointed it out today
(seems obvious now!).

In my experience clang-format really shines when you need to quickly
indent code that is indented in some way that looks completely wrong
-- it does quite a lot better than pgindent when that's your starting
point. It has a reasonable way of balancing competing goals like
maximum number of columns (a soft maximum) and how function parameters
are displayed, which pgindent can't do. It also avoids allowing a
function parameter from a function declaration with its type name on
its own line, before the variable name -- also beyond the capabilities
of pgindent IIRC.

Features like that make it very useful as a first pass thing, where
all the bells and whistles have little real downside. Running
clang-format and then running pgindent tends to produce better results
than just running pgindent, at least when working on a new patch.

--
Peter Geoghegan

#74Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#72)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-22 18:28:27 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

It's practically not possible to review a 300k line change. And perhaps I'm
paranoid, but I would have a problem with a commit in the history that's
practically not reviewable.

As far as that goes, if you had concern then you could run the indentation
tool locally and confirm you got matching results.

Of course, but I somehow feel a change of formatting should be reviewable to
at least some degree. Even if it's just to make sure that the tool didn't have
a bug and cause some subtle behavioural change.

So the more I think about it the less excited I am about depending on
clang-format, because version skew in peoples' clang installations seems
inevitable, and there's good reason to fear that that would show up
as varying indentation results.

One thing that I like about clang-format is that it's possible to treat it
about our include order rules (which does find some "irregularities). But of
course that's not enough.

If we decide to move to another tool, I think it might be worth to remove a
few of the pg_bsd_indent options, that other tools won't be able to emulate,
first. E.g. -di12 -> -di4 would remove a *lot* of the noise from a move to
another tool. And be much easier to write manually, but ... :)

I think I've proposed this before, but I still think that as long as we rely
on pg_bsd_indent, we should have it be part of our source tree and
automatically built. It's no wonder that barely anybody indents their
patches, given that it requires building pg_bsd_ident in a separate repo (but
referencing our sourc etree), putting the binary in path, etc.

Greetings,

Andres Freund

#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#74)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

I think I've proposed this before, but I still think that as long as we rely
on pg_bsd_indent, we should have it be part of our source tree and
automatically built. It's no wonder that barely anybody indents their
patches, given that it requires building pg_bsd_ident in a separate repo (but
referencing our sourc etree), putting the binary in path, etc.

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license? If we could
get past the license question, I agree that doing what you suggest
would be superior to the current situation.

regards, tom lane

#76Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#75)
Re: run pgindent on a regular basis / scripted manner

On Sun, Jan 22, 2023 at 07:28:42PM -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I think I've proposed this before, but I still think that as long as we rely
on pg_bsd_indent, we should have it be part of our source tree and
automatically built. It's no wonder that barely anybody indents their
patches, given that it requires building pg_bsd_ident in a separate repo (but
referencing our sourc etree), putting the binary in path, etc.

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license? If we could
get past the license question, I agree that doing what you suggest
would be superior to the current situation.

+1.
--
Michael
#77Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#71)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-22 Su 18:14, Tom Lane wrote:

Jelte Fennema <postgres@jeltef.nl> writes:

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time.

I don't know that much about git commit hooks, but do they really
only check the final state of a series of commits?

The pre-commit hook is literally run every time you do `git commit`. But
it's only run on your local instance and only if you have enabled it.
It's not project-wide.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#78Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#77)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-22 19:50:10 -0500, Andrew Dunstan wrote:

On 2023-01-22 Su 18:14, Tom Lane wrote:

Jelte Fennema <postgres@jeltef.nl> writes:

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time.

I don't know that much about git commit hooks, but do they really
only check the final state of a series of commits?

The pre-commit hook is literally run every time you do `git commit`. But
it's only run on your local instance and only if you have enabled it.
It's not project-wide.

There's different hooks. Locally, I think pre-push would be better suited to
this than pre-commit (I often save WIP work in local branches, it'd be pretty
annoying if some indentation thing swore at me).

But there's also hooks like pre-receive, that allow doing validation on the
server side. Which obviously would be project wide...

Greetings,

Andres Freund

#79Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#75)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-22 19:28:42 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I think I've proposed this before, but I still think that as long as we rely
on pg_bsd_indent, we should have it be part of our source tree and
automatically built. It's no wonder that barely anybody indents their
patches, given that it requires building pg_bsd_ident in a separate repo (but
referencing our sourc etree), putting the binary in path, etc.

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license?

Yep:
https://www.freebsd.org/copyright/license/

NOTE: The copyright of UC Berkeley’s Berkeley Software Distribution ("BSD") source has been updated. The copyright addendum may be found at ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change and is included below.

July 22, 1999

To All Licensees, Distributors of Any Version of BSD:

As you know, certain of the Berkeley Software Distribution ("BSD") source code files require that further distributions of products containing all or portions of the software, acknowledge within their advertising materials that such products contain software developed by UC Berkeley and its contributors.

Specifically, the provision reads:

* 3. All advertising materials mentioning features or use of this software
* must display the following acknowledgement:
* This product includes software developed by the University of
* California, Berkeley and its contributors."

Effective immediately, licensees and distributors are no longer required to include the acknowledgement within advertising materials. Accordingly, the foregoing paragraph of those BSD Unix files containing it is hereby deleted in its entirety.

William Hoskins
Director, Office of Technology Licensing
University of California, Berkeley

I did check, and the FTP bit is still downloadable. A bit awkward though, now
that browsers have/are removing ftp support.

Greetings,

Andres Freund

#80Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#79)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 19:28:42 -0500, Tom Lane wrote:

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license?

Yep:
https://www.freebsd.org/copyright/license/

Cool. I'll take a look at doing this later (probably after the current
CF) unless somebody beats me to it.

regards, tom lane

#81Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#80)
2 attachment(s)
Re: run pgindent on a regular basis / scripted manner

I whipped up a pre-commit hook which automatically runs pgindent on the
changed files in the commit. It won't add any changes automatically, but
instead it fails the commit if it made any changes. That way you can add
them manually if you want. Or if you don't, you can simply run git commit
again without adding the changes. (or you can use the --no-verify flag of
git commit to skip the hook completely)

It did require adding some extra flags to pgindent. While it only required
the --staged-only and --fail-on-changed flags, the --changed-only flag
was easy to add and seemed generally useful.

I also attached a patch which adds the rules for formatting pgindent
itself to the .editorconfig file.

Locally, I think pre-push would be better suited to
this than pre-commit (I often save WIP work in local branches, it'd be pretty
annoying if some indentation thing swore at me).

I personally prefer pre-commit hooks, since then I don't have to
go back and change some commit I made some time ago. And I
think with the easy opt-out that this hook has it would work fine
for my own workflow. But it shouldn't be hard to also include a
pre-push hook too if you want that after all. Then people can
choose to install the hook that they prefer.

Show quoted text

On Mon, 23 Jan 2023 at 02:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-01-22 19:28:42 -0500, Tom Lane wrote:

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license?

Yep:
https://www.freebsd.org/copyright/license/

Cool. I'll take a look at doing this later (probably after the current
CF) unless somebody beats me to it.

regards, tom lane

Attachments:

v1-0002-Add-pgindent-pre-commit-hook.patchapplication/octet-stream; name=v1-0002-Add-pgindent-pre-commit-hook.patchDownload
From 7450a781beeb0ca95eae9baf85dc9708f652b674 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Mon, 23 Jan 2023 11:14:09 +0100
Subject: [PATCH v1 2/2] Add pgindent pre-commit hook

It's easy to forget to run pgindent before doing a commit, eventhough
most of the time you will want to do it. This adds an (optional)
pre-commit hook that can run pgindent automatically for you whenever you
commit. To make this pgindent faster to run it also adds flags to
pgindent to only run against staged/changed files instead of against the
whole repo.
---
 src/tools/pgindent/README     | 19 +++++++++++
 src/tools/pgindent/pgindent   | 64 ++++++++++++++++++++++++++++-------
 src/tools/pgindent/pre-commit |  4 +++
 3 files changed, 74 insertions(+), 13 deletions(-)
 create mode 100755 src/tools/pgindent/pre-commit

diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index 103970c1042..7b67f30c062 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -150,6 +150,9 @@ but we currently exclude *.y and *.l files, as well as *.c and *.h files
 derived from *.y and *.l files.  Additional exceptions are listed
 in exclude_file_patterns; see the notes therein for rationale.
 
+You can choose to only process files that are changed according to git by using
+the --staged-only or --changed-only flags.
+
 Note that we do not exclude ecpg's header files from the run.  Some of them
 get copied verbatim into ecpg's output, meaning that ecpg's expected files
 may need to be updated to match.
@@ -157,3 +160,19 @@ may need to be updated to match.
 The perltidy run processes all *.pl and *.pm files, plus a few
 executable Perl scripts that are not named that way.  See the "find"
 rules in pgperltidy for details.
+
+---------------------------------------------------------------------------
+
+Automation
+----------
+
+You can install the pre-commit hook by running:
+
+	cp src/tools/pgindent/pre-commit .git/hooks/pre-commit
+
+This hook will automatically run pgindent on all of the files you are
+commiting. It will abort the commit if it made any changes, so that you can add
+those changes to your commit. If you did not indent this commit on purpose,
+simply run git commit again and it will succeed. If before committing you
+already know you don't want to run pgindent at all you can use the --no-verify
+flag of git commit to skip the hook.
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 741b0ccb586..b6c79b49872 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -21,7 +21,7 @@ my $indent_opts =
 
 my $devnull = File::Spec->devnull;
 
-my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build);
+my ($typedefs_file, $typedef_str, $code_base, $excludes, $indent, $build, $changed_only, $staged_only, $fail_on_changed);
 
 my %options = (
 	"typedefs=s"         => \$typedefs_file,
@@ -29,7 +29,10 @@ my %options = (
 	"code-base=s"        => \$code_base,
 	"excludes=s"         => \$excludes,
 	"indent=s"           => \$indent,
-	"build"              => \$build,);
+	"build"              => \$build,
+	"changed-only"       => \$changed_only,
+	"staged-only"        => \$staged_only,
+	"fail-on-changed"    => \$fail_on_changed);
 GetOptions(%options) || die "bad command line argument\n";
 
 run_build($code_base) if ($build);
@@ -381,17 +384,42 @@ sub build_clean
 # main
 
 # get the list of files under code base, if it's set
-File::Find::find(
+if ($changed_only)
+{
+	my $staged_files = `git diff --name-only --diff-filter=ACMR`;
+	if ($? != 0)
 	{
-		wanted => sub {
-			my ($dev, $ino, $mode, $nlink, $uid, $gid);
-			(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
-			  && -f _
-			  && /^.*\.[ch]\z/s
-			  && push(@files, $File::Find::name);
-		}
-	},
-	$code_base) if $code_base;
+		print STDERR
+		  "Failed to get changed files using git.\n";
+		exit 1;
+	}
+	@files = grep(/^.*\.[ch]\z/s, split("\n", $staged_files));
+}
+elsif ($staged_only)
+{
+	my $staged_files = `git diff --staged --name-only --diff-filter=ACMR`;
+	if ($? != 0)
+	{
+		print STDERR
+		  "Failed to get staged files using git.\n";
+		exit 1;
+	}
+	@files = grep(/^.*\.[ch]\z/s, split("\n", $staged_files));
+}
+else
+{
+	File::Find::find(
+		{
+			wanted => sub {
+				my ($dev, $ino, $mode, $nlink, $uid, $gid);
+				(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
+				  && -f _
+				  && /^.*\.[ch]\z/s
+				  && push(@files, $File::Find::name);
+			}
+		},
+		$code_base) if $code_base;
+}
 
 process_exclude();
 
@@ -402,6 +430,8 @@ check_indent();
 # any non-option arguments are files to be processed
 push(@files, @ARGV);
 
+my $changed_any_files = 0;
+
 foreach my $source_filename (@files)
 {
 
@@ -429,7 +459,15 @@ foreach my $source_filename (@files)
 
 	$source = post_indent($source, $source_filename);
 
-	write_source($source, $source_filename) if $source ne $orig_source;
+	if ($source ne $orig_source) {
+		write_source($source, $source_filename);
+		$changed_any_files = 1;
+	}
 }
 
 build_clean($code_base) if $build;
+if ($fail_on_changed && $changed_any_files)
+{
+	print STDERR "ERROR: Some of your files required formatting changes according to pgindent.\n";
+	exit(1)
+}
diff --git a/src/tools/pgindent/pre-commit b/src/tools/pgindent/pre-commit
new file mode 100755
index 00000000000..b0f2347ebfd
--- /dev/null
+++ b/src/tools/pgindent/pre-commit
@@ -0,0 +1,4 @@
+#!/bin/sh
+set -eu
+
+src/tools/pgindent/pgindent --staged-only --fail-on-changed
-- 
2.34.1

v1-0001-Add-pgindent-to-.editorconfig.patchapplication/octet-stream; name=v1-0001-Add-pgindent-to-.editorconfig.patchDownload
From ba32678de85c012567f4e283c8f8520601992e00 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Mon, 23 Jan 2023 11:17:22 +0100
Subject: [PATCH v1 1/2] Add pgindent to .editorconfig

Since it doesn't have the perl extension it didn't match the previous
rules.
---
 .editorconfig | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/.editorconfig b/.editorconfig
index d69a3d1dc4e..91bf0f2d055 100644
--- a/.editorconfig
+++ b/.editorconfig
@@ -5,6 +5,11 @@ indent_style = tab
 indent_size = tab
 tab_width = 4
 
+[src/tools/pgindent/pgindent]
+indent_style = tab
+indent_size = tab
+tab_width = 4
+
 [*.{sgml,xml}]
 indent_style = space
 indent_size = 1
-- 
2.34.1

#82Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#78)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-22 Su 20:03, Andres Freund wrote:

Hi,

On 2023-01-22 19:50:10 -0500, Andrew Dunstan wrote:

On 2023-01-22 Su 18:14, Tom Lane wrote:

Jelte Fennema <postgres@jeltef.nl> writes:

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time.

I don't know that much about git commit hooks, but do they really
only check the final state of a series of commits?

The pre-commit hook is literally run every time you do `git commit`. But
it's only run on your local instance and only if you have enabled it.
It's not project-wide.

There's different hooks. Locally, I think pre-push would be better suited to
this than pre-commit (I often save WIP work in local branches, it'd be pretty
annoying if some indentation thing swore at me).

Yes, me too, so I currently have a filter in my hook that ignores local
WIP branches. The problem with pre-push is that by the time you're
pushing you have already committed and you would have to go back and
undo some stuff to fix it. Probably 99 times out of 100 I'd prefer to
commit indented code off the bat rather than make a separate indentation
commit. But this really illustrates my point: how you do it is up to you.

But there's also hooks like pre-receive, that allow doing validation on the
server side. Which obviously would be project wide...

Yes, but I think it's been demonstrated (again) that there's no
consensus in using those for this purpose.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#83Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#62)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-22 Su 11:18, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

... btw, can we get away with making the diff run be "diff -upd"
not just "diff -u"? I find diff output for C files noticeably
more useful with those options, but I'm unsure about their
portability.

I think they are available on Linux, MacOS and FBSD, and on Windows (if
anyone's actually using it for this) it's likely to be Gnu diff. So I
think that's probably enough coverage.

I checked NetBSD as well, and it has all three too.

Patch looks good to me.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#84Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#81)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-23 Mo 05:44, Jelte Fennema wrote:

I whipped up a pre-commit hook which automatically runs pgindent on the
changed files in the commit. It won't add any changes automatically, but
instead it fails the commit if it made any changes. That way you can add
them manually if you want. Or if you don't, you can simply run git commit
again without adding the changes. (or you can use the --no-verify flag of
git commit to skip the hook completely)

It did require adding some extra flags to pgindent. While it only required
the --staged-only and --fail-on-changed flags, the --changed-only flag
was easy to add and seemed generally useful.

Please see the changes to pgindent I committed about the same time I got
your email. I don't think we need your new flags, as it's possible (and
always has been) to provide pgindent with a list of files to be
indented. Instead of having pgindent run `git diff --name-only ...` the
git hook can do it and pass the results to pgindent in its command line.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#85Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#84)
2 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Indeed the flags you added are enough. Attached is a patch
that adds an updated pre-commit hook with the same behaviour
as the one before. I definitely think having a pre-commit hook
in the repo is beneficial, since writing one that works in all
cases definitely takes some time.

as it's possible (and
always has been) to provide pgindent with a list of files to be
indented.

I guess I didn't realise this was a feature that existed, because
none of the documentation mentioned it.

Show quoted text

On Mon, 23 Jan 2023 at 15:07, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-01-23 Mo 05:44, Jelte Fennema wrote:

I whipped up a pre-commit hook which automatically runs pgindent on the
changed files in the commit. It won't add any changes automatically, but
instead it fails the commit if it made any changes. That way you can add
them manually if you want. Or if you don't, you can simply run git commit
again without adding the changes. (or you can use the --no-verify flag of
git commit to skip the hook completely)

It did require adding some extra flags to pgindent. While it only required
the --staged-only and --fail-on-changed flags, the --changed-only flag
was easy to add and seemed generally useful.

Please see the changes to pgindent I committed about the same time I got
your email. I don't think we need your new flags, as it's possible (and
always has been) to provide pgindent with a list of files to be
indented. Instead of having pgindent run `git diff --name-only ...` the
git hook can do it and pass the results to pgindent in its command line.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

v2-0001-Add-pgindent-to-.editorconfig.patchapplication/octet-stream; name=v2-0001-Add-pgindent-to-.editorconfig.patchDownload
From a87e365050706d8e326b75afe2e2cafcf9eecbe4 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Mon, 23 Jan 2023 11:17:22 +0100
Subject: [PATCH v2 1/2] Add pgindent to .editorconfig

Since it doesn't have the perl extension it didn't match the previous
rules.
---
 .editorconfig | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/.editorconfig b/.editorconfig
index d69a3d1dc4e..91bf0f2d055 100644
--- a/.editorconfig
+++ b/.editorconfig
@@ -5,6 +5,11 @@ indent_style = tab
 indent_size = tab
 tab_width = 4
 
+[src/tools/pgindent/pgindent]
+indent_style = tab
+indent_size = tab
+tab_width = 4
+
 [*.{sgml,xml}]
 indent_style = space
 indent_size = 1
-- 
2.34.1

v2-0002-Add-pgindent-pre-commit-hook.patchapplication/octet-stream; name=v2-0002-Add-pgindent-pre-commit-hook.patchDownload
From b8b3e2abbb8d1ea51254204000b454dd7871543b Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Mon, 23 Jan 2023 11:14:09 +0100
Subject: [PATCH v2 2/2] Add pgindent pre-commit hook

It's easy to forget to run pgindent before doing a commit, eventhough
most of the time you will want to do it. This adds an (optional)
pre-commit hook that can run pgindent automatically for you whenever you
commit.
---
 src/tools/pgindent/README     | 16 ++++++++++++++++
 src/tools/pgindent/pre-commit | 15 +++++++++++++++
 2 files changed, 31 insertions(+)
 create mode 100755 src/tools/pgindent/pre-commit

diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index 103970c1042..9f256505a1a 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -157,3 +157,19 @@ may need to be updated to match.
 The perltidy run processes all *.pl and *.pm files, plus a few
 executable Perl scripts that are not named that way.  See the "find"
 rules in pgperltidy for details.
+
+---------------------------------------------------------------------------
+
+Automation
+----------
+
+You can install the pre-commit hook by running:
+
+	cp src/tools/pgindent/pre-commit .git/hooks/pre-commit
+
+This hook will automatically run pgindent on all of the files you are
+commiting. It will abort the commit if it made any changes, so that you can add
+those changes to your commit. If you did not indent this commit on purpose,
+simply run git commit again and it will succeed. If before committing you
+already know you don't want to run pgindent at all you can use the --no-verify
+flag of git commit to skip the hook.
diff --git a/src/tools/pgindent/pre-commit b/src/tools/pgindent/pre-commit
new file mode 100755
index 00000000000..7a61a770719
--- /dev/null
+++ b/src/tools/pgindent/pre-commit
@@ -0,0 +1,15 @@
+#!/bin/sh
+set -eu
+
+files=$(git diff --cached --name-only --diff-filter=ACMR | { grep '\.[ch]$' || true ; })
+if [ "$files" = '' ]; then
+    exit 0
+fi
+
+git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' |\
+    xargs src/tools/pgindent/pgindent --silent-diff \
+    || {
+        echo ERROR: Aborting commit because pgindent was not run
+        git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' | xargs src/tools/pgindent/pgindent
+        exit 1
+    }
-- 
2.34.1

#86Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#80)
Re: run pgindent on a regular basis / scripted manner

I wrote:

Cool. I'll take a look at doing this later (probably after the current
CF) unless somebody beats me to it.

Thinking about that (importing pg_bsd_indent into our main source
tree) a bit harder:

1. I'd originally thought vaguely that we could teach pgindent
how to build pg_bsd_indent automatically. But with a little
more consideration, I doubt that would work transparently.
It's common (at least for me) to run pgindent in a distclean'd
tree, where configure results wouldn't be available. It's even
worse if you habitually use VPATH builds, so that those files
*never* exist in your source tree. So now I think that we should
stick to the convention that it's on the user to install
pg_bsd_indent somewhere in their PATH; all we'll be doing with
this change is eliminating the step of fetching pg_bsd_indent's
source files from somewhere else.

2. Given #1, it'll be prudent to continue having pgindent
double-check that pg_bsd_indent reports a specific version
number. We could imagine starting to use the main Postgres
version number for that, but I'm inclined to continue with
its existing numbering series. One argument for that is
that we generally change pg_bsd_indent less often than annually,
so having it track the main version would end up forcing
make-work builds of your installed pg_bsd_indent at least
once a year. Also, when we do change pg_bsd_indent, it's
typically right before a mass reindentation commit, and those
do not happen at the same time as forking a new PG version.

3. If we do nothing special, the first mass reindentation is
going to reformat the pg_bsd_indent sources per PG style,
which is ... er ... not the way they look now. Do we want
to accept that outcome, or take steps to prevent pgindent
from processing pg_bsd_indent? I have a feeling that manual
cleanup would be necessary if we let such reindentation
happen, but I haven't experimented.

regards, tom lane

#87Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#86)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-01-23 10:09:06 -0500, Tom Lane wrote:

1. I'd originally thought vaguely that we could teach pgindent
how to build pg_bsd_indent automatically. But with a little
more consideration, I doubt that would work transparently.
It's common (at least for me) to run pgindent in a distclean'd
tree, where configure results wouldn't be available. It's even
worse if you habitually use VPATH builds, so that those files
*never* exist in your source tree. So now I think that we should
stick to the convention that it's on the user to install
pg_bsd_indent somewhere in their PATH; all we'll be doing with
this change is eliminating the step of fetching pg_bsd_indent's
source files from somewhere else.

I think it'd be better to build pg_bsd_indent automatically as you planned
earlier - most others don't run pgindent from a distcleaned source tree. And
it shouldn't be hard to teach pgindent to run from a vpath build directory.

I'd like to get to the point where we can have simple build target for
a) re-indenting the whole tree
b) re-indenting the files touched in changes compared to master

If we add that to the list of things to do before sending a patch upstream,
we're a heck of a lot more likely to get decently formatted patches compared
to today.

As long as we need typedefs.list, I think it'd be good for such a target to
add new typedefs found in the local build to typedefs.list (but *not* remove
old ones, due to platform dependent code). But that's a separate enough
topic...

2. Given #1, it'll be prudent to continue having pgindent
double-check that pg_bsd_indent reports a specific version
number.

+1

3. If we do nothing special, the first mass reindentation is
going to reformat the pg_bsd_indent sources per PG style,
which is ... er ... not the way they look now. Do we want
to accept that outcome, or take steps to prevent pgindent
from processing pg_bsd_indent? I have a feeling that manual
cleanup would be necessary if we let such reindentation
happen, but I haven't experimented.

I think we should exempt it, initially at least. If somebody decides to invest
a substantial amount of time in pgindent, let's change it, but I'm somewhat
doubtful that'll happen anytime soon.

Greetings,

Andres Freund

#88Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andres Freund (#87)
Re: run pgindent on a regular basis / scripted manner

On 2023-Jan-23, Tom Lane wrote:

1. [...] So now I think that we should
stick to the convention that it's on the user to install
pg_bsd_indent somewhere in their PATH; all we'll be doing with
this change is eliminating the step of fetching pg_bsd_indent's
source files from somewhere else.

+1

2. Given #1, it'll be prudent to continue having pgindent
double-check that pg_bsd_indent reports a specific version
number. We could imagine starting to use the main Postgres
version number for that, but I'm inclined to continue with
its existing numbering series.

+1

3. If we do nothing special, the first mass reindentation is
going to reformat the pg_bsd_indent sources per PG style,
which is ... er ... not the way they look now. Do we want
to accept that outcome, or take steps to prevent pgindent
from processing pg_bsd_indent? I have a feeling that manual
cleanup would be necessary if we let such reindentation
happen, but I haven't experimented.

Hmm, initially it must just be easier to have an exception so that
pg_bsd_indent itself isn't indented.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
#error "Operator lives in the wrong universe"
("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)

#89Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#85)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-23 Mo 09:49, Jelte Fennema wrote:

Attached is a patch
that adds an updated pre-commit hook with the same behaviour
as the one before. I definitely think having a pre-commit hook
in the repo is beneficial, since writing one that works in all
cases definitely takes some time.

Not sure if this should go in the git repo or in the developer wiki.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#90Nikita Malakhov
hukutoc@gmail.com
In reply to: Alvaro Herrera (#88)
Re: run pgindent on a regular basis / scripted manner

Hi!

I've ran pdindent on the whole Postgres and it'd changed
an awful lot of source files. Won't it create a lot of merge conflicts?

On Mon, Jan 23, 2023 at 8:48 PM Alvaro Herrera <alvherre@alvh.no-ip.org>
wrote:

On 2023-Jan-23, Tom Lane wrote:

1. [...] So now I think that we should
stick to the convention that it's on the user to install
pg_bsd_indent somewhere in their PATH; all we'll be doing with
this change is eliminating the step of fetching pg_bsd_indent's
source files from somewhere else.

+1

2. Given #1, it'll be prudent to continue having pgindent
double-check that pg_bsd_indent reports a specific version
number. We could imagine starting to use the main Postgres
version number for that, but I'm inclined to continue with
its existing numbering series.

+1

3. If we do nothing special, the first mass reindentation is
going to reformat the pg_bsd_indent sources per PG style,
which is ... er ... not the way they look now. Do we want
to accept that outcome, or take steps to prevent pgindent
from processing pg_bsd_indent? I have a feeling that manual
cleanup would be necessary if we let such reindentation
happen, but I haven't experimented.

Hmm, initially it must just be easier to have an exception so that
pg_bsd_indent itself isn't indented.

--
Álvaro Herrera 48°01'N 7°57'E —
https://www.EnterpriseDB.com/
#error <https://www.EnterpriseDB.com/#error&gt; "Operator lives in the wrong
universe"
("Use of cookies in real-time system development", M. Gleixner, M. Mc
Guire)

--
Regards,
Nikita Malakhov
Postgres Professional
https://postgrespro.ru/

#91Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nikita Malakhov (#90)
Re: run pgindent on a regular basis / scripted manner

Nikita Malakhov <hukutoc@gmail.com> writes:

I've ran pdindent on the whole Postgres and it'd changed
an awful lot of source files. Won't it create a lot of merge conflicts?

Well, yeah, you've rediscovered the fact that a lot of commits are sloppy
about this, and it's been awhile since our last tree-wide pgindent.

Our normal process has been to do a cleanup pgindent run annually or so,
usually after the end of the last commitfest of a cycle when there's
plenty of time for people to deal with the ensuing merge conflicts.

If we could get to "commits are pgindent clean to begin with", we
could avoid the merge problems from one-big-reindent. I'd still
be inclined to do an annual run as a backup, but hopefully it would
find few problems.

regards, tom lane

#92Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#89)
Re: run pgindent on a regular basis / scripted manner

Not sure if this should go in the git repo or in the developer wiki.

I would say the git repo is currently the most fitting place, since it
has all the existing docs for pgindent. The wiki even links to the
pgindent source directory:
https://wiki.postgresql.org/wiki/Developer_FAQ#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F

#93Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#87)
Re: run pgindent on a regular basis / scripted manner

On Mon, Jan 23, 2023 at 09:31:36AM -0800, Andres Freund wrote:

As long as we need typedefs.list, I think it'd be good for such a target to
add new typedefs found in the local build to typedefs.list (but *not* remove
old ones, due to platform dependent code). But that's a separate enough
topic...

One issue on requiring patches to have run pgindent previously is
actually the typedef list. If someone adds a typedef in a commit, they
will see different pgident output in the committed files, and perhaps
others, and the new typedefs might only appear after the commit, causing
later commits to not match.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#94Jelte Fennema
postgres@jeltef.nl
In reply to: Bruce Momjian (#93)
Re: run pgindent on a regular basis / scripted manner

One issue on requiring patches to have run pgindent previously is
actually the typedef list. If someone adds a typedef in a commit, they
will see different pgident output in the committed files, and perhaps
others, and the new typedefs might only appear after the commit, causing
later commits to not match.

I'm not sure I understand the issue you're pointing out. If someone
changes the typedef list, imho they want the formatting to change
because of that. So requiring an addition to the typedef list to also
commit reindentation to all files that this typedef indirectly impacts
seems pretty reasonable to me.

#95Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#94)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

One issue on requiring patches to have run pgindent previously is
actually the typedef list. If someone adds a typedef in a commit, they
will see different pgident output in the committed files, and perhaps
others, and the new typedefs might only appear after the commit, causing
later commits to not match.

I'm not sure I understand the issue you're pointing out. If someone
changes the typedef list, imho they want the formatting to change
because of that. So requiring an addition to the typedef list to also
commit reindentation to all files that this typedef indirectly impacts
seems pretty reasonable to me.

I think the issue Bruce is pointing out is that this is another mechanism
whereby different people could get different indentation results.
I fear any policy that is based on an assumption that indentation has
One True Result is going to fail.

As a concrete example, suppose Alice commits some code that uses "foo"
as a variable name, and more or less concurrently, Bob commits something
that defines "foo" as a typedef. Bob's change is likely to have
side-effects on the formatting of Alice's code. If they're working in
well-separated parts of the source tree, nobody is likely to notice
that for awhile --- but whoever next touches the files Alice touched
will be in for a surprise, which will be more or less painful depending
on whether we've installed brittle processes.

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]https://github.com/PGBuildFarm/client-code/commit/dcca861554e90d6395c3c153317b0b0e3841f103). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

I think we need to be content with a "soft", it's more-or-less-right
approach to indentation. As I explained to somebody upthread, the
main benefit of this for most people is avoiding the need for a massive
once-a-year reindent run that causes merge failures for many pending
patches. But we don't need to completely eliminate such runs to get
99.9% of that benefit; we only need to reduce the number of places
they're likely to touch.

regards, tom lane

[1]: https://github.com/PGBuildFarm/client-code/commit/dcca861554e90d6395c3c153317b0b0e3841f103

#96Vladimir Sitnikov
sitnikov.vladimir@gmail.com
In reply to: Andres Freund (#74)
Re: run pgindent on a regular basis / scripted manner

Andres> Of course, but I somehow feel a change of formatting should be
reviewable to
Andres> at least some degree

One way of reviewing the formatting changes is to compare the compiled
binaries.

If the binaries before and after formatting are the same, then there's a
high chance the behaviour is the same.

Vladimir

#97Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#95)
Re: run pgindent on a regular basis / scripted manner

As a concrete example, suppose Alice commits some code that uses "foo"
as a variable name, and more or less concurrently, Bob commits something
that defines "foo" as a typedef. Bob's change is likely to have
side-effects on the formatting of Alice's code. If they're working in
well-separated parts of the source tree, nobody is likely to notice
that for awhile --- but whoever next touches the files Alice touched
will be in for a surprise, which will be more or less painful depending
on whether we've installed brittle processes.

Sounds like this conflict could be handled fairly easily by
having a local git hook rerunning pgindent whenever
you rebase a commit:
1. if you changed typedefs.list the hook would format all files
2. if you didn't it only formats the files that you changed

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

Wouldn't the automatic addition-only solution that Andres suggested
solve this issue? Build farms could still remove unused typedefs
on a regular basis, but commits would at least add typedefs for the
platform that the comitter uses.

I think we need to be content with a "soft", it's more-or-less-right
approach to indentation.

I think that this would already be a significant improvement over the
current situation. My experience with the current situation is that
indentation is more-or-less-wrong.

As I explained to somebody upthread, the
main benefit of this for most people is avoiding the need for a massive
once-a-year reindent run that causes merge failures for many pending
patches.

Merge failures are one issue. But personally the main benefit that
I would be getting is being able to run pgindent on the files
I'm editing and get this weird +12 columns formatting correct
without having to manually type it. Without pgindent also
changing random parts of the files that someone else touched
a few commits before me.

#98Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#97)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

Sounds like this conflict could be handled fairly easily by
having a local git hook rerunning pgindent whenever
you rebase a commit:
1. if you changed typedefs.list the hook would format all files
2. if you didn't it only formats the files that you changed

I think that would be undesirable, because then reindentation noise
in completely-unrelated files would get baked into feature commits,
complicating review and messing up "git blame" history.
The approach we currently have allows reindent effects to be
separated into ignorable labeled commits, which is a nice property.

Merge failures are one issue. But personally the main benefit that
I would be getting is being able to run pgindent on the files
I'm editing and get this weird +12 columns formatting correct
without having to manually type it. Without pgindent also
changing random parts of the files that someone else touched
a few commits before me.

Yeah, that always annoys me too, but I've always considered that
it's my problem not something I can externalize onto other people.
The real bottom line here is that AFAICT, there are fewer committers
who care about indent cleanliness than committers who do not, so
I do not think that the former group get to impose strict rules
on the latter, much as I might wish otherwise.

FWIW, Andrew's recent --show-diff feature for pgindent has
already improved my workflow for that. I can do
"pgindent --show-diff >fixindent.patch", manually remove any hunks
in fixindent.patch that don't pertain to the code I'm working on,
and apply what remains to fix up my new code. (I had been doing
something basically like this, but with more file-copying steps
to undo pgindent's edit-in-place behavior.)

regards, tom lane

#99Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#98)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-24 Tu 11:43, Tom Lane wrote:

Jelte Fennema <postgres@jeltef.nl> writes:

Sounds like this conflict could be handled fairly easily by
having a local git hook rerunning pgindent whenever
you rebase a commit:
1. if you changed typedefs.list the hook would format all files
2. if you didn't it only formats the files that you changed

I think that would be undesirable, because then reindentation noise
in completely-unrelated files would get baked into feature commits,
complicating review and messing up "git blame" history.
The approach we currently have allows reindent effects to be
separated into ignorable labeled commits, which is a nice property.

Merge failures are one issue. But personally the main benefit that
I would be getting is being able to run pgindent on the files
I'm editing and get this weird +12 columns formatting correct
without having to manually type it. Without pgindent also
changing random parts of the files that someone else touched
a few commits before me.

Yeah, that always annoys me too, but I've always considered that
it's my problem not something I can externalize onto other people.
The real bottom line here is that AFAICT, there are fewer committers
who care about indent cleanliness than committers who do not, so
I do not think that the former group get to impose strict rules
on the latter, much as I might wish otherwise.

FWIW, Andrew's recent --show-diff feature for pgindent has
already improved my workflow for that. I can do
"pgindent --show-diff >fixindent.patch", manually remove any hunks
in fixindent.patch that don't pertain to the code I'm working on,
and apply what remains to fix up my new code. (I had been doing
something basically like this, but with more file-copying steps
to undo pgindent's edit-in-place behavior.)

I'm glad it's helpful.

Here's another improvement I think will be useful when the new gadgets
are used in a git hook: first, look for the excludes file under the
current directory if we aren't setting $code_base (e.g if we have files
given on the command line), and second apply the exclude patterns to the
command line files as well as to files found using File::Find.

I propose to apply this fairly soon.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pgindent-exclude-fix.patchtext/x-patch; charset=UTF-8; name=pgindent-exclude-fix.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 5eff1f8032..1f95a1a34e 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -63,6 +63,10 @@ $code_base ||= '.' unless @ARGV;
 $excludes ||= "$code_base/src/tools/pgindent/exclude_file_patterns"
   if $code_base && -f "$code_base/src/tools/pgindent/exclude_file_patterns";
 
+# also look under the current directory for the exclude patterns file
+$excludes ||= "src/tools/pgindent/exclude_file_patterns"
+  if -f "src/tools/pgindent/exclude_file_patterns";
+
 # The typedef list that's mechanically extracted by the buildfarm may omit
 # some names we want to treat like typedefs, e.g. "bool" (which is a macro
 # according to <stdbool.h>), and may include some names we don't want
@@ -421,8 +425,6 @@ File::Find::find(
 	},
 	$code_base) if $code_base;
 
-process_exclude();
-
 $filtered_typedefs_fh = load_typedefs();
 
 check_indent();
@@ -430,6 +432,9 @@ check_indent();
 # any non-option arguments are files to be processed
 push(@files, @ARGV);
 
+# the exclude list applies to command line arguments as well as found files
+process_exclude();
+
 foreach my $source_filename (@files)
 {
 	# ignore anything that's not a .c or .h file
#100Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#99)
Re: run pgindent on a regular basis / scripted manner

I think that would be undesirable, because then reindentation noise
in completely-unrelated files would get baked into feature commits,
complicating review and messing up "git blame" history.

With a rebase hook similar to the the pre-commit hook that I shared
upthread, your files will be changed accordingly, but you don't need
to commit those changes in the same commit as the one that you're
rebasing. You could append another commit after it. Another option
would be to move the typedefs.list change to a separate commit
together with all project wide indentation changes.

The real bottom line here is that AFAICT, there are fewer committers
who care about indent cleanliness than committers who do not, so
I do not think that the former group get to impose strict rules
on the latter, much as I might wish otherwise.

Is this actually the case? I haven't seen anyone in this thread say
they don't care. From my perspective it seems like the unclean
indents simply come from forgetting to run pgindent once in
a while. And those few forgetful moments add up over a year.
of commits. That's why to me tooling seems the answer here.
If the tooling makes it easy not to forget then the problem
goes away.

FWIW, Andrew's recent --show-diff feature for pgindent has
already improved my workflow for that. I can do
"pgindent --show-diff >fixindent.patch", manually remove any hunks
in fixindent.patch that don't pertain to the code I'm working on,
and apply what remains to fix up my new code. (I had been doing
something basically like this, but with more file-copying steps
to undo pgindent's edit-in-place behavior.)

Yeah, I have a similar workflow with the pre-commit hook that
I shared. By using "git checkout -p" I can remove hunks that
don't pertain to my code. Still it would be really nice not
to have to go through that effort (which is significant for the
libpq code that I've been workin on, since there's ~50
incorrectly indented hunks).

Here's another improvement I think will be useful when the new gadgets
are used in a git hook: first, look for the excludes file under the
current directory if we aren't setting $code_base (e.g if we have files
given on the command line), and second apply the exclude patterns to the
command line files as well as to files found using File::Find.

Change looks good to me.

#101Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#95)
Re: run pgindent on a regular basis / scripted manner

On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

As I now understand it, we would need to standardize on a typedef list
at the beginning of each major development cycle, and then only allow
additions, and the addition would have to include any pgindent affects
of the addition.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#102Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#100)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-24 Tu 13:42, Jelte Fennema wrote:

Here's another improvement I think will be useful when the new gadgets
are used in a git hook: first, look for the excludes file under the
current directory if we aren't setting $code_base (e.g if we have files
given on the command line), and second apply the exclude patterns to the
command line files as well as to files found using File::Find.

Change looks good to me.

Thanks, pushed

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#103Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#85)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-23 Mo 09:49, Jelte Fennema wrote:

Indeed the flags you added are enough. Attached is a patch
that adds an updated pre-commit hook with the same behaviour
as the one before. I definitely think having a pre-commit hook
in the repo is beneficial, since writing one that works in all
cases definitely takes some time.

I didn't really like your hook, as it forces a reindent, and many people
won't want that (for reasons given elsewhere in this thread).

Here's an extract from my pre-commit hook that does that if PGAUTOINDENT
is set to "yes", and otherwise just warns you that you need to run pgindent.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pre-commit-sampletext/plain; charset=UTF-8; name=pre-commit-sampleDownload
#104Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#103)
Re: run pgindent on a regular basis / scripted manner

On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:

I didn't really like your hook, as it forces a reindent, and many people
won't want that (for reasons given elsewhere in this thread).

I'm not sure what you mean by "forces a reindent". Like I explained
you can simply run "git commit" again to ignore the changes and
commit anyway. As long as the files are indented on your filesystem
the hook doesn't care if you actually included the indentation changes
in the changes that you're currently committing.

So to be completely clear you can do the following with my hook:
git commit # runs pgindent and fails
git commit # commits changes anyway
git commit -am 'Run pgindent' # commit indentation changes separately

Or what I usually do:
git commit # runs pgindent and fails
git add --patch # choose relevant changes to add to commit
git commit # commit the changes
git checkout -- . # undo irrelevant changes on filesystem

Honestly PGAUTOINDENT=no seems stricter, since the only
way to bypass the failure is now to run manually run pgindent
or git commit with the --no-verify flag.

files=$(git diff --cached --name-only --diff-filter=ACMR)
src/tools/pgindent/pgindent $files

That seems like it would fail if there's any files or directories with
spaces in them. Maybe this isn't something we care about though.

# no need to filter files - pgindent ignores everything that isn't a
# .c or .h file

If the first argument is a non .c or .h file, then pgindent interprets
it as the typedefs file. So it's definitely important to filter non .c
and .h files out. Because now if you commit a single
non .c or .h file this hook messes up the indentation in all of
your files. You can reproduce by running:
src/tools/pgindent/pgindent README

# only do this on master
test "$branch" = "master" || return 0

I would definitely want a way to disable this check. As a normal
submitter I never work directly on master.

#105Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#104)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-26 Th 11:16, Jelte Fennema wrote:

On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:

I didn't really like your hook, as it forces a reindent, and many people
won't want that (for reasons given elsewhere in this thread).

I'm not sure what you mean by "forces a reindent". Like I explained
you can simply run "git commit" again to ignore the changes and
commit anyway. As long as the files are indented on your filesystem
the hook doesn't care if you actually included the indentation changes
in the changes that you're currently committing.

Your hook does this:

+git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' |\
+    xargs src/tools/pgindent/pgindent --silent-diff \
+    || {
+        echo ERROR: Aborting commit because pgindent was not run
+        git diff --cached --name-only --diff-filter=ACMR | grep
'\.[ch]$' | xargs src/tools/pgindent/pgindent
+        exit 1
+    }

At this stage the files are now indented, so if it failed and you run
`git commit` again it will commit with the indention changes.

So to be completely clear you can do the following with my hook:
git commit # runs pgindent and fails
git commit # commits changes anyway
git commit -am 'Run pgindent' # commit indentation changes separately

Or what I usually do:
git commit # runs pgindent and fails
git add --patch # choose relevant changes to add to commit
git commit # commit the changes
git checkout -- . # undo irrelevant changes on filesystem

Honestly PGAUTOINDENT=no seems stricter, since the only
way to bypass the failure is now to run manually run pgindent
or git commit with the --no-verify flag.

files=$(git diff --cached --name-only --diff-filter=ACMR)
src/tools/pgindent/pgindent $files

That seems like it would fail if there's any files or directories with
spaces in them. Maybe this isn't something we care about though.

We don't have any, and the filenames git produces are relative to the
git root. I don't think this is an issue.

# no need to filter files - pgindent ignores everything that isn't a
# .c or .h file

If the first argument is a non .c or .h file, then pgindent interprets
it as the typedefs file. So it's definitely important to filter non .c
and .h files out. Because now if you commit a single
non .c or .h file this hook messes up the indentation in all of
your files. You can reproduce by running:
src/tools/pgindent/pgindent README

I have a patch at [1]  /messages/by-id/21bb8573-9e56-812b-84cf-1e4f3c4c2a7b@dunslane.net to remove this misfeature.

# only do this on master
test "$branch" = "master" || return 0

I would definitely want a way to disable this check. As a normal
submitter I never work directly on master.

Sure, that's your choice. My intended audience here is committers, who
of course do work on master.

cheers

andrew

[1]:   /messages/by-id/21bb8573-9e56-812b-84cf-1e4f3c4c2a7b@dunslane.net

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#106Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#105)
Re: run pgindent on a regular basis / scripted manner

At this stage the files are now indented, so if it failed and you run
`git commit` again it will commit with the indention changes.

No, because at no point a "git add" is happening, so the changes
made by pgindent are not staged. As long as you don't run the
second "git commit" with the -a flag the commit will be exactly
the same as you prepared it before.

Sure, that's your choice. My intended audience here is committers, who
of course do work on master.

Yes I understand, I meant it would be nice if the script had a environment
variable like PG_COMMIT_HOOK_ALL_BRANCHES (bad name)
for this purpose.

Show quoted text

On Thu, 26 Jan 2023 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-01-26 Th 11:16, Jelte Fennema wrote:

On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:

I didn't really like your hook, as it forces a reindent, and many people
won't want that (for reasons given elsewhere in this thread).

I'm not sure what you mean by "forces a reindent". Like I explained
you can simply run "git commit" again to ignore the changes and
commit anyway. As long as the files are indented on your filesystem
the hook doesn't care if you actually included the indentation changes
in the changes that you're currently committing.

Your hook does this:

+git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' |\
+    xargs src/tools/pgindent/pgindent --silent-diff \
+    || {
+        echo ERROR: Aborting commit because pgindent was not run
+        git diff --cached --name-only --diff-filter=ACMR | grep
'\.[ch]$' | xargs src/tools/pgindent/pgindent
+        exit 1
+    }

At this stage the files are now indented, so if it failed and you run
`git commit` again it will commit with the indention changes.

So to be completely clear you can do the following with my hook:
git commit # runs pgindent and fails
git commit # commits changes anyway
git commit -am 'Run pgindent' # commit indentation changes separately

Or what I usually do:
git commit # runs pgindent and fails
git add --patch # choose relevant changes to add to commit
git commit # commit the changes
git checkout -- . # undo irrelevant changes on filesystem

Honestly PGAUTOINDENT=no seems stricter, since the only
way to bypass the failure is now to run manually run pgindent
or git commit with the --no-verify flag.

files=$(git diff --cached --name-only --diff-filter=ACMR)
src/tools/pgindent/pgindent $files

That seems like it would fail if there's any files or directories with
spaces in them. Maybe this isn't something we care about though.

We don't have any, and the filenames git produces are relative to the
git root. I don't think this is an issue.

# no need to filter files - pgindent ignores everything that isn't a
# .c or .h file

If the first argument is a non .c or .h file, then pgindent interprets
it as the typedefs file. So it's definitely important to filter non .c
and .h files out. Because now if you commit a single
non .c or .h file this hook messes up the indentation in all of
your files. You can reproduce by running:
src/tools/pgindent/pgindent README

I have a patch at [1] to remove this misfeature.

# only do this on master
test "$branch" = "master" || return 0

I would definitely want a way to disable this check. As a normal
submitter I never work directly on master.

Sure, that's your choice. My intended audience here is committers, who
of course do work on master.

cheers

andrew

[1] /messages/by-id/21bb8573-9e56-812b-84cf-1e4f3c4c2a7b@dunslane.net

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#107Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#106)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-26 Th 12:05, Jelte Fennema wrote:

At this stage the files are now indented, so if it failed and you run
`git commit` again it will commit with the indention changes.

No, because at no point a "git add" is happening, so the changes
made by pgindent are not staged. As long as you don't run the
second "git commit" with the -a flag the commit will be exactly
the same as you prepared it before.

Hmm, but I usually run with -a, I even have a git alias for it. I guess
what this discussion illustrates is that there are various patters for
using git, and we shouldn't assume that everyone else is using the same
patterns we are.

I'm still  mildly inclined to say this material would be better placed
in the developer wiki. After all, this isn't the only thing a postgres
developer might use a git hook for (mine has more material in it than in
what I posted).

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#108Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#107)
Re: run pgindent on a regular basis / scripted manner

On Thu, 26 Jan 2023 at 22:46, Andrew Dunstan <andrew@dunslane.net> wrote:

Hmm, but I usually run with -a, I even have a git alias for it. I guess
what this discussion illustrates is that there are various patters for
using git, and we shouldn't assume that everyone else is using the same
patterns we are.

I definitely agree that there are lots of ways to use git. And I now
understand why my hook didn't work well for your existing workflow.

I've pretty much unlearned the -a flag. Because the easiest way I've
been able to split up changes into different commits is using "git add
-p", which adds partial pieces of files to the staging area. And that
workflow combines terribly with "git commit -a" because -a adds all
the things that I specifically didn't put in the staging area into the
final commit anyway.

I'm still mildly inclined to say this material would be better placed
in the developer wiki. After all, this isn't the only thing a postgres
developer might use a git hook for

I think it should definitely be somewhere. I have a preference for the
repo, since I think the docs on codestyle are already in too many
different places. But the wiki is already much better than having no
shared hook at all. I mainly think we should try to make it as easy as
possible for people to commit well indented code.

(mine has more material in it than in what I posted).

Anything that is useful for the wider community and could be part of
this example/template git hook? (e.g. some perltidy automation)

#109Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#108)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-26 Th 17:54, Jelte Fennema wrote:

I'm still mildly inclined to say this material would be better placed
in the developer wiki. After all, this isn't the only thing a postgres
developer might use a git hook for

I think it should definitely be somewhere. I have a preference for the
repo, since I think the docs on codestyle are already in too many
different places. But the wiki is already much better than having no
shared hook at all. I mainly think we should try to make it as easy as
possible for people to commit well indented code.

I've added a section to the wiki at
<https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks&gt; and
put a reference to that in the pgindent docco. You can of course add
some more info to the wiki if you feel it's necessary.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#110Noah Misch
noah@leadboat.com
In reply to: Bruce Momjian (#101)
Re: run pgindent on a regular basis / scripted manner

On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:

On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

As I now understand it, we would need to standardize on a typedef list
at the beginning of each major development cycle, and then only allow
additions,

Not to my knowledge. There's no particular obstacle to updating the list more
frequently or removing entries.

and the addition would have to include any pgindent affects
of the addition.

Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
pgindent when the given commit(s) change the typedef list. typedef list
changes essentially become another kind of refactoring that can yield merge
conflicts. If your commit passed the pgindent check, rebasing it onto a new
typedefs list may require further indentation changes. New typedefs don't
tend to change a lot of old code, so I would expect this sort of conflict to
be minor, compared to all the other sources of conflicts.

#111Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#110)
Re: run pgindent on a regular basis / scripted manner

Noah Misch <noah@leadboat.com> writes:

Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
pgindent when the given commit(s) change the typedef list. typedef list
changes essentially become another kind of refactoring that can yield merge
conflicts. If your commit passed the pgindent check, rebasing it onto a new
typedefs list may require further indentation changes. New typedefs don't
tend to change a lot of old code, so I would expect this sort of conflict to
be minor, compared to all the other sources of conflicts.

In fact, if a typedef addition *does* affect a lot of old code,
that's a good sign that the choice of typedef name ought to be
rethought: it's evidently conflicting with existing names.

I'm not sure what that observation implies for our standard
practices here. But it does suggest that "let pgindent do what
it wants without human oversight" probably isn't a good plan.
We've seen that to be true for other reasons as well, notably that
it can destroy the readability of carefully-laid-out comments.

regards, tom lane

#112Bruce Momjian
bruce@momjian.us
In reply to: Noah Misch (#110)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jan 28, 2023 at 05:06:03PM -0800, Noah Misch wrote:

On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:

On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

As I now understand it, we would need to standardize on a typedef list
at the beginning of each major development cycle, and then only allow
additions,

Not to my knowledge. There's no particular obstacle to updating the list more
frequently or removing entries.

We would need to re-pgindent the tree each time, I think, which would
cause disruption if we did it too frequently.

and the addition would have to include any pgindent affects
of the addition.

Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
pgindent when the given commit(s) change the typedef list. typedef list
changes essentially become another kind of refactoring that can yield merge
conflicts. If your commit passed the pgindent check, rebasing it onto a new
typedefs list may require further indentation changes. New typedefs don't
tend to change a lot of old code, so I would expect this sort of conflict to
be minor, compared to all the other sources of conflicts.

Agreed.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#113Noah Misch
noah@leadboat.com
In reply to: Bruce Momjian (#112)
Re: run pgindent on a regular basis / scripted manner

On Mon, Jan 30, 2023 at 03:42:09PM -0500, Bruce Momjian wrote:

On Sat, Jan 28, 2023 at 05:06:03PM -0800, Noah Misch wrote:

On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:

On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]). Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own. I
definitely do not want to go over to hand maintenance of that list.

As I now understand it, we would need to standardize on a typedef list
at the beginning of each major development cycle, and then only allow
additions,

Not to my knowledge. There's no particular obstacle to updating the list more
frequently or removing entries.

We would need to re-pgindent the tree each time, I think, which would
cause disruption if we did it too frequently.

More important than frequency is how much old code changes. A new typedef
typically is an identifier not already appearing in the tree, so no old code
changes. A removed typedef typically no longer appears in the tree, so again
no old code changes. The tree can get those daily; they're harmless.

The push that adds or removes FooTypedef from the source code is in the best
position to react to any surprising indentation consequences of adding or
removing FooTypedef from typedefs.list. (Reactions could include choosing a
different typedef name or renaming incidental matches in older code.) Hence,
changing typedefs.list as frequently as it affects the code is less disruptive
than changing it once a year. The same applies to challenges like pgindent
wrecking a non-"/*----------" comment. Such breakage is hard to miss when
it's part of the push that crafts the comment; it's easier to miss in a bulk,
end-of-cycle pgindent.

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

#114Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#113)
Re: run pgindent on a regular basis / scripted manner

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.

Maybe this should get taken up at the this-time-for-sure developer
meeting at PGCon?

regards, tom lane

#115Dean Rasheed
dean.a.rasheed@gmail.com
In reply to: Tom Lane (#114)
Re: run pgindent on a regular basis / scripted manner

On Thu, 2 Feb 2023 at 06:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements.

I didn't reply until now, but I'm solidly in the camp of committers
who care about keeping the tree properly indented, and I wouldn't have
any problem with such a check being imposed.

I regularly run pgindent locally, and if I ever commit without
indenting, it's either intentional, or because I forgot, so the
reminder would be useful.

And as someone who runs pgindent regularly, I think this will be a net
time saver, since I won't have to skip over other unrelated indent
changes all the time.

Regards,
Dean

#116Thomas Munro
thomas.munro@gmail.com
In reply to: Dean Rasheed (#115)
Re: run pgindent on a regular basis / scripted manner

On Fri, Feb 3, 2023 at 12:35 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

And as someone who runs pgindent regularly, I think this will be a net
time saver, since I won't have to skip over other unrelated indent
changes all the time.

+1

#117Amit Kapila
amit.kapila16@gmail.com
In reply to: Dean Rasheed (#115)
Re: run pgindent on a regular basis / scripted manner

On Thu, Feb 2, 2023 at 5:05 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:

On Thu, 2 Feb 2023 at 06:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements.

I didn't reply until now, but I'm solidly in the camp of committers
who care about keeping the tree properly indented, and I wouldn't have
any problem with such a check being imposed.

I regularly run pgindent locally, and if I ever commit without
indenting, it's either intentional, or because I forgot, so the
reminder would be useful.

And as someone who runs pgindent regularly, I think this will be a net
time saver, since I won't have to skip over other unrelated indent
changes all the time.

+1.

--
With Regards,
Amit Kapila.

#118Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#114)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-02 Th 01:40, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.

I'd be very surprised if this caused any active committer to walk away
from the project. Many will probably appreciate the nudge. But maybe I'm
overoptimistic.

Maybe this should get taken up at the this-time-for-sure developer
meeting at PGCon?

Sure. There's probably some work that could still be done in this area
too, such as making pgperltidy work similarly to how we've now make
pgindent work.

There's also a question of timing. Possibly the best time would be when
we next fork the tree.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#119Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#118)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

Sure. There's probably some work that could still be done in this area
too, such as making pgperltidy work similarly to how we've now make
pgindent work.

Perhaps. But before we commit to that, I'd like to see some tweaks to the
pgperltidy rules to make it less eager to revisit the formatting of lines
close to a change. Its current behavior will induce a lot of "git blame"
noise if we apply these same procedures to Perl code.

(Should I mention reformat-dat-files?)

There's also a question of timing. Possibly the best time would be when
we next fork the tree.

Yeah. We have generally not wanted to do a mass indent except
when there's a minimum amount of pending patches, ie after the last
CF of a cycle. What I'd suggest is that we plan on doing a mass
indent and then switch over to the new rules, right after the March
CF closes. That gives us a couple months to nail down and test out
the new procedures before they go live.

regards, tom lane

#120Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#119)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-02 Th 10:00, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Sure. There's probably some work that could still be done in this area
too, such as making pgperltidy work similarly to how we've now make
pgindent work.

Perhaps. But before we commit to that, I'd like to see some tweaks to the
pgperltidy rules to make it less eager to revisit the formatting of lines
close to a change. Its current behavior will induce a lot of "git blame"
noise if we apply these same procedures to Perl code.

I haven't done anything about that yet, but I have reworked the script
so it's a lot more like pgindent, with --show-diff and --silent-diff
modes, and allowing a list of files to be indented on the command line.
Non-perl files are filtered out from such a list.

(Should I mention reformat-dat-files?)

If you want I can add those flags there too.

There's also a question of timing. Possibly the best time would be when
we next fork the tree.

Yeah. We have generally not wanted to do a mass indent except
when there's a minimum amount of pending patches, ie after the last
CF of a cycle. What I'd suggest is that we plan on doing a mass
indent and then switch over to the new rules, right after the March
CF closes. That gives us a couple months to nail down and test out
the new procedures before they go live.

WFM. Of course then we're not waiting for the developer meeting.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pgperltidy-new.patchtext/x-patch; charset=UTF-8; name=pgperltidy-new.patchDownload
diff --git a/src/tools/pgindent/pgperltidy b/src/tools/pgindent/pgperltidy
index 5e704119eb..388c113526 100755
--- a/src/tools/pgindent/pgperltidy
+++ b/src/tools/pgindent/pgperltidy
@@ -1,12 +1,162 @@
-#!/bin/sh
+#!/usr/bin/perl
 
 # src/tools/pgindent/pgperltidy
 
-set -e
+use strict;
+use warnings;
 
-# set this to override default perltidy program:
-PERLTIDY=${PERLTIDY:-perltidy}
+use Getopt::Long;
+use FindBin;
+use File::Find;
+use Perl::Tidy;
 
-. src/tools/perlcheck/find_perl_files
+# we require this exact version. See README.
+die "wrong perltidy version $Perl::Tidy::VERSION"
+  unless $Perl::Tidy::VERSION eq '20170521';
+
+# make sure we're in the source root if we need to look for files
+chdir "$FindBin::RealBin/../../.." unless @ARGV;
+
+my ($show_diff, $silent_diff, $help);
+
+$help = 0;
+
+my %options = (
+	"help"               => \$help,
+	"show-diff"          => \$show_diff,
+	"silent-diff"        => \$silent_diff,);
+GetOptions(%options) || usage("bad command line argument");
+
+usage() if $help;
+
+my @files;
+
+# get the filtered list of files
+if (@ARGV)
+{
+	filter_perl_files();
+}
+else
+{
+	File::Find::find(\&wanted, ".");
+}
+
+# process according to the command switches
+if ($silent_diff || $show_diff)
+{
+	indent_files_diff();
+}
+else
+{
+	indent_files_in_place();
+}
+
+exit 0;
+
+# non-destructive perltidy mode
+sub indent_files_diff
+{
+	my $diff_found = 0;
+	foreach my $file (@files)
+	{
+		my $error_flag = Perl::Tidy::perltidy(
+			source            => $file,
+            destination       => "$file.tdy",
+            perltidyrc        => "$FindBin::RealBin/perltidyrc",
+		   );
+		if ($error_flag)
+		{
+			unlink "$file.tdy";
+			die "pgperltidy error!";
+		}
+		my $diff = `diff -ud -F '^sub ' $file $file.tdy 2>&1`;
+		unlink "$file.tdy";
+		next unless $diff;
+		exit 2 if ($silent_diff);
+		$diff_found = 1;
+		print $diff;
+	}
+}
+
+# destructive perltidy in place
+# expects that the perltidyrc will have --backup-and-modify-in-place
+sub indent_files_in_place
+{
+	foreach my $file (@files)
+	{
+		my $error_flag = Perl::Tidy::perltidy(
+			source            => $file,
+            perltidyrc        => "$FindBin::RealBin/perltidyrc",
+		   );
+		if ($error_flag)
+		{
+			die "pgperltidy error!";
+		}
+	}
+}
+
+# subroutine for use with File::Find::find
+sub wanted
+{
+    my $mode;
+
+    ($mode = (lstat($_))[2]) &&
+	  -f _ &&
+	  (/^.*\.p[lm]\z/s ||
+	   (!/^.*\.dat\z/s && (($mode & 0100) == 0100) &&
+		`file $_ 2>/dev/null` =~/:.*perl[0-9]*$/i)) &&
+	   push(@files, $File::Find::name);
+}
+
+# Similar but filtering files specified in @ARGV
+sub filter_perl_files
+{
+	while (my $file = shift @ARGV)
+	{
+		if (! -f $file)
+		{
+			die "can't find $file";
+		}
+		# ignore .dat files in case they are found
+		next if $file =~ /\.dat$/;
+		if ($file =~ /\.p[lm]\z/)
+		{
+			# file has an explicit perl extension, nothing more to test
+			push @files, $file;
+			next;
+		}
+		my $mode = (lstat(_))[2];
+		# check if it's executable by owner
+		next unless (($mode & 0100) == 0100);
+		# check if `file` says it's a perl file
+		if (`file $file 2>/dev/null` =~/:.*perl[0-9]*$/i)
+		{
+			push @files, $file;
+		}
+	}
+}
+
+
+sub usage
+{
+	my $message  = shift;
+	my $helptext = <<'EOF';
+Usage:
+pgperltidy [OPTION]... [FILE]...
+Options:
+	--help                  show this message and quit
+	--show-diff             show the changes that would be made
+	--silent-diff           exit with status 2 if any changes would be made
+EOF
+	if ($help)
+	{
+		print $helptext;
+		exit 0;
+	}
+	else
+	{
+		print STDERR "$message\n", $helptext;
+		exit 1;
+	}
+}
 
-find_perl_files | xargs $PERLTIDY --profile=src/tools/pgindent/perltidyrc
#121Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#70)
Re: run pgindent on a regular basis / scripted manner

On 2023-01-22 Su 17:47, Tom Lane wrote:

Andres Freund<andres@anarazel.de> writes:

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Yeah. That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up. That's just awful. You can either
run pgperltidy on new code before committing, and accept that the feature
patch will touch a lot of lines it's not making real changes to (thereby
dirtying the "git blame" history) or not do so and thereby commit code
that's not passing tidiness checks. Let's *not* adopt any style that
causes similar things to start happening in our C code.

Modern versions of perltidy give you much more control over this, so
maybe we need to investigate the possibility of updating. See the latest
docco at
<*https://metacpan.org/dist/Perl-Tidy/view/bin/perltidy#Completely-turning-off-vertical-alignment-with-novalign&gt;
*

Probably we'd want to use something like

|--valign-exclusion-list=||'= => ,'|
||
||
|cheers|
||
|andrew|

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#122Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#121)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-01-22 Su 17:47, Tom Lane wrote:

Yeah. That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up. That's just awful.

Modern versions of perltidy give you much more control over this, so
maybe we need to investigate the possibility of updating.

I have no objection to updating perltidy from time to time. I think the
idea is just to make sure that we have an agreed-on version for everyone
to use.

regards, tom lane

#123Michael Paquier
michael@paquier.xyz
In reply to: Dean Rasheed (#115)
Re: run pgindent on a regular basis / scripted manner

On Thu, Feb 02, 2023 at 11:34:37AM +0000, Dean Rasheed wrote:

I didn't reply until now, but I'm solidly in the camp of committers
who care about keeping the tree properly indented, and I wouldn't have
any problem with such a check being imposed.

So do I. pgindent is part of my routine when it comes to all the
patches I merge on HEAD, and having to clean up unrelated diffs in
the files touched after an indentation is always annoying.

FWIW, I just use a script that does pgindent, pgperltidy, pgperlcritic
and `make reformat-dat-files` in src/include/catalog.

I regularly run pgindent locally, and if I ever commit without
indenting, it's either intentional, or because I forgot, so the
reminder would be useful.

And as someone who runs pgindent regularly, I think this will be a net
time saver, since I won't have to skip over other unrelated indent
changes all the time.

+1.
--
Michael
#124Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#122)
Re: run pgindent on a regular basis / scripted manner

On Fri, Feb 03, 2023 at 12:52:50PM -0500, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-01-22 Su 17:47, Tom Lane wrote:

Yeah. That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up. That's just awful.

Modern versions of perltidy give you much more control over this, so
maybe we need to investigate the possibility of updating.

I have no objection to updating perltidy from time to time. I think the
idea is just to make sure that we have an agreed-on version for everyone
to use.

Agreed. If we're changing the indentation of assignments, that's a
considerable diff already. It would be a good time to absorb other diffs
we'll want eventually, including diffs from a perltidy version upgrade.

#125Andres Freund
andres@anarazel.de
In reply to: Noah Misch (#124)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-02-03 20:18:48 -0800, Noah Misch wrote:

On Fri, Feb 03, 2023 at 12:52:50PM -0500, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-01-22 Su 17:47, Tom Lane wrote:

Yeah. That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up. That's just awful.

Modern versions of perltidy give you much more control over this, so
maybe we need to investigate the possibility of updating.

I have no objection to updating perltidy from time to time. I think the
idea is just to make sure that we have an agreed-on version for everyone
to use.

Agreed. If we're changing the indentation of assignments, that's a
considerable diff already. It would be a good time to absorb other diffs
we'll want eventually, including diffs from a perltidy version upgrade.

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files. So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Greetings,

Andres Freund

#126Andrew Dunstan
andrew@dunslane.net
In reply to: Andres Freund (#125)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-04 Sa 06:34, Andres Freund wrote:

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files. So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Sure, we don't have to tie them together.

I'm currently experimenting with settings on the buildfarm code, trying
to get it both stable and looking nice. Then I'll try on the Postgres
core code and post some results.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#127Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#125)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files. So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Agreed, we should move more slowly with perltidy. Aside from the
points you raise, I bet fewer committers have it installed at all.

(I haven't forgotten that I'm on the hook to import pg_bsd_indent
into our tree. Will get to that soon.)

regards, tom lane

#128Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#127)
Re: run pgindent on a regular basis / scripted manner

On Sat, Feb 04, 2023 at 11:07:59AM -0500, Tom Lane wrote:

(I haven't forgotten that I'm on the hook to import pg_bsd_indent
into our tree. Will get to that soon.)

+1 for that - it's no surprise that you have trouble convincing people
to follow the current process:

1) requires using a hacked copy of BSD indent; 2) which is stored
outside the main repo; 3) is run via a perl script that itself mungles
the source code (because the only indent tool that can support the
project's style doesn't actually support what's needed); 4) and wants to
retrieve a remote copy of typedefs.list (?).

The only thing that makes this scheme even remotely viable is that
apt.postgresql.org includes a package for pg-bsd-indent. I've used it
only a handful of times by running:
pg_bsd_indent -bad -bap -bbb -bc -bl -cli1 -cp33 -cdb -nce -d0 -di12 -nfc1 -i4 -l79 -lp -lpl -nip -npro -sac -tpg -ts4 -U .../typedefs.list

The perl wrapper is still a step too far for me (maybe it'd be tolerable
if available as a build target).

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

--
Justin

#129Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#128)
Re: run pgindent on a regular basis / scripted manner

Justin Pryzby <pryzby@telsasoft.com> writes:

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that). I could see inventing a --postgresql shortcut switch perhaps.

But it's not clear to me why you're allergic to the perl wrapper?
It's not like that's the only perl infrastructure in our build process.
Also, whether or not we could push some of what it does into pg_bsd_indent
proper, I can't see pushing all of it (for instance, the very PG-specific
list of typedef exclusions).

regards, tom lane

#130Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#126)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-04 Sa 09:20, Andrew Dunstan wrote:

On 2023-02-04 Sa 06:34, Andres Freund wrote:

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files. So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Sure, we don't have to tie them together.

I'm currently experimenting with settings on the buildfarm code,
trying to get it both stable and looking nice. Then I'll try on the
Postgres core code and post some results.

So here's a diff made from running perltidy v20221112 with the
additional setting --valign-exclusion-list=", = => || && if unless"

Essentially this abandons those bits of vertical alignment that tend to
catch us out when additions are made to the code.

I think this will make the code much more maintainable and result in
much less perltidy churn. It would also mean that it's far more likely
that what we would naturally code can be undisturbed by perltidy.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgperltidy-no-valign.patch.gzapplication/gzip; name=pgperltidy-no-valign.patch.gzDownload
#131Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#129)
Re: run pgindent on a regular basis / scripted manner

On Sat, Feb 4, 2023 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

But it's not clear to me why you're allergic to the perl wrapper?
It's not like that's the only perl infrastructure in our build process.
Also, whether or not we could push some of what it does into pg_bsd_indent
proper, I can't see pushing all of it (for instance, the very PG-specific
list of typedef exclusions).

I don't mind that there is a script. I do mind that it's not that good
of a script. There have been some improvements for which I am
grateful, like removing the thing where the first argument was taken
as a typedefs file under some circumstances. But there are still some
things that I would like:

1. I'd like to be able to run pgindent src/include and have it indent
everything relevant under src/include. Right now that silently does
nothing.

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

3. I'd also like an easy way to indent every file touched by a recent
commit, e.g. pgindent --commit HEAD, pgindent --commit HEAD~2,
pgindent --commit 62e1e28bf7.

It'd be much less annoying to include this in my workflow with these
kinds of options. For instance:

patch -p1 < ~/Downloads/some_stuff_v94.patch
# committer adjustments as desired
git add -u
pgindent --cached
git diff # did pgindent change anything? does it look ok?
git commit -a

For a while I, like some others here, was trying to be religious about
pgindenting at least the bigger commits that I pushed. But I fear I've
grown slack. I don't mind if we tighten up the process, but the better
we make the tools, the less friction it will cause.

--
Robert Haas
EDB: http://www.enterprisedb.com

#132Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#131)
Re: run pgindent on a regular basis / scripted manner

Robert Haas <robertmhaas@gmail.com> writes:

I don't mind that there is a script. I do mind that it's not that good
of a script. There have been some improvements for which I am
grateful, like removing the thing where the first argument was taken
as a typedefs file under some circumstances. But there are still some
things that I would like:

I have no objection to someone coding those things up ;-).
I'll just note that adding features like those to a Perl script
is going to be a ton easier than doing it inside pg_bsd_indent.

regards, tom lane

#133Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#131)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 09:40, Robert Haas wrote:

On Sat, Feb 4, 2023 at 12:37 PM Tom Lane<tgl@sss.pgh.pa.us> wrote:

But it's not clear to me why you're allergic to the perl wrapper?
It's not like that's the only perl infrastructure in our build process.
Also, whether or not we could push some of what it does into pg_bsd_indent
proper, I can't see pushing all of it (for instance, the very PG-specific
list of typedef exclusions).

I don't mind that there is a script. I do mind that it's not that good
of a script. There have been some improvements for which I am
grateful, like removing the thing where the first argument was taken
as a typedefs file under some circumstances. But there are still some
things that I would like:

1. I'd like to be able to run pgindent src/include and have it indent
everything relevant under src/include. Right now that silently does
nothing.

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

3. I'd also like an easy way to indent every file touched by a recent
commit, e.g. pgindent --commit HEAD, pgindent --commit HEAD~2,
pgindent --commit 62e1e28bf7.

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on
those, and look into 2.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#134Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#132)
Re: run pgindent on a regular basis / scripted manner

On Mon, Feb 6, 2023 at 10:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

I'll just note that adding features like those to a Perl script
is going to be a ton easier than doing it inside pg_bsd_indent.

+1.

--
Robert Haas
EDB: http://www.enterprisedb.com

#135Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#133)
Re: run pgindent on a regular basis / scripted manner

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

Thanks!

--
Robert Haas
EDB: http://www.enterprisedb.com

#136Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#135)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 10:36, Robert Haas wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan<andrew@dunslane.net> wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

Thanks!

Here's a quick patch for 1 and 3. Would also need to adjust the docco.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#137Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#114)
Re: run pgindent on a regular basis / scripted manner

On 02.02.23 07:40, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.

I have some concerns about this.

First, as a matter of principle, it would introduce another level of
gatekeeping power. Right now, the committers are as a group in charge
of what gets into the tree. Adding commit hooks that are installed
somewhere(?) by someone(?) and can only be seen by some(?) would upset
that. If we were using something like github or gitlab (not suggesting
that, but for illustration), then you could put this kind of thing under
.github/ or similar and then it would be under the same control as the
source code itself.

Also, pgindent takes tens of seconds to run, so hooking that into the
git push process would slow this down quite a bit. And maybe we want to
add pgperltidy and so on, where would this lead? If somehow your local
indenting doesn't give you the "correct" result for some reason, you
might sit there for minutes and minutes trying to fix and push and fix
and push.

Then, consider the typedefs issue. If you add a typedef but don't add
it to the typedefs list but otherwise pgindent your code perfectly, the
push would be accepted. If then later someone updates the typedefs
list, perhaps from the build farm, it would then reject the indentation
of your previously committed code, thus making it their problem.

I think a better way to address these issues would be making this into a
test suite, so that you can run some command that checks "is everything
indented correctly". Then you can run this locally, on the build farm,
in the cfbot etc. in a uniform way and apply the existing
blaming/encouragement processes like for any other test failure.

#138Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#136)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 12:03, Andrew Dunstan wrote:

On 2023-02-06 Mo 10:36, Robert Haas wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan<andrew@dunslane.net> wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

Thanks!

Here's a quick patch for 1 and 3. Would also need to adjust the docco.

This time with patch.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgindent-directory+git-enhancements.patchtext/x-patch; charset=UTF-8; name=pgindent-directory+git-enhancements.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 56640e576a..ca329747a4 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,12 +23,14 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, $code_base,
 	@excludes,      $indent,      $build,
-	$show_diff,     $silent_diff, $help);
+	$show_diff,     $silent_diff, $help,
+	@commits,);
 
 $help = 0;
 
 my %options = (
 	"help"               => \$help,
+	"commit=s"           => \@commits,
 	"typedefs=s"         => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
 	"code-base=s"        => \$code_base,
@@ -44,6 +46,9 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
+usage("Cannot use --commit with --code-base or command line file list")
+  if (@commits && ($code_base || @ARGV));
+
 run_build($code_base) if ($build);
 
 # command line option wins, then environment (which is how --build sets it) ,
@@ -388,6 +393,7 @@ Usage:
 pgindent [OPTION]... [FILE]...
 Options:
 	--help                  show this message and quit
+    --commit=gitref         use files modified by the named commit
 	--typedefs=FILE         file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
 	--code-base=DIR         path to the base of PostgreSQL source code
@@ -396,7 +402,7 @@ Options:
 	--build                 build the pg_bsd_indent program
 	--show-diff             show the changes that would be made
 	--silent-diff           exit with status 2 if any changes would be made
-The --excludes option can be given more than once.
+The --excludes and --commit options can be given more than once.
 EOF
 	if ($help)
 	{
@@ -412,27 +418,38 @@ EOF
 
 # main
 
-# get the list of files under code base, if it's set
-File::Find::find(
-	{
-		wanted => sub {
-			my ($dev, $ino, $mode, $nlink, $uid, $gid);
-			(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
-			  && -f _
-			  && /^.*\.[ch]\z/s
-			  && push(@files, $File::Find::name);
-		}
-	},
-	$code_base) if $code_base;
-
 $filtered_typedefs_fh = load_typedefs();
 
 check_indent();
 
-# any non-option arguments are files to be processed
-push(@files, @ARGV);
+build_clean($code_base) if $build;
 
-# the exclude list applies to command line arguments as well as found files
+my $wanted = sub
+{
+	my ($dev, $ino, $mode, $nlink, $uid, $gid);
+	(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
+	  && -f _
+	  && /^.*\.[ch]\z/s
+	  && push(@files, $File::Find::name);
+};
+
+# get the list of files under code base, if it's set
+File::Find::find({wanted => $wanted }, $code_base) if $code_base;
+
+# any non-option arguments are files or directories to be processed
+File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
+
+# process named commits by comparing each with their immediate ancestor
+foreach my $commit (@commits)
+{
+	my $prev="$commit~";
+	my @affected=`git diff-tree --no-commit-id --name-only -r $commit $prev`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
+# remove excluded files from the file list
 process_exclude();
 
 foreach my $source_filename (@files)
@@ -481,6 +498,4 @@ foreach my $source_filename (@files)
 	}
 }
 
-build_clean($code_base) if $build;
-
 exit 0;
#139Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#137)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 12:17, Peter Eisentraut wrote:

On 02.02.23 07:40, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

Regarding the concern about a pre-receive hook blocking an emergency
push, the
hook could approve every push where a string like "pgindent: no"
appears in a
commit message within the push.  You'd still want to make the tree
clean
sometime the same week or so.  It's cheap to provide a break-glass
like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread.  I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.

I have some concerns about this.

First, as a matter of principle, it would introduce another level of
gatekeeping power.  Right now, the committers are as a group in charge
of what gets into the tree.  Adding commit hooks that are installed
somewhere(?) by someone(?) and can only be seen by some(?) would upset
that.  If we were using something like github or gitlab (not
suggesting that, but for illustration), then you could put this kind
of thing under .github/ or similar and then it would be under the same
control as the source code itself.

Also, pgindent takes tens of seconds to run, so hooking that into the
git push process would slow this down quite a bit.  And maybe we want
to add pgperltidy and so on, where would this lead?  If somehow your
local indenting doesn't give you the "correct" result for some reason,
you might sit there for minutes and minutes trying to fix and push and
fix and push.

Well, pgindent should produce canonical results or we're surely doing it
wrong. Regarding the time it takes, if we are only indenting the changed
files that time will be vastly reduced for most cases.

But I take your point to some extent. I think we should start by making
it easier and quicker to run pgindent locally, both by hand and in local
git hooks, for ordinary developers and for committers, and we should
encourage committers to be stricter in their use of pgindent. If there
are features we need to make this possible, speak up (c.f. Robert's
email earlier today). I'm committed to making this as easy as possible
for people.

Once we get over those hurdles we can possibly revisit automation.

Then, consider the typedefs issue.  If you add a typedef but don't add
it to the typedefs list but otherwise pgindent your code perfectly,
the push would be accepted.  If then later someone updates the
typedefs list, perhaps from the build farm, it would then reject the
indentation of your previously committed code, thus making it their
problem.

It would be nice if there were a gadget that would find new typedefs and
warn you about them. Unfortunately our current code to find typedefs
isn't all that fast either. Nicer still would be a way of not needing
the typedefs list, but I don't think anyone has come up with one yet
that meets our other requirements.

I think a better way to address these issues would be making this into
a test suite, so that you can run some command that checks "is
everything indented correctly".  Then you can run this locally, on the
build farm, in the cfbot etc. in a uniform way and apply the existing
blaming/encouragement processes like for any other test failure.

Well arguably the new --silent-diff and --show-diff modes are such tests :-)

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#140Andres Freund
andres@anarazel.de
In reply to: Peter Eisentraut (#137)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:

First, as a matter of principle, it would introduce another level of
gatekeeping power. Right now, the committers are as a group in charge of
what gets into the tree. Adding commit hooks that are installed
somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
If we were using something like github or gitlab (not suggesting that, but
for illustration), then you could put this kind of thing under .github/ or
similar and then it would be under the same control as the source code
itself.

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Also, pgindent takes tens of seconds to run, so hooking that into the git
push process would slow this down quite a bit. And maybe we want to add
pgperltidy and so on, where would this lead?

Yes, I've been annoyed by this as well. This is painful, even without a push
hook. Not just for pgindent, headerscheck/cpluspluscheck are quite painful as
well. I came to the conclusion that we ought to integrate pgindent etc into
the buildsystem properly. Instead of running such targets serially across all
files, without logic to prevent re-processing files, the relevant targets
ought to be run once for each process, and create a stamp file.

If somehow your local indenting doesn't give you the "correct" result for
some reason, you might sit there for minutes and minutes trying to fix and
push and fix and push.

I was imagining that such a pre-receive hook would spit out the target that
you'd need to run locally to verify that the issue is resolved.

Then, consider the typedefs issue. If you add a typedef but don't add it to
the typedefs list but otherwise pgindent your code perfectly, the push would
be accepted. If then later someone updates the typedefs list, perhaps from
the build farm, it would then reject the indentation of your previously
committed code, thus making it their problem.

I'd like to address this one via the buildsystem as well. We can do the
trickery that the buildfarm uses to extract typedefs as part of the build, and
update typedefs.list with the additional types. There's really no need to
force us to do this manually.

Greetings,

Andres Freund

#141Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#140)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:

First, as a matter of principle, it would introduce another level of
gatekeeping power. Right now, the committers are as a group in charge of
what gets into the tree. Adding commit hooks that are installed
somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
If we were using something like github or gitlab (not suggesting that, but
for illustration), then you could put this kind of thing under .github/ or
similar and then it would be under the same control as the source code
itself.

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah. I don't think we are seriously considering putting any restrictions
in place on gitmaster --- the idea is to offer better tools to committers
to let them check/fix the indentation of what they are working on. If
somebody wants to run that as a local pre-commit hook, that's their choice.

regards, tom lane

#142Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#141)
Re: run pgindent on a regular basis / scripted manner

On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:

Also, pgindent takes tens of seconds to run, so hooking that into the git
push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed. The average push
need not take tens of seconds.

If somehow your local
indenting doesn't give you the "correct" result for some reason, you might
sit there for minutes and minutes trying to fix and push and fix and push.

As Andres mentioned, the hook could print the command it used. It could even
print the diff it found, for you to apply.

On Mon, Feb 06, 2023 at 04:36:07PM -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:

First, as a matter of principle, it would introduce another level of
gatekeeping power. Right now, the committers are as a group in charge of
what gets into the tree. Adding commit hooks that are installed
somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
If we were using something like github or gitlab (not suggesting that, but
for illustration), then you could put this kind of thing under .github/ or
similar and then it would be under the same control as the source code
itself.

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah. I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

#143Andrew Dunstan
andrew@dunslane.net
In reply to: Noah Misch (#142)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah. I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

That's one idea that's been put forward, but it seems clear that some
people are nervous about it.

Maybe a better course would be to continue improving the toolset and get
more people comfortable with using it locally and then talk about
integrating it upstream.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#144Amit Kapila
amit.kapila16@gmail.com
In reply to: Andrew Dunstan (#143)
Re: run pgindent on a regular basis / scripted manner

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah. I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

That's one idea that's been put forward, but it seems clear that some people are nervous about it.

Maybe a better course would be to continue improving the toolset and get more people comfortable with using it locally and then talk about integrating it upstream.

Yeah, that sounds more reasonable to me as well.

--
With Regards,
Amit Kapila.

#145Magnus Hagander
magnus@hagander.net
In reply to: Amit Kapila (#144)
Re: run pgindent on a regular basis / scripted manner

On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding

the

pre-receive we're discussing here to src/tools/something.

Yeah. I don't think we are seriously considering putting any

restrictions

in place on gitmaster

I could have sworn that was exactly what we were discussing, a

pre-receive

hook on gitmaster.

That's one idea that's been put forward, but it seems clear that some

people are nervous about it.

Maybe a better course would be to continue improving the toolset and get

more people comfortable with using it locally and then talk about
integrating it upstream.

Yeah, that sounds more reasonable to me as well.

If we wanted something "in between" we could perhaps also have a async ci
job that runs after each commit and sends an emali to the committer if the
commit doesn't match up, instead of rejecting it hard but still getting
some relatively fast feedback.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#146Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#131)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-06 Mo 09:40, Robert Haas wrote:

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

My git-fu is probably not all that it should be. I think we could
possibly get at this list of files by running

  git status --porcelain --untracked-files=no --ignored=no -- .

And then your --dirty list would be lines beginning with ' M' while your
--cached list would be lines beginning with 'A[ M]'

Does that seem plausible?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#147Andrew Dunstan
andrew@dunslane.net
In reply to: Magnus Hagander (#145)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-07 Tu 07:59, Magnus Hagander wrote:

On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan
<andrew@dunslane.net> wrote:

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Well, we did talk about adding a pre-commit hook to the

repository, with

instructions for how to enable it. And I don't see a problem

with adding the

pre-receive we're discussing here to src/tools/something.

Yeah.  I don't think we are seriously considering putting any

restrictions

in place on gitmaster

I could have sworn that was exactly what we were discussing, a

pre-receive

hook on gitmaster.

That's one idea that's been put forward, but it seems clear that

some people are nervous about it.

Maybe a better course would be to continue improving the toolset

and get more people comfortable with using it locally and then
talk about integrating it upstream.

Yeah, that sounds more reasonable to me as well.

If we wanted something "in between" we could perhaps also have a async
ci job that runs after each commit and sends an emali to the committer
if the commit doesn't match up, instead of rejecting it hard but still
getting some relatively fast feedback.

Sure, worth trying. We can always turn it off and no harm done if it
doesn't suit. I'd probably start by having it email a couple of guinea
pigs like you and me before turning it loose on committers generally.
LMK if you need help with it.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#148Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Kapila (#144)
Re: run pgindent on a regular basis / scripted manner

Amit Kapila <amit.kapila16@gmail.com> writes:

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Yeah. I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

That's one idea that's been put forward, but it seems clear that some people are nervous about it.
Maybe a better course would be to continue improving the toolset and get more people comfortable with using it locally and then talk about integrating it upstream.

Yeah, that sounds more reasonable to me as well.

+1. Even if we end up with such a hook, we need to walk before we
can run. The tooling of which we speak doesn't even exist today,
so it's unlikely to be bug-free, fast, and convenient to use
just two months from now. Let's have some people use whatever
is proposed for awhile locally, and see what their experience is.

regards, tom lane

#149Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#129)
Re: run pgindent on a regular basis / scripted manner

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that). I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports. But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "ident"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool. The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code. That could be
supported in pg-bsd-indent with a one line change. I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.

On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".

--
Justin

#150Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#149)
Re: run pgindent on a regular basis / scripted manner

Justin Pryzby <pryzby@telsasoft.com> writes:

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:

But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports. But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

[ shrug... ] If you want to put cycles into that, nobody is stopping
you. For me, it sounds like make-work.

regards, tom lane

#151Andrew Dunstan
andrew@dunslane.net
In reply to: Justin Pryzby (#149)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-07 Tu 10:25, Justin Pryzby wrote:

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:

Justin Pryzby<pryzby@telsasoft.com> writes:
Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that). I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports. But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "ident"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool. The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code. That could be
supported in pg-bsd-indent with a one line change. I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.

Historically we used to do a heck of a lot more in pgindent that is
currently done in the pre_indent and post_indent functions. If you want
to spend time implementing that logic in pg_bsd_indent so we can remove
the remaining bits of that processing then go for it.

On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".

Meh, git submodules can be a pain in the neck in my limited experience.
I'd rather steer clear.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#152Robert Haas
robertmhaas@gmail.com
In reply to: Andrew Dunstan (#146)
Re: run pgindent on a regular basis / scripted manner

On Tue, Feb 7, 2023 at 8:17 AM Andrew Dunstan <andrew@dunslane.net> wrote:

My git-fu is probably not all that it should be. I think we could possibly get at this list of files by running

git status --porcelain --untracked-files=no --ignored=no -- .

And then your --dirty list would be lines beginning with ' M' while your --cached list would be lines beginning with 'A[ M]'

Does that seem plausible?

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

So to indent files that are dirty, you would look at:

git diff --name-status

For what's cached:

git diff --name-status --cached

For the combination of the two:

git diff --name-status HEAD

For a prior commit:

git show --name-status $COMMITID

--
Robert Haas
EDB: http://www.enterprisedb.com

#153Jelte Fennema
postgres@jeltef.nl
In reply to: Robert Haas (#152)
Re: run pgindent on a regular basis / scripted manner

On Tue, 7 Feb 2023 at 17:11, Robert Haas <robertmhaas@gmail.com> wrote:

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

#154Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema (#153)
Re: run pgindent on a regular basis / scripted manner

On Tue, Feb 7, 2023 at 11:32 AM Jelte Fennema <postgres@jeltef.nl> wrote:

On Tue, 7 Feb 2023 at 17:11, Robert Haas <robertmhaas@gmail.com> wrote:

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

Thanks, that sounds nicer than what I suggested.

--
Robert Haas
EDB: http://www.enterprisedb.com

#155Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#138)
Re: run pgindent on a regular basis / scripted manner

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Here's a quick patch for 1 and 3. Would also need to adjust the docco.

This time with patch.

When supplying the --commit flag it still formats all files for me. I
was able to fix that by replacing:
# no non-option arguments given. so do everything in the current directory
$code_base ||= '.' unless @ARGV;

with:
# no files, dirs or commits given. so do everything in the current directory
$code_base ||= '.' unless @ARGV || @commits;

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

#156Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#155)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-07 Tu 12:21, Jelte Fennema wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan<andrew@dunslane.net> wrote:

Here's a quick patch for 1 and 3. Would also need to adjust the docco.

This time with patch.

When supplying the --commit flag it still formats all files for me. I
was able to fix that by replacing:
# no non-option arguments given. so do everything in the current directory
$code_base ||= '.' unless @ARGV;

with:
# no files, dirs or commits given. so do everything in the current directory
$code_base ||= '.' unless @ARGV || @commits;

Yeah, thanks for testing. Here's a new patch with that change and the
comment adjusted.

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

Probably not. I'll look into removing it.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgindent-directory+git-enhancements-v2.patchtext/x-patch; charset=UTF-8; name=pgindent-directory+git-enhancements-v2.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 56640e576a..34fb7d604d 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,12 +23,14 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, $code_base,
 	@excludes,      $indent,      $build,
-	$show_diff,     $silent_diff, $help);
+	$show_diff,     $silent_diff, $help,
+	@commits,);
 
 $help = 0;
 
 my %options = (
 	"help"               => \$help,
+	"commit=s"           => \@commits,
 	"typedefs=s"         => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
 	"code-base=s"        => \$code_base,
@@ -44,6 +46,9 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
+usage("Cannot use --commit with --code-base or command line file list")
+  if (@commits && ($code_base || @ARGV));
+
 run_build($code_base) if ($build);
 
 # command line option wins, then environment (which is how --build sets it) ,
@@ -53,8 +58,9 @@ $typedefs_file ||= $ENV{PGTYPEDEFS};
 # build mode sets PGINDENT
 $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
-# no non-option arguments given. so do everything in the current directory
-$code_base ||= '.' unless @ARGV;
+# if no non-option arguments or commits are given, default to looking in the
+# current directory
+$code_base ||= '.' unless (@ARGV || @commits);
 
 my $sourcedir = locate_sourcedir();
 
@@ -388,6 +394,7 @@ Usage:
 pgindent [OPTION]... [FILE]...
 Options:
 	--help                  show this message and quit
+    --commit=gitref         use files modified by the named commit
 	--typedefs=FILE         file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
 	--code-base=DIR         path to the base of PostgreSQL source code
@@ -396,7 +403,7 @@ Options:
 	--build                 build the pg_bsd_indent program
 	--show-diff             show the changes that would be made
 	--silent-diff           exit with status 2 if any changes would be made
-The --excludes option can be given more than once.
+The --excludes and --commit options can be given more than once.
 EOF
 	if ($help)
 	{
@@ -412,27 +419,38 @@ EOF
 
 # main
 
-# get the list of files under code base, if it's set
-File::Find::find(
-	{
-		wanted => sub {
-			my ($dev, $ino, $mode, $nlink, $uid, $gid);
-			(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
-			  && -f _
-			  && /^.*\.[ch]\z/s
-			  && push(@files, $File::Find::name);
-		}
-	},
-	$code_base) if $code_base;
-
 $filtered_typedefs_fh = load_typedefs();
 
 check_indent();
 
-# any non-option arguments are files to be processed
-push(@files, @ARGV);
+build_clean($code_base) if $build;
 
-# the exclude list applies to command line arguments as well as found files
+my $wanted = sub
+{
+	my ($dev, $ino, $mode, $nlink, $uid, $gid);
+	(($dev, $ino, $mode, $nlink, $uid, $gid) = lstat($_))
+	  && -f _
+	  && /^.*\.[ch]\z/s
+	  && push(@files, $File::Find::name);
+};
+
+# get the list of files under code base, if it's set
+File::Find::find({wanted => $wanted }, $code_base) if $code_base;
+
+# any non-option arguments are files or directories to be processed
+File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
+
+# process named commits by comparing each with their immediate ancestor
+foreach my $commit (@commits)
+{
+	my $prev="$commit~";
+	my @affected=`git diff-tree --no-commit-id --name-only -r $commit $prev`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
+# remove excluded files from the file list
 process_exclude();
 
 foreach my $source_filename (@files)
@@ -481,6 +499,4 @@ foreach my $source_filename (@files)
 	}
 }
 
-build_clean($code_base) if $build;
-
 exit 0;
#157Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#156)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-08 We 07:41, Andrew Dunstan wrote:

On 2023-02-07 Tu 12:21, Jelte Fennema wrote:

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

Probably not. I'll look into removing it.

What we should probably do is remove all the build stuff along with
$code_base. It dates back to the time when I developed this as an out of
tree replacement for the old pgindent, and is just basically wasted
space now. After I get done with the current round of enhancements I'll
reorganize the script and get rid of things we don't need any more.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#158Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#157)
Re: run pgindent on a regular basis / scripted manner

With the new patch --commit works as expected for me now. And sounds
good to up the script a bit afterwards.

Show quoted text

On Wed, 8 Feb 2023 at 14:27, Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-02-08 We 07:41, Andrew Dunstan wrote:

On 2023-02-07 Tu 12:21, Jelte Fennema wrote:

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

Probably not. I'll look into removing it.

What we should probably do is remove all the build stuff along with $code_base. It dates back to the time when I developed this as an out of tree replacement for the old pgindent, and is just basically wasted space now. After I get done with the current round of enhancements I'll reorganize the script and get rid of things we don't need any more.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#159Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#158)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-08 We 12:06, Jelte Fennema wrote:

With the new patch --commit works as expected for me now. And sounds
good to up the script a bit afterwards.

Thanks, I have committed this. Still looking at Robert's other request.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

In reply to: Andrew Dunstan (#159)
1 attachment(s)
RE: run pgindent on a regular basis / scripted manner

Hi,
I tried the committed pgindent.
The attached small patch changes spaces in the usage message to tabs.
Options other than --commit start with a tab.

Regards,
Noriyoshi Shinoda

From: Andrew Dunstan <andrew@dunslane.net>
Sent: Thursday, February 9, 2023 7:10 AM
To: Jelte Fennema <postgres@jeltef.nl>
Cc: Robert Haas <robertmhaas@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>; Andres Freund <andres@anarazel.de>; Noah Misch <noah@leadboat.com>; Peter Geoghegan <pg@bowt.ie>; Bruce Momjian <bruce@momjian.us>; Magnus Hagander <magnus@hagander.net>; Alvaro Herrera <alvherre@2ndquadrant.com>; Stephen Frost <sfrost@snowman.net>; Jesse Zhang <sbjesse@gmail.com>; pgsql-hackers@postgresql.org
Subject: Re: run pgindent on a regular basis / scripted manner

On 2023-02-08 We 12:06, Jelte Fennema wrote:

With the new patch --commit works as expected for me now. And sounds

good to up the script a bit afterwards.

Thanks, I have committed this. Still looking at Robert's other request.

cheers

andrew

--

Andrew Dunstan

EDB: https://www.enterprisedb.com&lt;https://www.enterprisedb.com&gt;

Attachments:

pgindent_usage_v1.diffapplication/octet-stream; name=pgindent_usage_v1.diffDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 34fb7d604d..07970f758c 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -394,7 +394,7 @@ Usage:
 pgindent [OPTION]... [FILE]...
 Options:
 	--help                  show this message and quit
-    --commit=gitref         use files modified by the named commit
+	--commit=gitref         use files modified by the named commit
 	--typedefs=FILE         file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
 	--code-base=DIR         path to the base of PostgreSQL source code
#161Andrew Dunstan
andrew@dunslane.net
In reply to: Shinoda, Noriyoshi (PN Japan FSIP) (#160)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-08 We 21:29, Shinoda, Noriyoshi (PN Japan FSIP) wrote:

Hi,
I tried the committed pgindent.
The attached small patch changes spaces in the usage message to tabs.
Options other than --commit start with a tab.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#162shiy.fnst@fujitsu.com
shiy.fnst@fujitsu.com
In reply to: Andrew Dunstan (#159)
RE: run pgindent on a regular basis / scripted manner

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Thanks, I have committed this. Still looking at Robert's other request.

Hi,

I tried the new option --commit and found that it seems to try to indent files
which are deleted in the specified commit and reports an error.

cannot open file "src/backend/access/brin/brin.c": No such file or directory

It looks we should filter such files.

Regards,
Shi Yu

#163Jelte Fennema
postgres@jeltef.nl
In reply to: shiy.fnst@fujitsu.com (#162)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.

On Fri, 10 Feb 2023 at 03:37, shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:

Show quoted text

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Thanks, I have committed this. Still looking at Robert's other request.

Hi,

I tried the new option --commit and found that it seems to try to indent files
which are deleted in the specified commit and reports an error.

cannot open file "src/backend/access/brin/brin.c": No such file or directory

It looks we should filter such files.

Regards,
Shi Yu

Attachments:

0001-Fix-pgindent-commit-flag-and-add-dirty-and-staged-fl.patchapplication/octet-stream; name=0001-Fix-pgindent-commit-flag-and-add-dirty-and-staged-fl.patchDownload
From 72b448450142ac788c686e9bd96d5b604649a169 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Fri, 10 Feb 2023 10:14:07 +0100
Subject: [PATCH] Fix pgindent commit flag and add dirty and staged flags

Fixes a bug in --commit where pgindent would try to indent deleted
files. Also implement a --staged and a --dirty flag, which were
requested by Robert Haas.
---
 src/tools/pgindent/pgindent | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 07970f758c2..9245e06d04b 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -24,7 +24,7 @@ my $devnull = File::Spec->devnull;
 my ($typedefs_file, $typedef_str, $code_base,
 	@excludes,      $indent,      $build,
 	$show_diff,     $silent_diff, $help,
-	@commits,);
+	@commits,       $dirty,       $staged);
 
 $help = 0;
 
@@ -37,6 +37,8 @@ my %options = (
 	"excludes=s"         => \@excludes,
 	"indent=s"           => \$indent,
 	"build"              => \$build,
+	"dirty"              => \$dirty,
+	"staged"             => \$staged,
 	"show-diff"          => \$show_diff,
 	"silent-diff"        => \$silent_diff,);
 GetOptions(%options) || usage("bad command line argument");
@@ -60,7 +62,7 @@ $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
 # if no non-option arguments or commits are given, default to looking in the
 # current directory
-$code_base ||= '.' unless (@ARGV || @commits);
+$code_base ||= '.' unless (@ARGV || @commits || $dirty || $staged);
 
 my $sourcedir = locate_sourcedir();
 
@@ -401,6 +403,8 @@ Options:
 	--excludes=PATH         file containing list of filename patterns to ignore
 	--indent=PATH           path to pg_bsd_indent program
 	--build                 build the pg_bsd_indent program
+	--staged                use files that are staged for commit
+	--dirty                 use files that are modified (but not staged)
 	--show-diff             show the changes that would be made
 	--silent-diff           exit with status 2 if any changes would be made
 The --excludes and --commit options can be given more than once.
@@ -444,7 +448,23 @@ File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
 foreach my $commit (@commits)
 {
 	my $prev="$commit~";
-	my @affected=`git diff-tree --no-commit-id --name-only -r $commit $prev`;
+	my @affected=`git diff --diff-filter=ACMR --name-only $prev $commit`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
+if ($staged)
+{
+	my @affected=`git diff --diff-filter=ACMR --name-only --staged`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
+if ($dirty)
+{
+	my @affected=`git diff --diff-filter=ACMR --name-only`;
 	die "git error" if $?;
 	chomp(@affected);
 	push(@files,@affected);
-- 
2.34.1

#164Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#153)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-07 Tu 11:32, Jelte Fennema wrote:

On Tue, 7 Feb 2023 at 17:11, Robert Haas<robertmhaas@gmail.com> wrote:

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

OK, here's a patch based on Robert's and Jelte's ideas.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgindent-dirty-cached.patchtext/x-patch; charset=UTF-8; name=pgindent-dirty-cached.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 07970f758c..c9f42302c9 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -24,13 +24,15 @@ my $devnull = File::Spec->devnull;
 my ($typedefs_file, $typedef_str, $code_base,
 	@excludes,      $indent,      $build,
 	$show_diff,     $silent_diff, $help,
-	@commits,);
+	@commits, $dirty, $cached);
 
 $help = 0;
 
 my %options = (
 	"help"               => \$help,
 	"commit=s"           => \@commits,
+	"dirty"              => \$dirty,
+	"cached"             => \$cached,
 	"typedefs=s"         => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
 	"code-base=s"        => \$code_base,
@@ -46,8 +48,11 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
-usage("Cannot use --commit with --code-base or command line file list")
-  if (@commits && ($code_base || @ARGV));
+usage("Cannot use --commit, --dirty or --cached with --code-base or command line file list")
+  if (($dirty || $cached || @commits) && ($code_base || @ARGV));
+
+usage( "Cannot use --dirty or --cached with --commit")
+  if (($dirty || $cached) && @commits);
 
 run_build($code_base) if ($build);
 
@@ -395,6 +400,8 @@ pgindent [OPTION]... [FILE]...
 Options:
 	--help                  show this message and quit
 	--commit=gitref         use files modified by the named commit
+	--dirty                 use modified files under current directory
+	--cached                use staged files under current directory
 	--typedefs=FILE         file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
 	--code-base=DIR         path to the base of PostgreSQL source code
@@ -450,6 +457,19 @@ foreach my $commit (@commits)
 	push(@files,@affected);
 }
 
+# look for dirty / cached files under the CWD
+if ($dirty || $cached)
+{
+	my $gitcmd = "git diff --diff-filter=ACMR --name-only";
+	$gitcmd .= " --cached" if ($cached && !$dirty);
+	$gitcmd .= " HEAD" if ($dirty && $cached);
+	$gitcmd .= " -- .";
+	my @affected=`$gitcmd`;
+	die "git error" if $?;
+	chomp(@affected);
+	push(@files,@affected);
+}
+
 # remove excluded files from the file list
 process_exclude();
 
#165Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#163)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-10 Fr 04:25, Jelte Fennema wrote:

Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.

[please don't top-post]

I don't think just adding a diff filter is really a sufficient fix. The
file might have been deleted since the commit(s) in question. Here's a
more general fix for missing files.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgindent-missing-file-fix.patchtext/x-patch; charset=UTF-8; name=pgindent-missing-file-fix.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 07970f758c..64284e7ea2 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -444,7 +444,7 @@ File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
 foreach my $commit (@commits)
 {
 	my $prev="$commit~";
-	my @affected=`git diff-tree --no-commit-id --name-only -r $commit $prev`;
+	my @affected=`git diff --diff-filter=ACMR --name-only $prev $commit`;
 	die "git error" if $?;
 	chomp(@affected);
 	push(@files,@affected);
@@ -458,6 +458,12 @@ foreach my $source_filename (@files)
 	# ignore anything that's not a .c or .h file
 	next unless $source_filename =~ /\.[ch]$/;
 
+	# don't try to indent a file that doesn't exist
+	unless (-f $source_filename)
+	{
+		warn "Could not find $source_filename";
+		next;
+	}
 	# Automatically ignore .c and .h files that correspond to a .y or .l
 	# file.  indent tends to get badly confused by Bison/flex output,
 	# and there's no value in indenting derived files anyway.
#166Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#165)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-10 Fr 10:21, Andrew Dunstan wrote:

On 2023-02-10 Fr 04:25, Jelte Fennema wrote:

Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.

I don't think just adding a diff filter is really a sufficient fix.
The file might have been deleted since the commit(s) in question.
Here's a more general fix for missing files.

OK, I've pushed this along with a check to make sure we only process
each file once.

I'm not sure how much more I really want to do here. Given the way
pgindent now processes command line arguments, maybe the best thing is
for people to use that. Use of git aliases can help. Something like
these for example

[alias]

    dirty = diff --name-only --diff-filter=ACMU -- .
    staged = diff --name-only --cached --diff-filter=ACMU -- .
    dstaged = diff --name-only --diff-filter=ACMU HEAD -- .

and then you could do

    pgindent `git dirty`

The only danger would be if there were no dirty files. Maybe we need a
switch to inhibit using the current directory if there are no command
line files.

Thoughts?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#167Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#166)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

... then you could do
    pgindent `git dirty`
The only danger would be if there were no dirty files. Maybe we need a
switch to inhibit using the current directory if there are no command
line files.

It seems like "indent the whole tree" is about to become a minority
use-case. Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

regards, tom lane

#168Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#167)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-12 Su 11:24, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

... then you could do
    pgindent `git dirty`
The only danger would be if there were no dirty files. Maybe we need a
switch to inhibit using the current directory if there are no command
line files.

It seems like "indent the whole tree" is about to become a minority
use-case. Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

I don't think we need --all-files. The attached gets rid of the build
and code-base cruft, which is now in any case obsolete given we've put
pg_bsd_indent in our code base. So the way to spell this instead of
"pgindent --all-files" would be "pgindent ."

I added a warning if there are no files at all specified.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

Attachments:

pgindent-remove-build-and-code-base.patchtext/x-patch; charset=UTF-8; name=pgindent-remove-build-and-code-base.patchDownload
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index a793971e07..b098cef02a 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -21,7 +21,7 @@ my $indent_opts =
 
 my $devnull = File::Spec->devnull;
 
-my ($typedefs_file, $typedef_str, $code_base,
+my ($typedefs_file, $typedef_str,
 	@excludes,      $indent,      $build,
 	$show_diff,     $silent_diff, $help,
 	@commits,);
@@ -33,10 +33,8 @@ my %options = (
 	"commit=s"           => \@commits,
 	"typedefs=s"         => \$typedefs_file,
 	"list-of-typedefs=s" => \$typedef_str,
-	"code-base=s"        => \$code_base,
 	"excludes=s"         => \@excludes,
 	"indent=s"           => \$indent,
-	"build"              => \$build,
 	"show-diff"          => \$show_diff,
 	"silent-diff"        => \$silent_diff,);
 GetOptions(%options) || usage("bad command line argument");
@@ -46,22 +44,16 @@ usage() if $help;
 usage("Cannot have both --silent-diff and --show-diff")
   if $silent_diff && $show_diff;
 
-usage("Cannot use --commit with --code-base or command line file list")
-  if (@commits && ($code_base || @ARGV));
+usage("Cannot use --commit with command line file list")
+  if (@commits && @ARGV);
 
-run_build($code_base) if ($build);
-
-# command line option wins, then environment (which is how --build sets it) ,
-# then locations. based on current dir, then default location
+# command line option wins, then environment, then locations based on current
+# dir, then default location
 $typedefs_file ||= $ENV{PGTYPEDEFS};
 
-# build mode sets PGINDENT
+# get indent location for environment or default
 $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
-# if no non-option arguments or commits are given, default to looking in the
-# current directory
-$code_base ||= '.' unless (@ARGV || @commits);
-
 my $sourcedir = locate_sourcedir();
 
 # if it's the base of a postgres tree, we will exclude the files
@@ -121,8 +113,7 @@ sub check_indent
 sub locate_sourcedir
 {
 	# try fairly hard to locate the sourcedir
-	my $where = $code_base || '.';
-	my $sub = "$where/src/tools/pgindent";
+	my $sub = "./src/tools/pgindent";
 	return $sub if -d $sub;
 	# try to find it from an ancestor directory
 	$sub = "../src/tools/pgindent";
@@ -320,72 +311,6 @@ sub show_diff
 	return $diff;
 }
 
-sub run_build
-{
-	eval "use LWP::Simple;";    ## no critic (ProhibitStringyEval);
-
-	my $code_base = shift || '.';
-	my $save_dir = getcwd();
-
-	# look for the code root
-	foreach (1 .. 5)
-	{
-		last if -d "$code_base/src/tools/pgindent";
-		$code_base = "$code_base/..";
-	}
-
-	die "cannot locate src/tools/pgindent directory in \"$code_base\"\n"
-	  unless -d "$code_base/src/tools/pgindent";
-
-	chdir "$code_base/src/tools/pgindent";
-
-	my $typedefs_list_url =
-	  "https://buildfarm.postgresql.org/cgi-bin/typedefs.pl";
-
-	my $rv = getstore($typedefs_list_url, "tmp_typedefs.list");
-
-	die "cannot fetch typedefs list from $typedefs_list_url\n"
-	  unless is_success($rv);
-
-	$ENV{PGTYPEDEFS} = abs_path('tmp_typedefs.list');
-
-	my $indentrepo = "https://git.postgresql.org/git/pg_bsd_indent.git";
-	system("git clone $indentrepo >$devnull 2>&1");
-	die "could not fetch pg_bsd_indent sources from $indentrepo\n"
-	  unless $? == 0;
-
-	chdir "pg_bsd_indent" || die;
-	system("make all check >$devnull");
-	die "could not build pg_bsd_indent from source\n"
-	  unless $? == 0;
-
-	$ENV{PGINDENT} = abs_path('pg_bsd_indent');
-
-	chdir $save_dir;
-	return;
-}
-
-sub build_clean
-{
-	my $code_base = shift || '.';
-
-	# look for the code root
-	foreach (1 .. 5)
-	{
-		last if -d "$code_base/src/tools/pgindent";
-		$code_base = "$code_base/..";
-	}
-
-	die "cannot locate src/tools/pgindent directory in \"$code_base\"\n"
-	  unless -d "$code_base/src/tools/pgindent";
-
-	chdir "$code_base";
-
-	system("rm -rf src/tools/pgindent/pg_bsd_indent");
-	system("rm -f src/tools/pgindent/tmp_typedefs.list");
-	return;
-}
-
 sub usage
 {
 	my $message  = shift;
@@ -397,10 +322,8 @@ Options:
 	--commit=gitref         use files modified by the named commit
 	--typedefs=FILE         file containing a list of typedefs
 	--list-of-typedefs=STR  string containing typedefs, space separated
-	--code-base=DIR         path to the base of PostgreSQL source code
 	--excludes=PATH         file containing list of filename patterns to ignore
 	--indent=PATH           path to pg_bsd_indent program
-	--build                 build the pg_bsd_indent program
 	--show-diff             show the changes that would be made
 	--silent-diff           exit with status 2 if any changes would be made
 The --excludes and --commit options can be given more than once.
@@ -423,8 +346,6 @@ $filtered_typedefs_fh = load_typedefs();
 
 check_indent();
 
-build_clean($code_base) if $build;
-
 my $wanted = sub
 {
 	my ($dev, $ino, $mode, $nlink, $uid, $gid);
@@ -434,9 +355,6 @@ my $wanted = sub
 	  && push(@files, $File::Find::name);
 };
 
-# get the list of files under code base, if it's set
-File::Find::find({wanted => $wanted }, $code_base) if $code_base;
-
 # any non-option arguments are files or directories to be processed
 File::Find::find({wanted => $wanted}, @ARGV) if @ARGV;
 
@@ -450,6 +368,8 @@ foreach my $commit (@commits)
 	push(@files,@affected);
 }
 
+warn "No files to process" unless @files;
+
 # remove excluded files from the file list
 process_exclude();
 
diff --git a/src/tools/pgindent/pgindent.man b/src/tools/pgindent/pgindent.man
index 7406794ba3..fe411ee699 100644
--- a/src/tools/pgindent/pgindent.man
+++ b/src/tools/pgindent/pgindent.man
@@ -8,14 +8,9 @@ You can see all the options by running:
 	pgindent --help
 
 In its simplest form, if all the required objects are installed, simply run
-it without any parameters at the top of the source tree you want to process.
+it at the top of the source tree you want to process like this:
 
-	pgindent
-
-If you don't have all the requirements installed, pgindent will fetch and build
-them for you, if you're in a PostgreSQL source tree:
-
-	pgindent --build
+	pgindent .
 
 If your pg_bsd_indent program is not installed in your path, you can specify
 it by setting the environment variable INDENT, or PGINDENT, or by giving the
@@ -28,9 +23,6 @@ specified using the PGTYPEDEFS environment variable, or via the command line
 --typedefs option. If neither is used, it will look for it within the
 current source tree, or in /usr/local/etc/typedefs.list.
 
-If you want to indent a source tree other than the current working directory,
-you can specify it via the --code-base command line option.
-
 We don't want to indent certain files in the PostgreSQL source. pgindent
 will honor a file containing a list of patterns of files to avoid. This
 file can be specified using the --excludes command line option. If indenting
#169Justin Pryzby
pryzby@telsasoft.com
In reply to: Tom Lane (#167)
Re: run pgindent on a regular basis / scripted manner

On Sun, Feb 12, 2023 at 11:24:14AM -0500, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

... then you could do
��� pgindent `git dirty`
The only danger would be if there were no dirty files. Maybe we need a
switch to inhibit using the current directory if there are no command
line files.

It seems like "indent the whole tree" is about to become a minority
use-case. Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

It seems like if pgindent knows about git, it ought to process only
tracked files. Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

At least --commit doesn't seem to work when run outside of the root
source dir.

--
Justin

#170Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#168)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-02-12 Su 11:24, Tom Lane wrote:

It seems like "indent the whole tree" is about to become a minority
use-case. Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

I don't think we need --all-files. The attached gets rid of the build
and code-base cruft, which is now in any case obsolete given we've put
pg_bsd_indent in our code base. So the way to spell this instead of
"pgindent --all-files" would be "pgindent ."

Ah, of course.

I added a warning if there are no files at all specified.

LGTM.

regards, tom lane

#171Andrew Dunstan
andrew@dunslane.net
In reply to: Justin Pryzby (#169)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-12 Su 15:59, Justin Pryzby wrote:

It seems like if pgindent knows about git, it ought to process only
tracked files. Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

for vpath builds use an exclude file that excludes the vpath you use.

I don't really want restrict this to tracked files because it would mean
you can't pgindent files before you `git add` them. And we would still
need to do manual exclusion for some files that are tracked, e.g. the
snowball files.

At least --commit doesn't seem to work when run outside of the root
source dir.

Yeah, I'll fix that, thanks for mentioning.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#172Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#170)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-12 Su 16:13, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-02-12 Su 11:24, Tom Lane wrote:

It seems like "indent the whole tree" is about to become a minority
use-case. Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

I don't think we need --all-files. The attached gets rid of the build
and code-base cruft, which is now in any case obsolete given we've put
pg_bsd_indent in our code base. So the way to spell this instead of
"pgindent --all-files" would be "pgindent ."

Ah, of course.

I added a warning if there are no files at all specified.

LGTM.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#173Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#166)
Re: run pgindent on a regular basis / scripted manner

On Sun, 12 Feb 2023 at 15:16, Andrew Dunstan <andrew@dunslane.net> wrote:

I'm not sure how much more I really want to do here. Given the way pgindent now processes command line arguments, maybe the best thing is for people to use that. Use of git aliases can help. Something like these for example

[alias]

dirty = diff --name-only --diff-filter=ACMU -- .
staged = diff --name-only --cached --diff-filter=ACMU -- .
dstaged = diff --name-only --diff-filter=ACMU HEAD -- .

and then you could do

pgindent `git dirty`

The only danger would be if there were no dirty files. Maybe we need a switch to inhibit using the current directory if there are no command line files.

Thoughts?

I think indenting staged or dirty files is probably the most common
operation that people want to do with pgindent. So I think that having
dedicated flags makes sense. I agree that it's not strictly necessary
and git aliases help a lot. But the git aliases require you to set
them up. To me making the most common operation as easy as possible to
do, seems worth the few extra lines to pgindent.

Sidenote: You mentioned untracked files in another email. I think that
the --dirty flag should probably also include untracked files. A
command to do so is: git ls-files --others --exclude-standard

#174Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#173)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-13 Mo 09:02, Jelte Fennema wrote:

On Sun, 12 Feb 2023 at 15:16, Andrew Dunstan<andrew@dunslane.net> wrote:

I'm not sure how much more I really want to do here. Given the way pgindent now processes command line arguments, maybe the best thing is for people to use that. Use of git aliases can help. Something like these for example

[alias]

dirty = diff --name-only --diff-filter=ACMU -- .
staged = diff --name-only --cached --diff-filter=ACMU -- .
dstaged = diff --name-only --diff-filter=ACMU HEAD -- .

and then you could do

pgindent `git dirty`

The only danger would be if there were no dirty files. Maybe we need a switch to inhibit using the current directory if there are no command line files.

Thoughts?

I think indenting staged or dirty files is probably the most common
operation that people want to do with pgindent. So I think that having
dedicated flags makes sense. I agree that it's not strictly necessary
and git aliases help a lot. But the git aliases require you to set
them up. To me making the most common operation as easy as possible to
do, seems worth the few extra lines to pgindent.

OK, but I'd like to hear from more people about what they want.
Experience tells me that making assumptions about how people work is not
a good idea. I doubt anyone's work pattern is like mine. I don't want to
implement an option that three people are going to use.

Sidenote: You mentioned untracked files in another email. I think that
the --dirty flag should probably also include untracked files. A
command to do so is: git ls-files --others --exclude-standard

Thanks for the info.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#175Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#174)
Re: run pgindent on a regular basis / scripted manner

On Mon, 13 Feb 2023 at 17:47, Andrew Dunstan <andrew@dunslane.net> wrote:

OK, but I'd like to hear from more people about what they want. Experience tells me that making assumptions about how people work is not a good idea. I doubt anyone's work pattern is like mine. I don't want to implement an option that three people are going to use.

In the general case I agree with you. But in this specific case I
don't. To me the whole point of this email thread is to nudge people
towards indenting the changes that they are committing. Thus indenting
those changes (either before or after adding) is the workflow that we
want to make as easy as possible. Because even if it's not people
their current workflow, by adding the flag it hopefully becomes their
workflow, because it's so easy to use. So my point is we want to
remove as few hurdles as possible for people to indent their changes
(and setting up git aliases or pre-commit hooks are all hurdles).

#176Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#175)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-13 Mo 13:29, Jelte Fennema wrote:

On Mon, 13 Feb 2023 at 17:47, Andrew Dunstan<andrew@dunslane.net> wrote:

OK, but I'd like to hear from more people about what they want. Experience tells me that making assumptions about how people work is not a good idea. I doubt anyone's work pattern is like mine. I don't want to implement an option that three people are going to use.

In the general case I agree with you. But in this specific case I
don't. To me the whole point of this email thread is to nudge people
towards indenting the changes that they are committing. Thus indenting
those changes (either before or after adding) is the workflow that we
want to make as easy as possible. Because even if it's not people
their current workflow, by adding the flag it hopefully becomes their
workflow, because it's so easy to use. So my point is we want to
remove as few hurdles as possible for people to indent their changes
(and setting up git aliases or pre-commit hooks are all hurdles).

(ITYM "remove as many hurdles as possible"). It remains to be seen how
much easier any of this will make life for committers, at least. But I
concede this might make life a bit simpler for developers generally.

Anyway, let's talk about the details of what is proposed.

So far, we have had the following categories suggested: dirty, staged,
dirty+staged, untracked. Are there any others?

Another issue is whether or not to restrict these to files under the
current directory. I think we probably should, or at least provide a
--relative option.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#177Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#176)
Re: run pgindent on a regular basis / scripted manner

(ITYM "remove as many hurdles as possible").

yes, I messed up rewriting that sentence from "having as few hurdles
as possible" to "removing as many hurdles as possible"

So far, we have had the following categories suggested: dirty, staged, dirty+staged, untracked. Are there any others?

The two workflows that make most sense to me personally are:
1. staged (indent anything that you're staging for a commit)
2. dirty+staged+untracked (indent anything you've been working on that
is not committed yet)

The obvious way of having --dirty, --staged, and --untracked flags
would require 3 flags for this second (to me seemingly) common
operation. That seems quite unfortunate. So I would propose the
following two flags for those purposes:
1. --staged/--cached (--cached is in line with git, but I personally
think --staged is clearer, git has --staged-only but that seems long
for no reason)
2. --uncommitted

And maybe for completeness we could have the following flags, so you
could target any combination of staged/untracked/dirty files:
3. --untracked (untracked files only)
4. --dirty (tracked files with changes that are not staged)

But I don't know in what workflow people would actually use them.

Another issue is whether or not to restrict these to files under the current directory. I think we probably should, or at least provide a --relative option.

Good point, I think it makes sense to restrict it to the current
directory by default. You can always cd to the root of the repo if you
want to format everything.

#178Justin Pryzby
pryzby@telsasoft.com
In reply to: Andrew Dunstan (#171)
Re: run pgindent on a regular basis / scripted manner

On Mon, Feb 13, 2023 at 07:57:25AM -0500, Andrew Dunstan wrote:

On 2023-02-12 Su 15:59, Justin Pryzby wrote:

It seems like if pgindent knows about git, it ought to process only
tracked files. Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

I don't really want restrict this to tracked files because it would mean you
can't pgindent files before you `git add` them.

I think you'd allow indenting files which were either tracked *or*
specified on the command line.

Also, it makes a more sense to "add" the file before indenting it, to
allow checking the output and remove unrelated changes. So that doesn't
seem to me like a restriction of any significance.

But I would never want to indent an untracked file unless I specified
it.

--
Justin

#179Jelte Fennema
postgres@jeltef.nl
In reply to: Justin Pryzby (#178)
Re: run pgindent on a regular basis / scripted manner

Also, it makes a more sense to "add" the file before indenting it, to
allow checking the output and remove unrelated changes. So that doesn't
seem to me like a restriction of any significance.

For my workflow it would be the same, but afaik there's two ways that
people commonly use git (mine is 1):
1. Adding changes/files to the staging area using and then committing
those changes:
git add (-p)/emacs magit/some other editor integration
2. Just add everything that's changed and commit all of it:
git add -A + git commit/git commit -a

For workflow 1, a --staged/--cached flag would be enough IMHO. But
that's not at all helpful for workflow 2. That's why I proposed
--uncommitted too, to make indenting easier for workflow 2.

But I would never want to indent an untracked file unless I specified
it.

Would the --uncommitted flag I proposed be enough of an explicit way
of specifying that you want to indent untracked files?

#180Noah Misch
noah@leadboat.com
In reply to: Justin Pryzby (#178)
Re: run pgindent on a regular basis / scripted manner

On Wed, Feb 15, 2023 at 12:45:52PM -0600, Justin Pryzby wrote:

On Mon, Feb 13, 2023 at 07:57:25AM -0500, Andrew Dunstan wrote:

On 2023-02-12 Su 15:59, Justin Pryzby wrote:

It seems like if pgindent knows about git, it ought to process only
tracked files. Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

I don't really want restrict this to tracked files because it would mean you
can't pgindent files before you `git add` them.

I think you'd allow indenting files which were either tracked *or*
specified on the command line.

Also, it makes a more sense to "add" the file before indenting it, to
allow checking the output and remove unrelated changes. So that doesn't
seem to me like a restriction of any significance.

But I would never want to indent an untracked file unless I specified
it.

Agreed. I use pgindent three ways:

1. Indent everything that changed between master and the current branch. Most
common, since I develop nontrivial patches on branches.
2. Indent all staged files. For trivial changes.
3. Indent all tracked files. For typedefs.list changes.

That said, pre-2023 pgindent changed untracked files if called without a file
list. I've lived with that and could continue to do so.

#181shiy.fnst@fujitsu.com
shiy.fnst@fujitsu.com
In reply to: Andrew Dunstan (#159)
1 attachment(s)
RE: run pgindent on a regular basis / scripted manner

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Thanks, I have committed this. Still looking at Robert's other request.

Hi,

Commit #068a243b7 supported directories to be non-option arguments of pgindent.
But the help text doesn't mention that. Should we update it? Attach a small
patch which did that.

Regards,
Shi Yu

Attachments:

v1-0001-Update-help-text-of-pgindent.patchapplication/octet-stream; name=v1-0001-Update-help-text-of-pgindent.patchDownload
From 4782ac969a002d64e55083a526d9466d491846ad Mon Sep 17 00:00:00 2001
From: Shi Yu <shiy.fnst@fujitsu.com>
Date: Thu, 16 Feb 2023 16:27:04 +0800
Subject: [PATCH v1] Update help text of pgindent

---
 src/tools/pgindent/pgindent | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index 8a2d39c4f6..3c48f3c46b 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -316,7 +316,7 @@ sub usage
 	my $message  = shift;
 	my $helptext = <<'EOF';
 Usage:
-pgindent [OPTION]... [FILE]...
+pgindent [OPTION]... [FILE|DIR]...
 Options:
 	--help                  show this message and quit
 	--commit=gitref         use files modified by the named commit
-- 
2.31.1

#182Andrew Dunstan
andrew@dunslane.net
In reply to: shiy.fnst@fujitsu.com (#181)
Re: run pgindent on a regular basis / scripted manner

On 2023-02-16 Th 03:26, shiy.fnst@fujitsu.com wrote:

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan<andrew@dunslane.net> wrote:

Thanks, I have committed this. Still looking at Robert's other request.

Hi,

Commit #068a243b7 supported directories to be non-option arguments of pgindent.
But the help text doesn't mention that. Should we update it? Attach a small
patch which did that.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#183Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#182)
Re: run pgindent on a regular basis / scripted manner

Now that the PG16 feature freeze happened I think it's time to bump
this thread again. As far as I remember everyone that responded (even
previously silent people) were themselves proponents of being more
strict around pgindent.

I think there's two things needed to actually start doing this:
1. We need to reindent the tree to create an indented baseline
2. We need some automation to complain about unindented code being committed

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b. If that doesn't achieve 2 for some reason, or turns
out to have problems we can always change to a afterwards.

#184Michael Paquier
michael@paquier.xyz
In reply to: Jelte Fennema (#183)
Re: run pgindent on a regular basis / scripted manner

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them. Anyway, isn't 'a'
exactly the same as 'b' in design? Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched. I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.
--
Michael

#185Andrew Dunstan
andrew@dunslane.net
In reply to: Michael Paquier (#184)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them. Anyway, isn't 'a'
exactly the same as 'b' in design? Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched. I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.

Perhaps we should start with a buildfarm module, which would run
pg_indent --show-diff. That would only need to run on one animal, so a
failure wouldn't send the whole buildfarm red. This would be pretty easy
to do.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#186Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#185)
Re: run pgindent on a regular basis / scripted manner

On Sat, Apr 22, 2023 at 07:42:36AM -0400, Andrew Dunstan wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent
--show-diff.

Nice, I didn't know this one and it has been mentioned a bit on this
thread. Indeed, it is possible to just rely on that.
--
Michael

#187Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#185)
Re: run pgindent on a regular basis / scripted manner

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them. Anyway, isn't 'a'
exactly the same as 'b' in design? Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched. I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear, you guys are aware we already have a git repo
that's supposed to track "head + pg_indent" at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
right?

I see it is currently not working and this has not been noticed by
anyone, so I guess it kind of indicates nobody is using it today. The
reason appears to be that it uses pg_bsd_indent that's in our apt
repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
service that would actually be useful, this could certainly be ficked
pretty easy.

But bottom line is that if pgindent is as predictable as it should be,
it might be easier to use that one central place that already does it
rather than have to build a buildfarm module?

//Magnus

#188Magnus Hagander
magnus@hagander.net
In reply to: Noah Misch (#142)
Re: run pgindent on a regular basis / scripted manner

On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:

On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:

Also, pgindent takes tens of seconds to run, so hooking that into the git
push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed. The average push
need not take tens of seconds.

It would probably ont be tens of seconds, but it would be slow. It
would need to do a clean git checkout into an isolated environment and
spawn in there, and just that takes time. And it would have to also
know to rebuild pg_bsd_indent on demand, which would require a full
./configure run (or meson equivalent). etc.

So while it might not be tens of seconds, it most definitely won't be fast.

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#189Andrew Dunstan
andrew@dunslane.net
In reply to: Magnus Hagander (#187)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 08:47, Magnus Hagander wrote:

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan<andrew@dunslane.net> wrote:

On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them. Anyway, isn't 'a'
exactly the same as 'b' in design? Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched. I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear, you guys are aware we already have a git repo
that's supposed to track "head + pg_indent" at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
right?

I see it is currently not working and this has not been noticed by
anyone, so I guess it kind of indicates nobody is using it today. The
reason appears to be that it uses pg_bsd_indent that's in our apt
repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
service that would actually be useful, this could certainly be ficked
pretty easy.

But bottom line is that if pgindent is as predictable as it should be,
it might be easier to use that one central place that already does it
rather than have to build a buildfarm module?

Now that pg_bsd_indent is in the core code why not just use that?

Happy if you can make something work without further effort on my part :-)

cheers

andew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#190Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#187)
Re: run pgindent on a regular basis / scripted manner

Magnus Hagander <magnus@hagander.net> writes:

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

I am absolutely against a pre-receive hook on gitmaster. A buildfarm
check seems far more appropriate.

regards, tom lane

#191Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema (#183)
Re: run pgindent on a regular basis / scripted manner

Jelte Fennema <postgres@jeltef.nl> writes:

I think there's two things needed to actually start doing this:
1. We need to reindent the tree to create an indented baseline

As far as (1) goes, I've been holding off on that because there
are some large patches that still seem in danger of getting
reverted, notably 2489d76c4 and follow-ups. A pgindent run
would change any such reversions from being mechanical into
possibly a fair bit of work. We still have a couple of weeks
before it's necessary to make such decisions, so I don't want
to do the pgindent run before that.

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options. But I don't believe
we have a final proposal on that, much less committed code.

regards, tom lane

#192Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#191)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 10:39, Tom Lane wrote:

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options. But I don't believe
we have a final proposal on that, much less committed code.

Well, I posted a fairly concrete suggestion with an example patch
upthread at

</messages/by-id/47011581-ddec-1a87-6828-6edfabe6b7b6@dunslane.net&gt;

I still think that's worth doing.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#193Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#192)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-22 Sa 10:39, Tom Lane wrote:

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options. But I don't believe
we have a final proposal on that, much less committed code.

Well, I posted a fairly concrete suggestion with an example patch
upthread at
</messages/by-id/47011581-ddec-1a87-6828-6edfabe6b7b6@dunslane.net&gt;
I still think that's worth doing.

OK, so plan is (a) update perltidyrc to add --valign-exclusion-list,
(b) adjust pgindent/README to recommend perltidy version 20221112.

Questions:

* I see that there's now a 20230309 release, should we consider that
instead?

* emacs.samples provides pgsql-perl-style that claims to match
perltidy's rules. Does that need any adjustment? I don't see
anything in it that looks relevant, but I'm not terribly familiar
with emacs' Perl formatting options.

regards, tom lane

#194Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#193)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 11:37, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-04-22 Sa 10:39, Tom Lane wrote:

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options. But I don't believe
we have a final proposal on that, much less committed code.

Well, I posted a fairly concrete suggestion with an example patch
upthread at
</messages/by-id/47011581-ddec-1a87-6828-6edfabe6b7b6@dunslane.net&gt;
I still think that's worth doing.

OK, so plan is (a) update perltidyrc to add --valign-exclusion-list,
(b) adjust pgindent/README to recommend perltidy version 20221112.

Questions:

* I see that there's now a 20230309 release, should we consider that
instead?

A test I just ran gave identical results to those from 20221112

* emacs.samples provides pgsql-perl-style that claims to match
perltidy's rules. Does that need any adjustment? I don't see
anything in it that looks relevant, but I'm not terribly familiar
with emacs' Perl formatting options.

At least w.r.t. the vertical alignment issue, AFAICT the emacs style
does not attempt to align anything vertically except the first non-space
thing on the line. So if anything, by abandoning a lot of vertical
alignment it would actually be closer to what the sample emacs style does.

The great advantage of not doing this alignment is that there is far
less danger of perltidy trying to realign lines that have not in fact
changed, because some nearby line has changed. So we'd have a good deal
less pointless churn.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#195Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#194)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-22 Sa 11:37, Tom Lane wrote:

* I see that there's now a 20230309 release, should we consider that
instead?

A test I just ran gave identical results to those from 20221112

Cool, let's use perltidy 20230309 then.

The great advantage of not doing this alignment is that there is far
less danger of perltidy trying to realign lines that have not in fact
changed, because some nearby line has changed. So we'd have a good deal
less pointless churn.

Yes, exactly.

regards, tom lane

#196Noah Misch
noah@leadboat.com
In reply to: Magnus Hagander (#188)
Re: run pgindent on a regular basis / scripted manner

(Given that another commentator is "absolutely against" a hook, this message
is mostly for readers considering this for other projects.)

On Sat, Apr 22, 2023 at 03:23:59PM +0200, Magnus Hagander wrote:

On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:

On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:

Also, pgindent takes tens of seconds to run, so hooking that into the git
push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed. The average push
need not take tens of seconds.

It would probably ont be tens of seconds, but it would be slow. It
would need to do a clean git checkout into an isolated environment and
spawn in there, and just that takes time.

That would be slow, but I wouldn't do it that way. I'd make "pg_bsd_ident
--pre-receive --show-diff" that, instead of reading from the filesystem, gets
the bytes to check from the equivalent of this Perl-like pseudocode:

while (<>) {
my($old_hash, $new_hash, $ref) = split;
foreach my $filename (split /\n/, `git diff --name-only $old_hash..$new_hash`) {
$file_content = `git show $new_hash $filename`;
}
}

I just ran pgindent on the file name lists of the last 1000 commits, and
runtime was less than 0.5s for each of 998/1000 commits. There's more a real
implementation might handle:

- pg_bsd_indent changes
- typedefs.list changes
- skip if the break-glass "pgindent: no" appears in a commit message
- commits changing so many files that a clean "git checkout" would be faster

And it would have to also
know to rebuild pg_bsd_indent on demand, which would require a full
./configure run (or meson equivalent). etc.

So while it might not be tens of seconds, it most definitely won't be fast.

A project more concerned about elapsed time than detecting all defects might
even choose to take no synchronous action for pg_bsd_indent and typedefs.list
changes. When a commit changes either of those, the probability that the
committer already ran pgindent rises substantially.

#197Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#196)
Re: run pgindent on a regular basis / scripted manner

Noah Misch <noah@leadboat.com> writes:

- skip if the break-glass "pgindent: no" appears in a commit message

There are two things that bother me about putting this functionality
into a server hook, beyond the possible speed issue:

* The risk of failure. I don't have a terribly difficult time imagining
situations where things get sufficiently wedged that the server accepts
*no* commits, not even ones fixing the problem. An override such as
you suggest here could assuage that fear, perhaps.

* The lack of user-friendliness. AFAIK, if a pre-receive hook fails
you learn little except that it failed. This could be extremely
frustrating to debug, especially in a situation where your local
pgindent is giving you different results than the server gets.

The idea of a buildfarm animal failing if --show-diff isn't empty
is attractive to me mainly because it is far nicer from the
debuggability standpoint.

Maybe, after we get some amount of experience with trying to keep
things always indent-clean, we will decide that it's reliable enough
to enforce in a server hook. I think going for that right away is
sheer folly though.

regards, tom lane

#198Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#197)
Re: run pgindent on a regular basis / scripted manner

On Sat, Apr 22, 2023 at 04:15:23PM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

- skip if the break-glass "pgindent: no" appears in a commit message

There are two things that bother me about putting this functionality
into a server hook, beyond the possible speed issue:

* The risk of failure. I don't have a terribly difficult time imagining
situations where things get sufficiently wedged that the server accepts
*no* commits, not even ones fixing the problem. An override such as
you suggest here could assuage that fear, perhaps.

I agree that deserves some worry.

* The lack of user-friendliness. AFAIK, if a pre-receive hook fails
you learn little except that it failed.

That is incorrect. The client gets whatever the hook prints. I'd probably
make it print the first 10000 lines of the diff.

I'm okay with a buildfarm animal. It's going to result in a more-cluttered
git history as people push things, break that animal, and push followup fixes.
While that's sad, I expect the level of clutter will go down pretty quickly
and will soon be no worse than we already get from typo-fix pushes.

#199Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#195)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 15:58, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-04-22 Sa 11:37, Tom Lane wrote:

* I see that there's now a 20230309 release, should we consider that
instead?

A test I just ran gave identical results to those from 20221112

Cool, let's use perltidy 20230309 then.

OK, so when would we do this? The change to 20230309 + valign changes is
fairly large:

188 files changed, 3657 insertions(+), 3395 deletions(-)

Maybe right before we fork the tree?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#200Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#197)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-22 Sa 16:15, Tom Lane wrote:

Noah Misch<noah@leadboat.com> writes:

- skip if the break-glass "pgindent: no" appears in a commit message

There are two things that bother me about putting this functionality
into a server hook, beyond the possible speed issue:

* The risk of failure. I don't have a terribly difficult time imagining
situations where things get sufficiently wedged that the server accepts
*no* commits, not even ones fixing the problem. An override such as
you suggest here could assuage that fear, perhaps.

* The lack of user-friendliness. AFAIK, if a pre-receive hook fails
you learn little except that it failed. This could be extremely
frustrating to debug, especially in a situation where your local
pgindent is giving you different results than the server gets.

The idea of a buildfarm animal failing if --show-diff isn't empty
is attractive to me mainly because it is far nicer from the
debuggability standpoint.

Maybe, after we get some amount of experience with trying to keep
things always indent-clean, we will decide that it's reliable enough
to enforce in a server hook. I think going for that right away is
sheer folly though.

What I'll do for now is set up a buildfarm module that will log the
differences but won't error out if there are any. At least that way
we'll know better what we're dealing with.

I don't really like Noah's idea of having a pre-receive hook possibly
spew thousands of diff lines to the terminal.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#201Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#199)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-22 Sa 15:58, Tom Lane wrote:

Cool, let's use perltidy 20230309 then.

OK, so when would we do this? The change to 20230309 + valign changes is
fairly large:

I think we could go ahead and commit the perltidyrc and README changes
now. But the ensuing reformatting should happen as part of the mass
pgindent run, probably next month sometime.

regards, tom lane

#202Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#201)
Re: run pgindent on a regular basis / scripted manner

On Sun, 23 Apr 2023 at 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think we could go ahead and commit the perltidyrc and README changes
now. But the ensuing reformatting should happen as part of the mass
pgindent run, probably next month sometime.

I think it's better to make the changes close together, not with a
month in between. Otherwise no-one will be able to run perltidy on
their patches, because the config and the files are even more out of
sync than they are now. So I'd propose to commit the perltidyrc
changes right before the pgindent run.

#203Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#189)
Re: run pgindent on a regular basis / scripted manner

On Sat, Apr 22, 2023 at 4:12 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-04-22 Sa 08:47, Magnus Hagander wrote:

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:

On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them. Anyway, isn't 'a'
exactly the same as 'b' in design? Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched. I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear, you guys are aware we already have a git repo
that's supposed to track "head + pg_indent" at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
right?

I see it is currently not working and this has not been noticed by
anyone, so I guess it kind of indicates nobody is using it today. The
reason appears to be that it uses pg_bsd_indent that's in our apt
repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
service that would actually be useful, this could certainly be ficked
pretty easy.

But bottom line is that if pgindent is as predictable as it should be,
it might be easier to use that one central place that already does it
rather than have to build a buildfarm module?

Now that pg_bsd_indent is in the core code why not just use that?

yeah, it just required building. And the lazy approach was to use the DEB :)

For a quick fix I've built the current HEAD and have it just using
that one -- right now it'll fail again when a change is made to it,
but I'll get that cleaned up.

It's back up and running, and results are at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#204Magnus Hagander
magnus@hagander.net
In reply to: Noah Misch (#196)
Re: run pgindent on a regular basis / scripted manner

On Sat, Apr 22, 2023 at 9:59 PM Noah Misch <noah@leadboat.com> wrote:

(Given that another commentator is "absolutely against" a hook, this message
is mostly for readers considering this for other projects.)

On Sat, Apr 22, 2023 at 03:23:59PM +0200, Magnus Hagander wrote:

On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:

On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:

Also, pgindent takes tens of seconds to run, so hooking that into the git
push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed. The average push
need not take tens of seconds.

It would probably ont be tens of seconds, but it would be slow. It
would need to do a clean git checkout into an isolated environment and
spawn in there, and just that takes time.

That would be slow, but I wouldn't do it that way. I'd make "pg_bsd_ident
--pre-receive --show-diff" that, instead of reading from the filesystem, gets
the bytes to check from the equivalent of this Perl-like pseudocode:

while (<>) {
my($old_hash, $new_hash, $ref) = split;
foreach my $filename (split /\n/, `git diff --name-only $old_hash..$new_hash`) {
$file_content = `git show $new_hash $filename`;
}
}

I just ran pgindent on the file name lists of the last 1000 commits, and
runtime was less than 0.5s for each of 998/1000 commits. There's more a real
implementation might handle:

- pg_bsd_indent changes
- typedefs.list changes
- skip if the break-glass "pgindent: no" appears in a commit message
- commits changing so many files that a clean "git checkout" would be faster

Wouldn't there also be the case of a header file change that could
potentially invalidate a whole lot of C files?

There's also the whole potential problem of isolations. We need to run
the whole thing in an isolated environment (because any way in at this
stage could lead to an exploit if a committer key is compromised at
any point). And at least in the second case, it might not have access
to view that data yet because it's not in... Could probably be worked
around, but not trivially so.

(But as mentioned above, I think the conclusion is we don't want this
enforced in a receive hook anyway)

And it would have to also
know to rebuild pg_bsd_indent on demand, which would require a full
./configure run (or meson equivalent). etc.

So while it might not be tens of seconds, it most definitely won't be fast.

A project more concerned about elapsed time than detecting all defects might
even choose to take no synchronous action for pg_bsd_indent and typedefs.list
changes. When a commit changes either of those, the probability that the
committer already ran pgindent rises substantially.

True, but it's far from 100% -- and if you got something in that
didn't work, then the *next* committer would have to clean it up....

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#205Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Jelte Fennema (#202)
Re: run pgindent on a regular basis / scripted manner

On 23.04.23 17:29, Jelte Fennema wrote:

On Sun, 23 Apr 2023 at 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I think we could go ahead and commit the perltidyrc and README changes
now. But the ensuing reformatting should happen as part of the mass
pgindent run, probably next month sometime.

I think it's better to make the changes close together, not with a
month in between. Otherwise no-one will be able to run perltidy on
their patches, because the config and the files are even more out of
sync than they are now. So I'd propose to commit the perltidyrc
changes right before the pgindent run.

Does anyone find perltidy useful? To me, it functions more like a
JavaScript compiler in that once you process the source code, it is no
longer useful for manual editing. If we are going to have the buildfarm
check indentation and that is going to be extended to Perl code, I have
some concerns about that.

#206Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#205)
Re: run pgindent on a regular basis / scripted manner

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Does anyone find perltidy useful? To me, it functions more like a
JavaScript compiler in that once you process the source code, it is no
longer useful for manual editing. If we are going to have the buildfarm
check indentation and that is going to be extended to Perl code, I have
some concerns about that.

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines. But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future. Did you have other specific concerns?

regards, tom lane

#207Peter Eisentraut
peter.eisentraut@enterprisedb.com
In reply to: Tom Lane (#206)
Re: run pgindent on a regular basis / scripted manner

On 24.04.23 16:14, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

Does anyone find perltidy useful? To me, it functions more like a
JavaScript compiler in that once you process the source code, it is no
longer useful for manual editing. If we are going to have the buildfarm
check indentation and that is going to be extended to Perl code, I have
some concerns about that.

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines. But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future. Did you have other specific concerns?

I think the worst is how it handles multi-line data structures like

$newnode->command_ok(
[
'psql', '-X',
'-v', 'ON_ERROR_STOP=1',
'-c', $upcmds,
'-d', $oldnode->connstr($updb),
],
"ran version adaptation commands for database $updb");

or

$node->command_fails_like(
[
'pg_basebackup', '-D',
"$tempdir/backup", '--compress',
$cft->[0]
],
qr/$cfail/,
'client ' . $cft->[2]);

Perhaps that is included in the upcoming changes you are referring to?

#208Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#207)
Re: run pgindent on a regular basis / scripted manner

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 24.04.23 16:14, Tom Lane wrote:

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines. But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future. Did you have other specific concerns?

I think the worst is how it handles multi-line data structures like

$newnode->command_ok(
[
'psql', '-X',
'-v', 'ON_ERROR_STOP=1',
'-c', $upcmds,
'-d', $oldnode->connstr($updb),
],
"ran version adaptation commands for database $updb");

Yeah, I agree, there is no case where that doesn't suck. I don't
mind it imposing specific placements of brackets and so on ---
that's very analogous to what pgindent will do. But it likes to
re-flow comma-separated lists, and generally manages to make a
complete logical hash of them when it does, as in your other
example:

$node->command_fails_like(
[
'pg_basebackup', '-D',
"$tempdir/backup", '--compress',
$cft->[0]
],
qr/$cfail/,
'client ' . $cft->[2]);

Can we fix it to preserve the programmer's choices of line breaks
in comma-separated lists?

regards, tom lane

#209Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#208)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-26 We 09:27, Tom Lane wrote:

Peter Eisentraut<peter.eisentraut@enterprisedb.com> writes:

On 24.04.23 16:14, Tom Lane wrote:

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines. But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future. Did you have other specific concerns?

I think the worst is how it handles multi-line data structures like
$newnode->command_ok(
[
'psql', '-X',
'-v', 'ON_ERROR_STOP=1',
'-c', $upcmds,
'-d', $oldnode->connstr($updb),
],
"ran version adaptation commands for database $updb");

Yeah, I agree, there is no case where that doesn't suck. I don't
mind it imposing specific placements of brackets and so on ---
that's very analogous to what pgindent will do. But it likes to
re-flow comma-separated lists, and generally manages to make a
complete logical hash of them when it does, as in your other
example:

$node->command_fails_like(
[
'pg_basebackup', '-D',
"$tempdir/backup", '--compress',
$cft->[0]
],
qr/$cfail/,
'client ' . $cft->[2]);

Can we fix it to preserve the programmer's choices of line breaks
in comma-separated lists?

I doubt there's something like that. You can freeze arbitrary blocks of
code like this (from the manual)

#<<<  format skipping: do not let perltidy change my nice formatting
        my @list = (1,
                    1, 1,
                    1, 2, 1,
                    1, 3, 3, 1,
                    1, 4, 6, 4, 1,);
#>>>

But that gets old and ugly pretty quickly.

There is a --freeze-newlines option, but it's global. I don't think we
want that.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#210Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#209)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-26 We 09:27, Tom Lane wrote:

Yeah, I agree, there is no case where that doesn't suck. I don't
mind it imposing specific placements of brackets and so on ---
that's very analogous to what pgindent will do. But it likes to
re-flow comma-separated lists, and generally manages to make a
complete logical hash of them when it does, as in your other
example:

I doubt there's something like that.

I had a read-through of the latest version's man page, and found
this promising-looking entry:

-boc, --break-at-old-comma-breakpoints

The -boc flag is another way to prevent comma-separated lists from
being reformatted. Using -boc on the above example, plus additional
flags to retain the original style, yields

# perltidy -boc -lp -pt=2 -vt=1 -vtc=1
my @list = (1,
1, 1,
1, 2, 1,
1, 3, 3, 1,
1, 4, 6, 4, 1,);

A disadvantage of this flag compared to the methods discussed above is
that all tables in the file must already be nicely formatted.

I've not tested this, but it looks like it would do what we need,
modulo needing to fix all the existing damage by hand ...

regards, tom lane

#211Alvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Andrew Dunstan (#130)
Re: run pgindent on a regular basis / scripted manner

On 2023-Feb-05, Andrew Dunstan wrote:

So here's a diff made from running perltidy v20221112 with the additional
setting --valign-exclusion-list=", = => || && if unless"

I ran this experimentally with perltidy 20230309, and compared that with
the --novalign behavior (not to propose the latter -- just to be aware
of what else is vertical alignment doing.)

Based on the differences between both, I think we'll definitely want to
include =~ and |= in this list, and I think we should discuss whether to
also include "or" (for "do_stuff or die()" type of constructs) and "qw"
(mainly used in 'use Foo qw(one two)' import lists). All these have
effects (albeit smaller than the list you gave) on our existing code.

If you change from an exclusion list to --novalign then you lose
alignment of trailing # comments, which personally I find a loss, even
though they're still a multi-line effect. Another change would be that
it ditches alignment of "{" but that only changes msvc/Install.pm, so I
think we shouldn't worry; and then there's this one:

-use PostgreSQL::Test::Utils          ();
+use PostgreSQL::Test::Utils ();
 use PostgreSQL::Test::BackgroundPsql ();

which I think we could just change to qw() if we cared enough (but I bet
we don't).

All in all, I think sticking to
--valign-exclusion-list=", = => =~ |= || && if or qw unless"
is a good deal.

--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
"Ellos andaban todos desnudos como su madre los parió, y también las mujeres,
aunque no vi más que una, harto moza, y todos los que yo vi eran todos
mancebos, que ninguno vi de edad de más de XXX años" (Cristóbal Colón)

#212Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#209)
Re: run pgindent on a regular basis / scripted manner

On Wed, Apr 26, 2023 at 03:44:47PM -0400, Andrew Dunstan wrote:

On 2023-04-26 We 09:27, Tom Lane wrote:
I doubt there's something like that. You can freeze arbitrary blocks of code
like this (from the manual)

#<<<  format skipping: do not let perltidy change my nice formatting
        my @list = (1,
                    1, 1,
                    1, 2, 1,
                    1, 3, 3, 1,
                    1, 4, 6, 4, 1,);
#>>>   

But that gets old and ugly pretty quickly.

Can those comments be added by a preprocessor before calling perltidy,
and then removed on completion?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#213Andrew Dunstan
andrew@dunslane.net
In reply to: Alvaro Herrera (#211)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-28 Fr 05:25, Alvaro Herrera wrote:

On 2023-Feb-05, Andrew Dunstan wrote:

So here's a diff made from running perltidy v20221112 with the additional
setting --valign-exclusion-list=", = => || && if unless"

I ran this experimentally with perltidy 20230309, and compared that with
the --novalign behavior (not to propose the latter -- just to be aware
of what else is vertical alignment doing.)

Thanks for looking.

Based on the differences between both, I think we'll definitely want to
include =~ and |= in this list, and I think we should discuss whether to
also include "or" (for "do_stuff or die()" type of constructs) and "qw"
(mainly used in 'use Foo qw(one two)' import lists). All these have
effects (albeit smaller than the list you gave) on our existing code.

I'm good with all of these I think

If you change from an exclusion list to --novalign then you lose
alignment of trailing # comments, which personally I find a loss, even
though they're still a multi-line effect. Another change would be that
it ditches alignment of "{" but that only changes msvc/Install.pm, so I
think we shouldn't worry; and then there's this one:

-use PostgreSQL::Test::Utils          ();
+use PostgreSQL::Test::Utils ();
use PostgreSQL::Test::BackgroundPsql ();

which I think we could just change to qw() if we cared enough (but I bet
we don't).

Yeah, me too.

All in all, I think sticking to
--valign-exclusion-list=", = => =~ |= || && if or qw unless"
is a good deal.

wfm

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#214Andrew Dunstan
andrew@dunslane.net
In reply to: Bruce Momjian (#212)
Re: run pgindent on a regular basis / scripted manner

On 2023-04-28 Fr 14:08, Bruce Momjian wrote:

On Wed, Apr 26, 2023 at 03:44:47PM -0400, Andrew Dunstan wrote:

On 2023-04-26 We 09:27, Tom Lane wrote:
I doubt there's something like that. You can freeze arbitrary blocks of code
like this (from the manual)

#<<<  format skipping: do not let perltidy change my nice formatting
        my @list = (1,
                    1, 1,
                    1, 2, 1,
                    1, 3, 3, 1,
                    1, 4, 6, 4, 1,);
#>>>

But that gets old and ugly pretty quickly.

Can those comments be added by a preprocessor before calling perltidy,
and then removed on completion?

I imagine so, but we'd need a way of determining algorithmically which
lines to protect. That might not be at all simple. And then we'd have
the maintenance burden of the preprocessor.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#215Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#214)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-28 Fr 14:08, Bruce Momjian wrote:

Can those comments be added by a preprocessor before calling perltidy,
and then removed on completion?

I imagine so, but we'd need a way of determining algorithmically which
lines to protect. That might not be at all simple. And then we'd have
the maintenance burden of the preprocessor.

Yeah, it's hard to see how you'd do that without writing a full Perl
parser.

regards, tom lane

#216Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#210)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

I wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I doubt there's something like that.

I had a read-through of the latest version's man page, and found
this promising-looking entry:
-boc, --break-at-old-comma-breakpoints

Sadly, this seems completely not ready for prime time. I experimented
with it under perltidy 20230309, and found that it caused hundreds
of kilobytes of gratuitous changes that don't seem to have a direct
connection to the claimed purpose. Most of these seemed to be from
forcing a line break after a function call's open paren, like

@@ -50,10 +50,12 @@ detects_heap_corruption(
 #
 fresh_test_table('test');
 $node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
-detects_no_corruption("verify_heapam('test')",
+detects_no_corruption(
+	"verify_heapam('test')",
 	"all-frozen not corrupted table");
 corrupt_first_page('test');
-detects_heap_corruption("verify_heapam('test')",
+detects_heap_corruption(
+	"verify_heapam('test')",
 	"all-frozen corrupted table");
 detects_no_corruption(
 	"verify_heapam('test', skip := 'all-frozen')",

although in some places it just wanted to insert a space, like this:

@@ -77,9 +81,9 @@ print "standby 2: $result\n";
is($result, qq(33|0|t), 'check streamed sequence content on standby 2');

 # Check that only READ-only queries can run on standbys
-is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
 	3, 'read-only queries on standby 1');
-is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
 	3, 'read-only queries on standby 2');

# Tests for connection parameter target_session_attrs

So I don't think we want that. Maybe in some future version it'll
be more under control.

Barring objections, I'll use the attached on Friday.

regards, tom lane

Attachments:

v1-perltidy-option-changes.patchtext/x-diff; charset=us-ascii; name=v1-perltidy-option-changes.patchDownload
commit 7874d0f178f2bcdc889ce410d3e126e6750d96b4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed May 17 16:43:38 2023 -0400

    Make agreed updates in perltidy options.
    
    Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql

diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index 43c736b0a1..08874d12eb 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -14,16 +14,16 @@ PREREQUISITES:
    sibling directory src/tools/pg_bsd_indent; see the directions
    in that directory's README file.
 
-2) Install perltidy.  Please be sure it is version 20170521 (older and newer
+2) Install perltidy.  Please be sure it is version 20230309 (older and newer
    versions make different formatting choices, and we want consistency).
    You can get the correct version from
    https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/
    To install, follow the usual install process for a Perl module
    ("man perlmodinstall" explains it).  Or, if you have cpan installed,
    this should work:
-   cpan SHANCOCK/Perl-Tidy-20170521.tar.gz
+   cpan SHANCOCK/Perl-Tidy-20230309.tar.gz
    Or if you have cpanm installed, you can just use:
-   cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20170521.tar.gz
+   cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz
 
 DOING THE INDENT RUN:
 
diff --git a/src/tools/pgindent/perltidyrc b/src/tools/pgindent/perltidyrc
index 9f09f0a64e..589d6e1f06 100644
--- a/src/tools/pgindent/perltidyrc
+++ b/src/tools/pgindent/perltidyrc
@@ -14,3 +14,4 @@
 --paren-vertical-tightness=2
 --paren-vertical-tightness-closing=2
 --noblanks-before-comments
+--valign-exclusion-list=", = => =~ |= || && if or qw unless"
#217Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#216)
Re: run pgindent on a regular basis / scripted manner

On 2023-05-17 We 17:10, Tom Lane wrote:

I wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

I doubt there's something like that.

I had a read-through of the latest version's man page, and found
this promising-looking entry:
-boc, --break-at-old-comma-breakpoints

Sadly, this seems completely not ready for prime time. I experimented
with it under perltidy 20230309, and found that it caused hundreds
of kilobytes of gratuitous changes that don't seem to have a direct
connection to the claimed purpose. Most of these seemed to be from
forcing a line break after a function call's open paren, like

@@ -50,10 +50,12 @@ detects_heap_corruption(
#
fresh_test_table('test');
$node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
-detects_no_corruption("verify_heapam('test')",
+detects_no_corruption(
+	"verify_heapam('test')",
"all-frozen not corrupted table");
corrupt_first_page('test');
-detects_heap_corruption("verify_heapam('test')",
+detects_heap_corruption(
+	"verify_heapam('test')",
"all-frozen corrupted table");
detects_no_corruption(
"verify_heapam('test', skip := 'all-frozen')",

although in some places it just wanted to insert a space, like this:

@@ -77,9 +81,9 @@ print "standby 2: $result\n";
is($result, qq(33|0|t), 'check streamed sequence content on standby 2');

# Check that only READ-only queries can run on standbys
-is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
3, 'read-only queries on standby 1');
-is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
3, 'read-only queries on standby 2');

# Tests for connection parameter target_session_attrs

So I don't think we want that. Maybe in some future version it'll
be more under control.

Barring objections, I'll use the attached on Friday.

LGTM

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#218Jelte Fennema
postgres@jeltef.nl
In reply to: Andrew Dunstan (#185)
Re: run pgindent on a regular basis / scripted manner

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan <andrew@dunslane.net> wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

#219Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#218)
Re: run pgindent on a regular basis / scripted manner

On 2023-06-15 Th 11:26, Jelte Fennema wrote:

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan<andrew@dunslane.net> wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

The short answer is that some high priority demands from $dayjob got in
the way. However, I hope to have it done soon.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#220Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#219)
Re: run pgindent on a regular basis / scripted manner

On 2023-06-15 Th 12:12, Andrew Dunstan wrote:

On 2023-06-15 Th 11:26, Jelte Fennema wrote:

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan<andrew@dunslane.net> wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

The short answer is that some high priority demands from $dayjob got
in the way. However, I hope to have it done soon.

See
<https://github.com/PGBuildFarm/client-code/commit/f9c1c15048b412d34ccda8020d989b3a7b566c05&gt;

I have set up a new buildfarm animal called koel which will run the module.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#221Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#220)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

I have set up a new buildfarm animal called koel which will run the module.

Is koel tracking the right repo? It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=koel&amp;dt=2023-06-19%2019%3A49%3A03

regards, tom lane

#222Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#220)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jun 17, 2023 at 10:08:32AM -0400, Andrew Dunstan wrote:

See <https://github.com/PGBuildFarm/client-code/commit/f9c1c15048b412d34ccda8020d989b3a7b566c05&gt;
I have set up a new buildfarm animal called koel which will run the module.

That's really cool! Thanks for taking the time to do that!
--
Michael

#223Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#221)
Re: run pgindent on a regular basis / scripted manner

On 2023-06-19 Mo 17:07, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

I have set up a new buildfarm animal called koel which will run the module.

Is koel tracking the right repo? It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=koel&amp;dt=2023-06-19%2019%3A49%3A03

Yeah, I changed it so that instead of just checking new commits it would
check the whole tree. The problem with the incremental approach is that
the next run it might turn green again but the issue would not have been
fixed.

I think this is a one-off issue. Once we clean up the tree the problem
would disappear and the commits it shows would be correct. I imaging
that's going to happen any day now?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#224Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#223)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-06-19 Mo 17:07, Tom Lane wrote:

Is koel tracking the right repo? It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

Yeah, I changed it so that instead of just checking new commits it would
check the whole tree. The problem with the incremental approach is that
the next run it might turn green again but the issue would not have been
fixed.

Ah.

I think this is a one-off issue. Once we clean up the tree the problem
would disappear and the commits it shows would be correct. I imaging
that's going to happen any day now?

I can go fix the problems now that we know there are some (already).
However, if what you're saying is that koel only checks recently-changed
files, that's going to be pretty misleading in future too. If people
don't react to such reports right away, they'll disappear, no?

regards, tom lane

#225Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#224)
Re: run pgindent on a regular basis / scripted manner

On 2023-06-20 Tu 09:08, Tom Lane wrote:

Andrew Dunstan<andrew@dunslane.net> writes:

On 2023-06-19 Mo 17:07, Tom Lane wrote:

Is koel tracking the right repo? It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

Yeah, I changed it so that instead of just checking new commits it would
check the whole tree. The problem with the incremental approach is that
the next run it might turn green again but the issue would not have been
fixed.

Ah.

I think this is a one-off issue. Once we clean up the tree the problem
would disappear and the commits it shows would be correct. I imaging
that's going to happen any day now?

I can go fix the problems now that we know there are some (already).
However, if what you're saying is that koel only checks recently-changed
files, that's going to be pretty misleading in future too. If people
don't react to such reports right away, they'll disappear, no?

That's what would have happened if I hadn't changed the way it worked
(and that's why I changed it). Now it doesn't just check recent commits,
it checks the whole tree, and will stay red until the tree is fixed.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

In reply to: Andrew Dunstan (#220)
Re: run pgindent on a regular basis / scripted manner

On Sat, Jun 17, 2023 at 7:08 AM Andrew Dunstan <andrew@dunslane.net> wrote:

I have set up a new buildfarm animal called koel which will run the module.

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

I don't think that it makes sense to invent yet another rule for
.git-blame-ignore-revs, though. Will we need another buildfarm member
to enforce that rule, too?

--
Peter Geoghegan

#227Andres Freund
andres@anarazel.de
In reply to: Peter Geoghegan (#226)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-08-11 13:59:40 -0700, Peter Geoghegan wrote:

On Sat, Jun 17, 2023 at 7:08 AM Andrew Dunstan <andrew@dunslane.net> wrote:

I have set up a new buildfarm animal called koel which will run the module.

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

I'm not sure I buy that that's going to be a huge problem - most of the time
such fixups are pretty small compared to larger reindents.

I don't think that it makes sense to invent yet another rule for
.git-blame-ignore-revs, though. Will we need another buildfarm member
to enforce that rule, too?

We could a test that fails when there's some mis-indented code. That seems
like it might catch things earlier?

Greetings,

Andres Freund

In reply to: Andres Freund (#227)
Re: run pgindent on a regular basis / scripted manner

On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:

I don't think that it makes sense to invent yet another rule for
.git-blame-ignore-revs, though. Will we need another buildfarm member
to enforce that rule, too?

We could a test that fails when there's some mis-indented code. That seems
like it might catch things earlier?

It definitely would. That would go a long way towards addressing my
concerns. But I suspect that that would run into problems that stem
from the fact that the buildfarm is testing something that isn't all
that simple. Don't typedefs need to be downloaded from some other
blessed buildfarm animal?

--
Peter Geoghegan

#229Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#228)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:

We could a test that fails when there's some mis-indented code. That seems
like it might catch things earlier?

+1 for including this in CI tests

It definitely would. That would go a long way towards addressing my
concerns. But I suspect that that would run into problems that stem
from the fact that the buildfarm is testing something that isn't all
that simple. Don't typedefs need to be downloaded from some other
blessed buildfarm animal?

No. I presume koel is using src/tools/pgindent/typedefs.list,
which has always been the "canonical" list but up to now we've
been lazy about maintaining it. Part of the new regime is that
typedefs.list should now be updated on-the-fly by patches that
add new typedefs.

We should still compare against the buildfarm's list periodically;
but I imagine that the primary result of that will be to remove
no-longer-used typedefs from typedefs.list.

regards, tom lane

In reply to: Tom Lane (#229)
Re: run pgindent on a regular basis / scripted manner

On Fri, Aug 11, 2023 at 3:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

No. I presume koel is using src/tools/pgindent/typedefs.list,
which has always been the "canonical" list but up to now we've
been lazy about maintaining it. Part of the new regime is that
typedefs.list should now be updated on-the-fly by patches that
add new typedefs.

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

We should still compare against the buildfarm's list periodically;
but I imagine that the primary result of that will be to remove
no-longer-used typedefs from typedefs.list.

I believe that I came up with my current workflow due to the
difficulty of maintaining the typedef file itself. Random
platform/binutils implementation details created a lot of noise,
presumably because my setup wasn't exactly the same as Bruce's setup,
in whatever way. For example, the order of certain lines would change,
in a way that had nothing whatsoever to do with structs that my patch
added.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

--
Peter Geoghegan

#231Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#230)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

Yeah, I've done the same and will have to stop.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

I don't think it'll be a problem. If your rule is "add new typedef
names added by your patch to typedefs.list, keeping them in
alphabetical order" then it doesn't seem very complicated, and
hopefully conflicts between concurrently-developed patches won't
be common.

regards, tom lane

#232Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#226)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

FWIW, I'm much more optimistic than that. I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that. If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.

regards, tom lane

#233Jelte Fennema
postgres@jeltef.nl
In reply to: Peter Geoghegan (#226)
Re: run pgindent on a regular basis / scripted manner

On Fri, 11 Aug 2023 at 23:00, Peter Geoghegan <pg@bowt.ie> wrote:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about.

I think one thing that would help a lot in reducing the is for
committers to set up the local git commit hook that's on the wiki:
https://wiki.postgresql.org/wiki/Working_with_Git

That one fails the commit if there's wrongly indented files in the
commit. And if you still want to opt out for whatever reason you can
use git commit --no-verify

#234Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#229)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:

Peter Geoghegan <pg@bowt.ie> writes:

On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:

We could a test that fails when there's some mis-indented code. That seems
like it might catch things earlier?

+1 for including this in CI tests

I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.

It definitely would. That would go a long way towards addressing my
concerns. But I suspect that that would run into problems that stem
from the fact that the buildfarm is testing something that isn't all
that simple. Don't typedefs need to be downloaded from some other
blessed buildfarm animal?

No. I presume koel is using src/tools/pgindent/typedefs.list,
which has always been the "canonical" list but up to now we've
been lazy about maintaining it. Part of the new regime is that
typedefs.list should now be updated on-the-fly by patches that
add new typedefs.

Yea. Otherwise nobody else can indent reliably, without repeating the work of
adding typedefs.list entries of all the patches since the last time it was
updated in the repository.

Greetings,

Andres Freund

#235Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#234)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:

+1 for including this in CI tests

I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.

Hmm. I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Maybe we could automate it, but not as part of check-world per se?

regards, tom lane

#236Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#235)
Re: run pgindent on a regular basis / scripted manner

On Fri, Aug 11, 2023 at 08:11:34PM -0400, Tom Lane wrote:

Hmm. I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Maybe we could automate it, but not as part of check-world per se?

It does not have to be part of check-world by default, as we could
make it optional with PG_TEST_EXTRA. I bet that most committers set
this option for most of the test suites anyway, so the extra cost is
OK from here. I don't find a single indent run to be that costly,
especially with parallelism:
$ time ./src/tools/pgindent/pgindent .
real 0m5.039s
user 0m3.403s
sys 0m1.540s
--
Michael

#237Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#235)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-08-11 20:11:34 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:

+1 for including this in CI tests

I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.

Hmm. I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Hm, compared to the cost of check-world it's not that large, but still,
annoying to make it larger.

We can make it lot cheaper, but perhaps not in a general enough fashion that
it's suitable for a test.

pgindent already can query git (for --commit). We could teach pgindent to
ask git what remote branch is being tracked, and constructed a list of files
of the difference between the remote branch and the local branch?

That option could do something like:
git diff --name-only $(git rev-parse --abbrev-ref --symbolic-full-name @{upstream})

That's pretty quick, even for a relatively large delta.

Maybe we could automate it, but not as part of check-world per se?

We should definitely do that. Another related thing that'd be useful to
script is updating typedefs.list with the additional typedefs found
locally. Right now the script for that still lives in the

Greetings,

Andres Freund

#238Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#232)
Re: run pgindent on a regular basis / scripted manner

On 2023-08-11 Fr 19:17, Tom Lane wrote:

Peter Geoghegan<pg@bowt.ie> writes:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

FWIW, I'm much more optimistic than that. I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that. If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.

Yeah, part of the point of creating koel was to give committers a bit of
a nudge in that direction.

With a git pre-commit hook it's pretty painless.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#239Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#231)
Re: run pgindent on a regular basis / scripted manner

On 2023-08-11 Fr 19:02, Tom Lane wrote:

Peter Geoghegan<pg@bowt.ie> writes:

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

Yeah, I've done the same and will have to stop.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

I don't think it'll be a problem. If your rule is "add new typedef
names added by your patch to typedefs.list, keeping them in
alphabetical order" then it doesn't seem very complicated, and
hopefully conflicts between concurrently-developed patches won't
be common.

My recollection is that missing typedefs cause indentation that kinda
sticks out like a sore thumb.

The reason we moved to a buildfarm based typedefs list was that some
typedefs are platform dependent, so any list really needs to be the
union of the found typedefs on various platforms, and the buildfarm was
a convenient vehicle for doing that. But that doesn't mean you shouldn't
manually add a typedef you have added in your code.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#240Andres Freund
andres@anarazel.de
In reply to: Andrew Dunstan (#239)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:

On 2023-08-11 Fr 19:02, Tom Lane wrote:

Peter Geoghegan<pg@bowt.ie> writes:

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

Yeah, I've done the same and will have to stop.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

I don't think it'll be a problem. If your rule is "add new typedef
names added by your patch to typedefs.list, keeping them in
alphabetical order" then it doesn't seem very complicated, and
hopefully conflicts between concurrently-developed patches won't
be common.

My recollection is that missing typedefs cause indentation that kinda sticks
out like a sore thumb.

The reason we moved to a buildfarm based typedefs list was that some
typedefs are platform dependent, so any list really needs to be the union of
the found typedefs on various platforms, and the buildfarm was a convenient
vehicle for doing that. But that doesn't mean you shouldn't manually add a
typedef you have added in your code.

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.

Greetings,

Andres Freund

#241Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#240)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:

My recollection is that missing typedefs cause indentation that kinda sticks
out like a sore thumb.

Yeah, it's usually pretty obvious: "typedef *var" gets changed to
"typedef * var", and similar oddities.

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.

My practice has always been "add typedefs until pgindent doesn't do
anything I don't want". If you have a new typedef that doesn't happen
to be used in a way that pgindent mangles, it's not that critical
to get it into the file right away.

regards, tom lane

In reply to: Tom Lane (#241)
Re: run pgindent on a regular basis / scripted manner

On Sat, Aug 12, 2023 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.

My practice has always been "add typedefs until pgindent doesn't do
anything I don't want". If you have a new typedef that doesn't happen
to be used in a way that pgindent mangles, it's not that critical
to get it into the file right away.

We seem to be seriously contemplating making every patch author do
this every time they need to get the tests to pass (after adding or
renaming a struct). Is that really an improvement over the old status
quo?

In principle I'm in favor of strictly enforcing indentation rules like
this. But it seems likely that our current tooling just isn't up to
the task.

--
Peter Geoghegan

#243Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#242)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

We seem to be seriously contemplating making every patch author do
this every time they need to get the tests to pass (after adding or
renaming a struct). Is that really an improvement over the old status
quo?

Hm. I was envisioning that we should expect committers to deal
with this, not original patch submitters. So that's an argument
against including it in the CI tests. But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

regards, tom lane

In reply to: Tom Lane (#243)
Re: run pgindent on a regular basis / scripted manner

On Sat, Aug 12, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm. I was envisioning that we should expect committers to deal
with this, not original patch submitters. So that's an argument
against including it in the CI tests. But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

Making it a special responsibility for committers comes with the same
set of problems that we see with catversion bumps. People are much
more likely to forget to do something that must happen last.

Maybe I'm wrong -- maybe the new policy is practicable. It might even
turn out to be worth the bother. Time will tell.

--
Peter Geoghegan

#245Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Geoghegan (#244)
Re: run pgindent on a regular basis / scripted manner

On 2023-08-12 Sa 20:53, Peter Geoghegan wrote:

On Sat, Aug 12, 2023 at 5:20 PM Tom Lane<tgl@sss.pgh.pa.us> wrote:

Hm. I was envisioning that we should expect committers to deal
with this, not original patch submitters. So that's an argument
against including it in the CI tests. But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

I agree with this.

Making it a special responsibility for committers comes with the same
set of problems that we see with catversion bumps. People are much
more likely to forget to do something that must happen last.

After I'd been caught by this once or twice I implemented a git hook
test for that too - in fact it was the first hook I did. It's not
perfect but it's saved me a couple of times:

check_catalog_version () {

    # only do this on master
    test  "$branch" = "master" || return 0

    case "$files" in
        *src/include/catalog/catversion.h*)
            return 0;
            ;;
        *src/include/catalog/*)
            ;;
        *)
            return 0;
            ;;
    esac

    # changes include catalog but not catversion.h, so warn about it
    {
        echo 'Commit on master alters catalog but catversion not bumped'
        echo 'It can be forced with git commit --no-verify'
    } >&2

    exit 1
}

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#246Michael Paquier
michael@paquier.xyz
In reply to: Andrew Dunstan (#245)
Re: run pgindent on a regular basis / scripted manner

On Sun, Aug 13, 2023 at 10:33:21AM -0400, Andrew Dunstan wrote:

After I'd been caught by this once or twice I implemented a git hook test
for that too - in fact it was the first hook I did. It's not perfect but
it's saved me a couple of times:

check_catalog_version () {

I find that pretty cool. Thanks for sharing.
--
Michael

#247Peter Eisentraut
peter@eisentraut.org
In reply to: Andres Freund (#240)
Re: run pgindent on a regular basis / scripted manner

On 12.08.23 23:14, Andres Freund wrote:

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that*adds* (but doesn't remove) local typedefs would make this less
painful.

I was puzzled once that there does not appear to be such a script
available. Whatever the buildfarm does (before it merges it all
together) should be available locally. Then the workflow could be

type type type
compile
update typedefs
pgindent
commit

#248Peter Eisentraut
peter@eisentraut.org
In reply to: Tom Lane (#235)
Re: run pgindent on a regular basis / scripted manner

On 12.08.23 02:11, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:

+1 for including this in CI tests

I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.

Hmm. I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Maybe we could automate it, but not as part of check-world per se?

Also, during development, the code in progress is not always perfectly
formatted, but I do want to keep running the test suites.

#249Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#247)
Re: run pgindent on a regular basis / scripted manner

On 2023-08-14 Mo 10:04, Peter Eisentraut wrote:

On 12.08.23 23:14, Andres Freund wrote:

It's a somewhat annoying task though, find all the typedefs, add them
to the
right place in the file (we have an out of order entry right now). I
think a
script that*adds*  (but doesn't remove) local typedefs would make
this less
painful.

I was puzzled once that there does not appear to be such a script
available.  Whatever the buildfarm does (before it merges it all
together) should be available locally.  Then the workflow could be

type type type
compile
update typedefs
pgindent
commit

It's a bit more complicated :-)

You can see what the buildfarm does at
<https://github.com/PGBuildFarm/client-code/blob/ec4cf43613a74cb88f228efcde09931cf9fd57e7/run_build.pl#L2562&gt;
It's been somewhat fragile over the years, which most people other than
Tom and I have probably not noticed.

On most platforms it needs postgres to have been installed before
looking for the typedefs.

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

#250Nathan Bossart
nathandbossart@gmail.com
In reply to: Peter Geoghegan (#226)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On Fri, Aug 11, 2023 at 01:59:40PM -0700, Peter Geoghegan wrote:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

Should we add those? Patch attached.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Add-a-few-recent-commits-to-.git-blame-ignore-rev.patchtext/x-diff; charset=us-asciiDownload
From f4f158e153f9027330ce841ca79b5650dfd24a0e Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 15 Aug 2023 13:24:18 -0700
Subject: [PATCH v1 1/1] Add a few recent commits to .git-blame-ignore-revs.

---
 .git-blame-ignore-revs | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/.git-blame-ignore-revs b/.git-blame-ignore-revs
index 00b548c7b0..2ff86c8d0d 100644
--- a/.git-blame-ignore-revs
+++ b/.git-blame-ignore-revs
@@ -14,6 +14,27 @@
 #
 # $ git log --pretty=format:"%H # %cd%n# %s" $PGINDENTGITHASH -1 --date=iso
 
+89be0b89ae60c63856fd26d82a104781540e2312 # 2023-08-15 17:45:00 +0900
+# Fix code indentation vioaltion introduced in commit 9e9931d2b.
+
+5dc456b7dda4f7d0d7735b066607c190c74d174a # 2023-08-11 20:43:34 +0900
+# Fix code indentation violations introduced by recent commit
+
+62e9af4c63fbd36fb9af8450fb44bece76d7766f # 2023-07-25 12:35:58 +0530
+# Fix code indentation vioaltion introduced in commit d38ad8e31d.
+
+4e465aac36ce9a9533c68dbdc83e67579880e628 # 2023-07-18 14:04:31 +0900
+# Fix indentation in twophase.c
+
+328f492d2565cfbe383f13a69425d751fd79415f # 2023-07-13 22:26:10 +0900
+# Fix code indentation violation in commit b6e1157e7d
+
+69a674a170058e63e8178aec8a36a673efce8801 # 2023-07-06 11:49:18 +0530
+# Fix code indentation vioaltion introduced in commit cc32ec24fd.
+
+a4cfeeca5a97f2b5969c31aa69ba775af95ee5a3 # 2023-07-03 12:47:49 +0200
+# Fix code indentation violations
+
 b334612b8aee9f9a34378982d8938b201dfad323 # 2023-06-20 09:50:43 -0400
 # Pre-beta2 mechanical code beautification.
 
-- 
2.25.1

In reply to: Nathan Bossart (#250)
Re: run pgindent on a regular basis / scripted manner

On Tue, Aug 15, 2023 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

Should we add those? Patch attached.

I think that that makes sense. I just don't want to normalize updating
.git-blame-ignore-revs very frequently. (Actually, it's more like I
don't want to normalize any scheme that makes updating the ignore list
very frequently start to seem reasonable.)

--
Peter Geoghegan

#252Nathan Bossart
nathandbossart@gmail.com
In reply to: Peter Geoghegan (#251)
Re: run pgindent on a regular basis / scripted manner

On Wed, Aug 16, 2023 at 01:15:55PM -0700, Peter Geoghegan wrote:

On Tue, Aug 15, 2023 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

Should we add those? Patch attached.

I think that that makes sense.

Committed.

I just don't want to normalize updating
.git-blame-ignore-revs very frequently. (Actually, it's more like I
don't want to normalize any scheme that makes updating the ignore list
very frequently start to seem reasonable.)

Agreed. I've found myself habitually running pgindent since becoming a
committer, but I'm sure I'll forget it one of these days. From a quick
skim of this thread, it sounds like a pre-commit hook [0]https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks might be the best
option at the moment.

[0]: https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

In reply to: Peter Geoghegan (#244)
Re: run pgindent on a regular basis / scripted manner

On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

Maybe I'm wrong -- maybe the new policy is practicable. It might even
turn out to be worth the bother. Time will tell.

(Two months pass.)

There were two independent fixup commits to address complaints from
koel just today (from two different committers). Plus there was a
third issue (involving a third committer) this past Wednesday.

This policy isn't working.

--
Peter Geoghegan

#254Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#253)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

There were two independent fixup commits to address complaints from
koel just today (from two different committers). Plus there was a
third issue (involving a third committer) this past Wednesday.

This policy isn't working.

Two thoughts about that:

1. We should not commit indent fixups on behalf of somebody else's
misindented commit. Peer pressure on committers who aren't being
careful about this is the only way to improve matters; so complain
to the person at fault until they fix it.

2. We could raise awareness of this issue by adding indent verification
to CI testing. I hesitate to suggest that, though, for a couple of
reasons:
2a. It seems fairly expensive, though I might be misjudging.
2b. It's often pretty handy to submit patches that aren't fully
indent-clean; I have such a patch in flight right now at [1]/messages/by-id/2617358.1697501956@sss.pgh.pa.us.

2b could be ameliorated by making the indent check be a separate
test process that doesn't obscure the results of other testing.

regards, tom lane

[1]: /messages/by-id/2617358.1697501956@sss.pgh.pa.us

In reply to: Tom Lane (#254)
Re: run pgindent on a regular basis / scripted manner

On Mon, Oct 16, 2023 at 5:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Two thoughts about that:

1. We should not commit indent fixups on behalf of somebody else's
misindented commit. Peer pressure on committers who aren't being
careful about this is the only way to improve matters; so complain
to the person at fault until they fix it.

Thomas Munro's recent commit 01529c704008 was added to
.git-blame-ignore-revs by Michael Paquier, despite the fact that
Munro's commit technically isn't just a pure indentation fix (it also
fixed some typos). It's hard to judge Michael too harshly for this,
since in general it's harder to commit things when koel is already
complaining about existing misindetation -- I get why he'd prefer to
take care of that first.

2. We could raise awareness of this issue by adding indent verification
to CI testing. I hesitate to suggest that, though, for a couple of
reasons:
2a. It seems fairly expensive, though I might be misjudging.
2b. It's often pretty handy to submit patches that aren't fully
indent-clean; I have such a patch in flight right now at [1].

It's also often handy to make a minor change to a comment or something
at the last minute, without necessarily having the comment indented
perfectly.

2b could be ameliorated by making the indent check be a separate
test process that doesn't obscure the results of other testing.

I was hoping that "go back to the old status quo" would also appear as
an option.

My main objection to the new policy is that it's not quite clear what
process I should go through in order to be 100% confident that koel
won't start whining (short of waiting around for koel to whine). I
know how to run pgindent, of course, and haven't had any problems so
far...but it still seems quite haphazard. If we're going to make this
a hard rule, enforced on every commit, it should be dead easy to
comply with the rule.

--
Peter Geoghegan

#256Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#255)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

My main objection to the new policy is that it's not quite clear what
process I should go through in order to be 100% confident that koel
won't start whining (short of waiting around for koel to whine). I
know how to run pgindent, of course, and haven't had any problems so
far...but it still seems quite haphazard. If we're going to make this
a hard rule, enforced on every commit, it should be dead easy to
comply with the rule.

But it's *not* a hard rule --- we explicitly rejected mechanisms
that would make it so (such as a precommit hook). I view "koel
is unhappy" as something that you ought to clean up, but if you
don't get to it for a day or three there's not much harm done.

In theory koel might complain even if you'd locally gotten
clean results from pgindent (as a consequence of skew in the
typedef lists being used, for example). We've not seen cases
of that so far though. Right now I think we just need to raise
committers' awareness of this enough that they routinely run
pgindent on the files they're touching. In the problem cases
so far, they very clearly didn't. I don't see much point in
worrying about second-order problems until that first-order
problem is tamped down.

regards, tom lane

In reply to: Tom Lane (#256)
Re: run pgindent on a regular basis / scripted manner

On Mon, Oct 16, 2023 at 6:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

But it's *not* a hard rule --- we explicitly rejected mechanisms
that would make it so (such as a precommit hook). I view "koel
is unhappy" as something that you ought to clean up, but if you
don't get to it for a day or three there's not much harm done.

It's hard to square that with what you said about needing greater peer
pressure on committers.

Right now I think we just need to raise
committers' awareness of this enough that they routinely run
pgindent on the files they're touching. In the problem cases
so far, they very clearly didn't. I don't see much point in
worrying about second-order problems until that first-order
problem is tamped down.

Realistically, if you're the committer that broke koel, you are at
least the subject of mild disapproval -- you have likely
inconvenienced others. I always try to avoid that -- it pretty much
rounds up to "hard rule" in my thinking. Babysitting koel really does
seem like it could cut into my dinner plans or what have you.

--
Peter Geoghegan

#258Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#254)
Re: run pgindent on a regular basis / scripted manner

On Mon, Oct 16, 2023 at 08:45:00PM -0400, Tom Lane wrote:

2. We could raise awareness of this issue by adding indent verification
to CI testing. I hesitate to suggest that, though, for a couple of
reasons:
2a. It seems fairly expensive, though I might be misjudging.
2b. It's often pretty handy to submit patches that aren't fully
indent-clean; I have such a patch in flight right now at [1].

2b could be ameliorated by making the indent check be a separate
test process that doesn't obscure the results of other testing.

I see an extra reason with not doing that: this increases the
difficulty when it comes to send and maintain patches to the lists and
newcomers would need to learn more tooling. I don't think that we
should make that more complicated for code-formatting reasons.
--
Michael

#259Jelte Fennema
postgres@jeltef.nl
In reply to: Peter Geoghegan (#255)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 03:23, Peter Geoghegan <pg@bowt.ie> wrote:

My main objection to the new policy is that it's not quite clear what
process I should go through in order to be 100% confident that koel
won't start whining (short of waiting around for koel to whine). I
know how to run pgindent, of course, and haven't had any problems so
far...but it still seems quite haphazard. If we're going to make this
a hard rule, enforced on every commit, it should be dead easy to
comply with the rule.

I think *it is* dead easy to comply. If you run the following commands
before committing/after rebasing, then koel should always be happy:

src/tools/pgindent/pgindent src # works always but a bit slow
src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
# much faster, but only works if you DID NOT change typedefs.list

If you have specific cases where it does not work. Then I think we
should talk about those/fix them. But looking at the last few commits
in .git-blame-ignore-revs I only see examples of people simply not
running pgindent before they commit.

I guess it's easy to forget, but that's why the wiki contains a
pre-commit hook[1]https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks that you can use to remind yourself/run pgindent
automatically. The only annoying thing is that it does not trigger
when rebasing, but you can work around that by using rebase its -x
flag[2]https://adamj.eu/tech/2022/11/07/pre-commit-run-hooks-rebase/.

[1]: https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks
[2]: https://adamj.eu/tech/2022/11/07/pre-commit-run-hooks-rebase/

#260Jelte Fennema
postgres@jeltef.nl
In reply to: Michael Paquier (#258)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 04:57, Michael Paquier <michael@paquier.xyz> wrote:

I see an extra reason with not doing that: this increases the
difficulty when it comes to send and maintain patches to the lists and
newcomers would need to learn more tooling. I don't think that we
should make that more complicated for code-formatting reasons.

Honestly, I don't think it's a huge hurdle for newcomers. Most open
source projects have a CI job that runs automatic code formatting, so
it's a pretty common thing for contributors to deal with. And as long
as we keep it a separate CI job from the normal tests, committers can
even choose to commit the patch if the formatting job fails, after
running pgindent themselves.

And personally as a contributor it's a much nicer experience to see
quickly in CI that I messed up the code style, then to hear it a
week/month later in an email when someone took the time to review and
mentions the styling is way off all over the place.

#261Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#253)
Re: run pgindent on a regular basis / scripted manner

On Sun, Oct 15, 2023 at 8:52 PM Peter Geoghegan <pg@bowt.ie> wrote:

On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

Maybe I'm wrong -- maybe the new policy is practicable. It might even
turn out to be worth the bother. Time will tell.

(Two months pass.)

There were two independent fixup commits to address complaints from
koel just today (from two different committers). Plus there was a
third issue (involving a third committer) this past Wednesday.

This policy isn't working.

+1. I think this is more annoying than the status quo ante.

--
Robert Haas
EDB: http://www.enterprisedb.com

#262Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema (#259)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 6:34 AM Jelte Fennema <postgres@jeltef.nl> wrote:

I think *it is* dead easy to comply. If you run the following commands
before committing/after rebasing, then koel should always be happy:

src/tools/pgindent/pgindent src # works always but a bit slow
src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
# much faster, but only works if you DID NOT change typedefs.list

In isolation, that's true, but the list of mistakes that you can make
while committing which will inconvenience everyone working on the
project is very long. Another one that comes up frequently is
forgetting to bump CATALOG_VERSION_NO, but you also need a good commit
message, and good comments, and a good Discussion link in the commit
message, and the right list of authors and reviewers, and to update
the docs (with spaces, not tabs) and the Makefiles (with tabs, not
spaces) and the meson stuff and, as if that weren't enough already,
you actually need the code to work! And that includes not only working
regularly but also with CLOBBER_CACHE_ALWAYS and debug_parallel_query
and so on. It's very easy to miss something somewhere. I put a LOT of
work into polishing my commits before I push them, and it's still not
that uncommon that I screw something up.

--
Robert Haas
EDB: http://www.enterprisedb.com

#263Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#261)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:

This policy isn't working.

+1. I think this is more annoying than the status quo ante.

Although ... I do think it's spared me some rebasing pain, and that
does have some real value. I wonder if we could think of other
alternatives. For example, maybe we could have a bot. If you push a
commit that's not indented properly, the bot reindents the tree,
updates git-blame-ignore-revs, and sends you an email admonishing you
for your error. Or we could have a server-side hook that will refuse
the misindented commit, with some kind of override for emergency
situations. What I really dislike about the current situation is that
it's doubling down on the idea that committers have to be perfect and
get everything right every time. Turns out, that's hard to do. If not,
why do people keep screwing things up? Somebody could theorize - and
this seems to be Tom and Jelte's theory, though perhaps I'm
misinterpreting their comments - that the people who have made
mistakes here are just lazy, and what they need to do is up their
game.

But I don't buy that. First, I think that most of our committers are
pretty intelligent and hard-working people who are trying to do the
right thing. We can't all be Tom Lane, no matter how hard we may try.
Second, even if it were true that the offending committers are "just
lazy," all of our contributors and many senior non-committer
contributors are people who have put thousands, if not tens of
thousands, of hours into the project. Making them feel bad serves us
poorly. At the end of the day, it doesn't matter whether it's too much
of a pain for the perfect committers we'd like to have. It matters
whether it's too much of a pain for the human committers that we do
have.

--
Robert Haas
EDB: http://www.enterprisedb.com

#264Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#263)
Re: run pgindent on a regular basis / scripted manner

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:

+1. I think this is more annoying than the status quo ante.

Although ... I do think it's spared me some rebasing pain, and that
does have some real value. I wonder if we could think of other
alternatives.

An alternative I was thinking about after reading your earlier email
was going back to the status quo ante, but doing the manual tree-wide
reindents significantly more often than once a year. Adding one at
the conclusion of each commitfest would be a natural thing to do,
for instance. It's hard to say what frequency would lead to the
least rebasing pain, but we know once-a-year isn't ideal.

For example, maybe we could have a bot. If you push a
commit that's not indented properly, the bot reindents the tree,
updates git-blame-ignore-revs, and sends you an email admonishing you
for your error.

I'm absolutely not in favor of completely-automated reindents.
pgindent is a pretty stupid tool and it will sometimes do stupid
things, which you have to correct for by tweaking the input
formatting. The combination of the tool and human supervision
generally produces pretty good results, but the tool alone
not so much.

Or we could have a server-side hook that will refuse
the misindented commit, with some kind of override for emergency
situations.

Even though I'm in the camp that would like the tree correctly
indented at all times, I remain very much against a commit hook.
I think that'd be significantly more annoying than the current
situation, which you're already unhappy about the annoying-ness of.

The bottom line here, I think, is that there's a subset of committers
that would like perfectly mechanically-indented code at all times,
and there's another subset that just doesn't care that much.
We don't (and shouldn't IMO) have a mechanism to let one set force
their views on the other set. The current approach is clearly
insufficient for that, and I don't think trying to institute stronger
enforcement is going to make anybody happy.

regards, tom lane

In reply to: Tom Lane (#264)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:

+1. I think this is more annoying than the status quo ante.

Although ... I do think it's spared me some rebasing pain, and that
does have some real value. I wonder if we could think of other
alternatives.

An alternative I was thinking about after reading your earlier email
was going back to the status quo ante, but doing the manual tree-wide
reindents significantly more often than once a year. Adding one at
the conclusion of each commitfest would be a natural thing to do,
for instance. It's hard to say what frequency would lead to the
least rebasing pain, but we know once-a-year isn't ideal.

That seems like the best alternative we have. The old status quo did
occasionally allow code with indentation that *clearly* wasn't up to
project standards to slip in. It could stay that way for quite a few
months at a time. That wasn't great either.

--
Peter Geoghegan

#266Jelte Fennema
postgres@jeltef.nl
In reply to: Tom Lane (#264)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 16:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Or we could have a server-side hook that will refuse
the misindented commit, with some kind of override for emergency
situations.

Even though I'm in the camp that would like the tree correctly
indented at all times, I remain very much against a commit hook.
I think that'd be significantly more annoying than the current
situation, which you're already unhappy about the annoying-ness of.

Why do you think that would be significantly more annoying than the
current situation? Instead of getting delayed feedback you get instant
feedback when you push.

#267Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#264)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 10:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

An alternative I was thinking about after reading your earlier email
was going back to the status quo ante, but doing the manual tree-wide
reindents significantly more often than once a year. Adding one at
the conclusion of each commitfest would be a natural thing to do,
for instance. It's hard to say what frequency would lead to the
least rebasing pain, but we know once-a-year isn't ideal.

Yes. I suspect once a commitfest still wouldn't be often enough. Maybe
once a month or something would be. But I'm not sure. You might rebase
once over the misindented commit and then have to rebase again over
the indent that fixed it. There's not really anything that quite
substitutes for doing it right on every commit.

The bottom line here, I think, is that there's a subset of committers
that would like perfectly mechanically-indented code at all times,
and there's another subset that just doesn't care that much.
We don't (and shouldn't IMO) have a mechanism to let one set force
their views on the other set. The current approach is clearly
insufficient for that, and I don't think trying to institute stronger
enforcement is going to make anybody happy.

I mean, I think we DO have such a mechanism. Everyone agrees that the
buildfarm has to stay green, and we have a buildfarm member that
checks pgindent, so that means everyone has to pgindent. We could
decide to kill that buildfarm member, in which case we go back to
people not having to pgindent, but right now they do.

And if it's going to remain the policy, it's better to enforce that
policy earlier rather than later. I mean, what is the point of having
a system where we let people do the wrong thing and then publicly
embarrass them afterwards? How is that better than preventing them
from doing the wrong thing in the first place? Even if they don't
subjectively feel embarrassed, nobody likes having to log back on in
the evening or the weekend and clean up after something they thought
they were done with.

In fact, that particular experience is one of the worst things about
being a committer. It actively discourages me, at least, from trying
to get other people's patches committed. This particular problem is
minor, but the overall experience of trying to get things committed is
that you have to check 300 things for every patch and if you get every
one of them right then nothing happens and if you get one of them
wrong then you get a bunch of irritated emails criticizing your
laziness, sloppiness, or whatever, and you have to drop everything to
go fix it immediately. What a deal! I'm sure this isn't the only
reason why we have such a huge backlog of patches needing committer
attention, but it sure doesn't help. And there is absolutely zero need
for this to be yet another thing that you can find out you did wrong
in the 1-24 hour period AFTER you type 'git push'.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#267)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 8:01 AM Robert Haas <robertmhaas@gmail.com> wrote:

In fact, that particular experience is one of the worst things about
being a committer. It actively discourages me, at least, from trying
to get other people's patches committed. This particular problem is
minor, but the overall experience of trying to get things committed is
that you have to check 300 things for every patch and if you get every
one of them right then nothing happens and if you get one of them
wrong then you get a bunch of irritated emails criticizing your
laziness, sloppiness, or whatever, and you have to drop everything to
go fix it immediately. What a deal!

Yep. Enforcing perfect indentation on koel necessitates rechecking
indentation after each and every last-minute fixup affecting C code --
the interactions makes it quite a bit harder to get everything right
on the first push. For example, if I spot an issue with a comment
during final pre-commit review, and fixup that commit, I have to run
pgindent again. On a long enough timeline, I'm going to forget to do
that.

--
Peter Geoghegan

#269Jelte Fennema
postgres@jeltef.nl
In reply to: Robert Haas (#263)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 16:04, Robert Haas <robertmhaas@gmail.com> wrote:

What I really dislike about the current situation is that
it's doubling down on the idea that committers have to be perfect and
get everything right every time. Turns out, that's hard to do. If not,
why do people keep screwing things up? Somebody could theorize - and
this seems to be Tom and Jelte's theory, though perhaps I'm
misinterpreting their comments - that the people who have made
mistakes here are just lazy, and what they need to do is up their
game.

To clarify, I did not intend to imply people that commit unindented
code are lazy. It's expected that humans forget to run pgindent before
committing from time to time (I do too). That's why I proposed a
server side git hook to reject badly indented commits very early in
this thread. But some others said that buildfarm animals were the way
to go for Postgres development flow. And since I'm not a committer I
left it at that. I was already happy enough that there was consensus
on indenting continuously, so that the semi-regular rebases for the
few open CF entries that I have are a lot less annoying.

But based on the current feedback I think we should seriously consider
a server-side "update" git hook again. People are obviously not
perfect machines. And for whatever reason not everyone installs the
pre-commit hook from the wiki. So the koel keeps complaining. A
server-side hook would solve all of this IMHO.

#270Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#268)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 11:16 AM Peter Geoghegan <pg@bowt.ie> wrote:

Yep. Enforcing perfect indentation on koel necessitates rechecking
indentation after each and every last-minute fixup affecting C code --
the interactions makes it quite a bit harder to get everything right
on the first push. For example, if I spot an issue with a comment
during final pre-commit review, and fixup that commit, I have to run
pgindent again. On a long enough timeline, I'm going to forget to do
that.

I also just discovered that my pre-commit hook doesn't work if I pull
commits into master by cherry-picking. I had thought that I could have
my hook just check my commits to master and not all of my local dev
branches where I really don't want to mess with this when I'm just
banging out a rough draft of something. But now I see that I'm going
to need to work harder on this if I actually want it to catch all the
ways I might screw this up.

--
Robert Haas
EDB: http://www.enterprisedb.com

#271Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema (#269)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 11:23 AM Jelte Fennema <postgres@jeltef.nl> wrote:

To clarify, I did not intend to imply people that commit unindented
code are lazy. It's expected that humans forget to run pgindent before
committing from time to time (I do too). That's why I proposed a
server side git hook to reject badly indented commits very early in
this thread. But some others said that buildfarm animals were the way
to go for Postgres development flow. And since I'm not a committer I
left it at that. I was already happy enough that there was consensus
on indenting continuously, so that the semi-regular rebases for the
few open CF entries that I have are a lot less annoying.

Thanks for clarifying. I didn't really think you were trying to be
accusatory, but I didn't really understand what else to think either,
so this is helpful context.

But based on the current feedback I think we should seriously consider
a server-side "update" git hook again. People are obviously not
perfect machines. And for whatever reason not everyone installs the
pre-commit hook from the wiki. So the koel keeps complaining. A
server-side hook would solve all of this IMHO.

One potential problem with a server-side hook is that if you back-port
a commit to older branches and then push the commits all together
(which is my workflow) then you might get failure to push on some
branches but not others. I don't know if there's any way to avoid
that, but it seems not great. You could think of enforcing the policy
only on master to try to avoid this, but that still leaves a risk that
you manage to push to all the back-branches and not to master.

--
Robert Haas
EDB: http://www.enterprisedb.com

In reply to: Robert Haas (#270)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 8:24 AM Robert Haas <robertmhaas@gmail.com> wrote:

I also just discovered that my pre-commit hook doesn't work if I pull
commits into master by cherry-picking. I had thought that I could have
my hook just check my commits to master and not all of my local dev
branches where I really don't want to mess with this when I'm just
banging out a rough draft of something. But now I see that I'm going
to need to work harder on this if I actually want it to catch all the
ways I might screw this up.

Once you figure all that out, you're still obligated to hand-polish
typedefs.list to be consistent with whatever Bruce's machine's copy of
objdump does (or is it Tom's?). You need to sort the entries so they
kinda look like they originated from the same source as existing
entries, since my Debian machine seems to produce somewhat different
results to RHEL, for whatever reason. It's hard to imagine a worse use
of committer time.

I think that something like this new policy could work if the
underlying tooling was very easy to use and gave perfectly consistent
results on everybody's development machine. Obviously, that just isn't
the case.

--
Peter Geoghegan

#273Jelte Fennema
postgres@jeltef.nl
In reply to: Peter Geoghegan (#272)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 17:47, Peter Geoghegan <pg@bowt.ie> wrote:

Once you figure all that out, you're still obligated to hand-polish
typedefs.list to be consistent with whatever Bruce's machine's copy of
objdump does (or is it Tom's?). You need to sort the entries so they
kinda look like they originated from the same source as existing
entries, since my Debian machine seems to produce somewhat different
results to RHEL, for whatever reason. It's hard to imagine a worse use
of committer time.

I think that something like this new policy could work if the
underlying tooling was very easy to use and gave perfectly consistent
results on everybody's development machine. Obviously, that just isn't
the case.

To make koel pass you don't need to worry about hand-polishing
typedefs.list. koel uses the typedefs.list that's committed into the
repo, just like when you run pgindent yourself. If you forget to
update the typedefs.list with new types, then worst case the pgindent
output will look weird. But it will look weird both on your own
machine and on koel. So afaik the current tooling should give
perfectly consistent results on everybody's development machine. If
you have an example of where it doesn't then we should fix that
problem.

Show quoted text

On Tue, 17 Oct 2023 at 17:47, Peter Geoghegan <pg@bowt.ie> wrote:

On Tue, Oct 17, 2023 at 8:24 AM Robert Haas <robertmhaas@gmail.com> wrote:

I also just discovered that my pre-commit hook doesn't work if I pull
commits into master by cherry-picking. I had thought that I could have
my hook just check my commits to master and not all of my local dev
branches where I really don't want to mess with this when I'm just
banging out a rough draft of something. But now I see that I'm going
to need to work harder on this if I actually want it to catch all the
ways I might screw this up.

Once you figure all that out, you're still obligated to hand-polish
typedefs.list to be consistent with whatever Bruce's machine's copy of
objdump does (or is it Tom's?). You need to sort the entries so they
kinda look like they originated from the same source as existing
entries, since my Debian machine seems to produce somewhat different
results to RHEL, for whatever reason. It's hard to imagine a worse use
of committer time.

I think that something like this new policy could work if the
underlying tooling was very easy to use and gave perfectly consistent
results on everybody's development machine. Obviously, that just isn't
the case.

--
Peter Geoghegan

In reply to: Jelte Fennema (#273)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 9:03 AM Jelte Fennema <postgres@jeltef.nl> wrote:

To make koel pass you don't need to worry about hand-polishing
typedefs.list. koel uses the typedefs.list that's committed into the
repo, just like when you run pgindent yourself. If you forget to
update the typedefs.list with new types, then worst case the pgindent
output will look weird. But it will look weird both on your own
machine and on koel.

That's beside the point. The point is that I'm obligated to keep
typedef.list up to date in general, a task that is made significantly
harder by random objdump implementation details. And I probably need
to do this not just once per commit, but several times, since in
practice I need to defensively run and rerun pgindent as the patch is
tweaked.

--
Peter Geoghegan

#275Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Geoghegan (#274)
Re: run pgindent on a regular basis / scripted manner

Peter Geoghegan <pg@bowt.ie> writes:

That's beside the point. The point is that I'm obligated to keep
typedef.list up to date in general, a task that is made significantly
harder by random objdump implementation details. And I probably need
to do this not just once per commit, but several times, since in
practice I need to defensively run and rerun pgindent as the patch is
tweaked.

Hmm, I've not found it that hard to manage the typedefs list.
If I run pgindent and it adds weird spacing around uses of a new
typedef name, I go "oh, I better add that to the list" and do so.
End of problem. There's not a requirement that you remove disused
typedef names, nor that you alphabetize perfectly. I'm content
to update those sorts of details from the buildfarm's list once a
year or so.

This does assume that you inspect pgindent's changes rather than
just accepting them blindly --- but as I commented upthread, the
tool really requires that anyway.

regards, tom lane

#276Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#271)
Re: run pgindent on a regular basis / scripted manner

Robert Haas <robertmhaas@gmail.com> writes:

One potential problem with a server-side hook is that if you back-port
a commit to older branches and then push the commits all together
(which is my workflow) then you might get failure to push on some
branches but not others. I don't know if there's any way to avoid
that, but it seems not great. You could think of enforcing the policy
only on master to try to avoid this, but that still leaves a risk that
you manage to push to all the back-branches and not to master.

Is that actually possible? I had the idea that "git push" is an
atomic operation, ie 100% or nothing. Is it only atomic per-branch?

regards, tom lane

#277Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#275)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 12:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm, I've not found it that hard to manage the typedefs list.
If I run pgindent and it adds weird spacing around uses of a new
typedef name, I go "oh, I better add that to the list" and do so.
End of problem. There's not a requirement that you remove disused
typedef names, nor that you alphabetize perfectly. I'm content
to update those sorts of details from the buildfarm's list once a
year or so.

+1 to all of that. At least for me, managing typedefs.list isn't the
problem. The problem is remembering to actually do it, and keep it
updated as the patch set is adjusted and rebased.

--
Robert Haas
EDB: http://www.enterprisedb.com

#278Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#276)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 12:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

One potential problem with a server-side hook is that if you back-port
a commit to older branches and then push the commits all together
(which is my workflow) then you might get failure to push on some
branches but not others. I don't know if there's any way to avoid
that, but it seems not great. You could think of enforcing the policy
only on master to try to avoid this, but that still leaves a risk that
you manage to push to all the back-branches and not to master.

Is that actually possible? I had the idea that "git push" is an
atomic operation, ie 100% or nothing. Is it only atomic per-branch?

I believe so. For instance:

[rhaas pgsql]$ git push rhaas
Enumerating objects: 2980, done.
Counting objects: 100% (2980/2980), done.
Delta compression using up to 16 threads
Compressing objects: 100% (940/940), done.
Writing objects: 100% (2382/2382), 454.52 KiB | 7.70 MiB/s, done.
Total 2382 (delta 2024), reused 1652 (delta 1429), pack-reused 0
remote: Resolving deltas: 100% (2024/2024), completed with 579 local objects.
To ssh://git.postgresql.org/users/rhaas/postgres.git
e434e21e11..2406c4e34c master -> master
! [rejected] walsummarizer2 -> walsummarizer2 (non-fast-forward)
error: failed to push some refs to
'ssh://git.postgresql.org/users/rhaas/postgres.git'
hint: Updates were rejected because a pushed branch tip is behind its remote
hint: counterpart. Check out this branch and integrate the remote changes
hint: (e.g. 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

--
Robert Haas
EDB: http://www.enterprisedb.com

#279Maciek Sakrejda
m.sakrejda@gmail.com
In reply to: Robert Haas (#278)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023, 09:22 Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 17, 2023 at 12:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:
Is that actually possible? I had the idea that "git push" is an
atomic operation, ie 100% or nothing. Is it only atomic per-branch?

I believe so.

Git push does have an --atomic flag to treat the entire push as a single
operation.

#280Jelte Fennema
postgres@jeltef.nl
In reply to: Maciek Sakrejda (#279)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 18:53, Maciek Sakrejda <m.sakrejda@gmail.com> wrote:

Git push does have an --atomic flag to treat the entire push as a single operation.

I decided to play around a bit with server hooks. Attached is a git
"update" hook that rejects pushes to the master branch when the new
HEAD of master does not pass pgindent. It tries to do the minimal
amount of work necessary. Together with the --atomic flag of git push
I think this would work quite well.

Note: It does require that pg_bsd_indent is in PATH. While not perfect
seems like it would be acceptable in practice to me. Its version is
not updated very frequently. So manually updating it on the git server
when we do does not seem like a huge issue to me.

The easiest way to try it out is by cloning the postgres repo in two
different local directories, let's call them A and B. And then
configure directory B to be the origin remote of A. By placing the
update script in B/.git/hooks/ it will execute whenever you push
master from A to B.

Attachments:

updateapplication/octet-stream; name=updateDownload
#281Magnus Hagander
magnus@hagander.net
In reply to: Jelte Fennema (#280)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 10:43 PM Jelte Fennema <postgres@jeltef.nl> wrote:

On Tue, 17 Oct 2023 at 18:53, Maciek Sakrejda <m.sakrejda@gmail.com> wrote:

Git push does have an --atomic flag to treat the entire push as a single operation.

I decided to play around a bit with server hooks. Attached is a git
"update" hook that rejects pushes to the master branch when the new
HEAD of master does not pass pgindent. It tries to do the minimal
amount of work necessary. Together with the --atomic flag of git push
I think this would work quite well.

Note: It does require that pg_bsd_indent is in PATH. While not perfect
seems like it would be acceptable in practice to me. Its version is
not updated very frequently. So manually updating it on the git server
when we do does not seem like a huge issue to me.

If it doesn't know how to rebuild it, aren't we going to be stuck in a
catch-22 if we need to change it in certain ways? Since an old version
of pg_bsd_indent would reject the patch that might include updating
it. (And when it does, one should expect the push to take quite a long
time, but given the infrequency I agree that part is probably not an
issue)

And unless we're only enforcing it on master, we'd also need to make
provisions for different versions of it on different branches, I
think?

Other than that, I agree it's fairly simple. It does nede a lot more
sandboxing than what's in there now, but that's not too hard of a
problem to solve, *if* this is what we want.

(And of course needs to be integrated with the existing script since
AFAIK you can't chain git hooks unless you do it manually - but that's
mostliy mechanical)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#282Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#281)
Re: run pgindent on a regular basis / scripted manner

Magnus Hagander <magnus@hagander.net> writes:

If it doesn't know how to rebuild it, aren't we going to be stuck in a
catch-22 if we need to change it in certain ways? Since an old version
of pg_bsd_indent would reject the patch that might include updating
it. (And when it does, one should expect the push to take quite a long
time, but given the infrequency I agree that part is probably not an
issue)

Everyone who has proposed this has included a caveat that there must
be a way to override the check. Given that minimal expectation, it
shouldn't be too hard to deal with pg_bsd_indent-updating commits.

regards, tom lane

#283David Rowley
dgrowleyml@gmail.com
In reply to: Robert Haas (#261)
Re: run pgindent on a regular basis / scripted manner

On Wed, 18 Oct 2023 at 01:47, Robert Haas <robertmhaas@gmail.com> wrote:

On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:

This policy isn't working.

+1. I think this is more annoying than the status quo ante.

Maybe there are two camps of committers here; ones who care about
committing correctly indented code and ones who do not.

I don't mean that in a bad way, but if a committer just does not care
about correctly pgindented code then he/she likely didn't suffer
enough pain from how things used to be... having to unindent all the
unrelated indent fixes that were committed since the last pgindent run
I personally found slow/annoying/error-prone.

What I do now seems significantly easier. Assuming it's just 1 commit, just:

perl src/tools/pgindent/pgindent --commit HEAD
git diff # manual check to see if everything looks sane.
git commit -a --fixup HEAD
git rebase -i HEAD~2

If we were to go back to how it was before, then why should I go to
the trouble of unindenting all the unrelated indents from code changed
by other committers since the last pgindent run when those committers
are not bothering to and making my job harder each time they commit
incorrectly indented code.

How many of the committers who have broken koel are repeat offenders?
What is their opinion on this?
Did they just forget once or do they hate the process and want to go back?

I'm personally very grateful for all the work that was done to improve
pgindent and set out the new process. I'd really rather not go back to
how things were.

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

I think many of us have scripts we use instead of typing out all the
configure options we want. It's likely pretty easy to add
--enable-indent-checks to those.

David

#284Peter Eisentraut
peter@eisentraut.org
In reply to: David Rowley (#283)
Re: run pgindent on a regular basis / scripted manner

On 18.10.23 06:40, David Rowley wrote:

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

This approach seems like a good improvement, even independent of
everything else we might do about this. Making it easier to use and
less likely to be forgotten. Also, this way, non-committer contributors
can opt-in, if they want to earn bonus points.

#285Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#282)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 11:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

If it doesn't know how to rebuild it, aren't we going to be stuck in a
catch-22 if we need to change it in certain ways? Since an old version
of pg_bsd_indent would reject the patch that might include updating
it. (And when it does, one should expect the push to take quite a long
time, but given the infrequency I agree that part is probably not an
issue)

Everyone who has proposed this has included a caveat that there must
be a way to override the check. Given that minimal expectation, it
shouldn't be too hard to deal with pg_bsd_indent-updating commits.

I haven't managed to fully keep up with the thread, so I missed that.
And i can't directly find it looking back either - but as long as
three's an actual idea for how to do that, the problem goes away :)

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

#286Jelte Fennema
postgres@jeltef.nl
In reply to: Magnus Hagander (#281)
Re: run pgindent on a regular basis / scripted manner

On Tue, 17 Oct 2023 at 23:01, Magnus Hagander <magnus@hagander.net> wrote:

And unless we're only enforcing it on master, we'd also need to make
provisions for different versions of it on different branches, I
think?

Only enforcing on master sounds fine to me, that's what koel is doing
too afaik. In practice this seems to be enough to solve my main issue
of having to manually remove unrelated indents when rebasing my
patches. Enforcing on different branches seems like it would add a lot
of complexity. So I'm not sure that's worth doing at this point, since
currently some committers are proposing to stop enforcing continuous
indentation because of problems with the current flow. I think we
should only enforce it on more branches once we have the flow
mastered.

It does need a lot more
sandboxing than what's in there now, but that's not too hard of a
problem to solve, *if* this is what we want.

Yeah, I didn't bother with that. Since that seems very tightly coupled
with the environment that the git server is running, and I have no
clue what that environment is.

#287Jelte Fennema
postgres@jeltef.nl
In reply to: David Rowley (#283)
Re: run pgindent on a regular basis / scripted manner

On Wed, 18 Oct 2023 at 06:40, David Rowley <dgrowleyml@gmail.com> wrote:

How many of the committers who have broken koel are repeat offenders?

I just checked the commits and there don't seem to be real repeat
offenders. The maximum number of times someone broke koel since its
inception is two. That was the case for only two people. The other 8
people only caused one breakage.

What is their opinion on this?
Did they just forget once or do they hate the process and want to go back?

The commiters that broke koel since its inception are:
- Alexander Korotkov
- Amit Kapila
- Amit Langote
- Andres Freund
- Etsuro Fujita
- Jeff Davis
- Michael Paquier
- Peter Eisentraut
- Tatsuo Ishii
- Tomas Vondra

I included all of them in the To field of this message, in the hope
that they share their viewpoint. Because otherwise it stays guessing
what they think.

But based on the contents of the fixup commits a commonality seems to
be that the fixup only fixes a few lines, quite often touching only
comments. So it seems like the main reason for breaking koel is
forgetting to re-run pgindent after some final cleanup/wording
changes/typo fixes. And that seems like an expected flaw of being
human instead of a robot, which can only be worked around with better
automation.

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

I think --enable-indent-checks sounds like a good improvement to the
status quo. But I'm not confident that it will help remove the cases
where only a comment needs to be re-indented. Do commiters really
always run check-world again when only changing a typo in a comment? I
know I probably wouldn't (or at least not always).

#288David Rowley
dgrowleyml@gmail.com
In reply to: Jelte Fennema (#287)
Re: run pgindent on a regular basis / scripted manner

On Wed, 18 Oct 2023 at 22:07, Jelte Fennema <postgres@jeltef.nl> wrote:

But based on the contents of the fixup commits a commonality seems to
be that the fixup only fixes a few lines, quite often touching only
comments. So it seems like the main reason for breaking koel is
forgetting to re-run pgindent after some final cleanup/wording
changes/typo fixes. And that seems like an expected flaw of being
human instead of a robot, which can only be worked around with better
automation.

I wonder if you might just be assuming these were caused by
last-minute comment adjustments. I may have missed something on the
thread, but it could be that, or it could be due to the fact that
pgindent just simply does more adjustments to comments than it does
with code lines.

On Wed, 18 Oct 2023 at 06:40, David Rowley <dgrowleyml@gmail.com> wrote:

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

I think --enable-indent-checks sounds like a good improvement to the
status quo. But I'm not confident that it will help remove the cases
where only a comment needs to be re-indented. Do commiters really
always run check-world again when only changing a typo in a comment? I
know I probably wouldn't (or at least not always).

I can't speak for others, but I always make edits in a dev branch and
do "make check-world" before doing "git format-patch" before I "git
am" that patch into a clean repo. Before I push, I'll always run
"make check-world" again as sometimes master might have moved on a few
commits from where the dev branch was taken (perhaps I need to update
the expected output of some newly added EXPLAIN tests if say doing a
planner adjustment). I personally never adjust any code or comments
after the git am. I only sometimes adjust the commit message.

So, in theory at least, if --enable-indent-checks existed and I used
it, I shouldn't break koel... let's see if I just jinxed myself.

It would be good to learn how many of the committers out of the ones
you listed that --enable-indent-checks would have saved from breaking
koel.

David

#289Vik Fearing
vik@postgresfriends.org
In reply to: Tom Lane (#264)
Re: run pgindent on a regular basis / scripted manner

On 10/17/23 16:23, Tom Lane wrote:

An alternative I was thinking about after reading your earlier email was
going back to the status quo ante, but doing the manual tree-wide
reindents significantly more often than once a year. Adding one at the
conclusion of each commitfest would be a natural thing to do, for
instance. It's hard to say what frequency would lead to the least
rebasing pain, but we know once-a-year isn't ideal.

This is basically how the SQL Committee functions. The Change Proposals
(patches) submitted every meeting (commitfest) are always against the
documents as they exist after the application of papers (commits) from
the previous meeting.

One major difference is that Change Proposals are against the text, and
patches are against the code. It is not dissimilar to people saying
what our documentation should say, and then someone implementing that
change.

So I am in favor of a pgindent run *at least* at the end of each
commitfest, giving a full month for patch authors to rebase before the
next fest.
--
Vik Fearing

#290Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#284)
Re: run pgindent on a regular basis / scripted manner

On Wed, Oct 18, 2023 at 3:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:

On 18.10.23 06:40, David Rowley wrote:

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

This approach seems like a good improvement, even independent of
everything else we might do about this. Making it easier to use and
less likely to be forgotten. Also, this way, non-committer contributors
can opt-in, if they want to earn bonus points.

Yeah. I'm not going to push anything that doesn't pass make
check-world, so this is appealing in that something that I'm already
doing would (or could be configured to) catch this problem also.

--
Robert Haas
EDB: http://www.enterprisedb.com

#291Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#267)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 17, 2023 at 11:01:44AM -0400, Robert Haas wrote:

In fact, that particular experience is one of the worst things about
being a committer. It actively discourages me, at least, from trying
to get other people's patches committed. This particular problem is
minor, but the overall experience of trying to get things committed is
that you have to check 300 things for every patch and if you get every
one of them right then nothing happens and if you get one of them
wrong then you get a bunch of irritated emails criticizing your
laziness, sloppiness, or whatever, and you have to drop everything to
go fix it immediately. What a deal! I'm sure this isn't the only
reason why we have such a huge backlog of patches needing committer
attention, but it sure doesn't help. And there is absolutely zero need
for this to be yet another thing that you can find out you did wrong
in the 1-24 hour period AFTER you type 'git push'.

This comment resonated with me. I do all my git operations with shell
scripts so I can check for all the mistakes I have made in the past and
generate errors. Even with all of that, committing is an
anxiety-producing activity because any small mistake is quickly revealed
to the world. There aren't many things I do in a day where mistakes are
so impactful.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#292Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#254)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-10-16 20:45:00 -0400, Tom Lane wrote:

Peter Geoghegan <pg@bowt.ie> writes:
2. We could raise awareness of this issue by adding indent verification
to CI testing. I hesitate to suggest that, though, for a couple of
reasons:
2a. It seems fairly expensive, though I might be misjudging.

Compared to other things it's not that expensive. On my workstation, which is
slower on a per-core basis than CI, a whole tree pgindent --silent-diff takes
6.8s. For That's doing things serially, it shouldn't be that hard to parallelize
the per-file processing.

For comparison, the current compiler warnings task takes 6-15min, depending on
the state of the ccache "database". Even when ccache is primed, running
cpluspluscheck or headerscheck is ~30s each. Adding a few more seconds for an
indentation check wouldn't be a problem.

2b. It's often pretty handy to submit patches that aren't fully
indent-clean; I have such a patch in flight right now at [1].

2b could be ameliorated by making the indent check be a separate
test process that doesn't obscure the results of other testing.

The compiler warnings task already executes a number of tests even if prior
tests have failed (to be able to find compiler warnings in different compilers
at once). Adding pgindent cleanliness to that would be fairly simple.

I still think that one of the more important things we ought to do is to make
it trivial to check if code is correctly indented and reindent it for the
user. I've posted a preliminary patch to add a 'indent-tree' target a few
months back, at
/messages/by-id/20230527184201.2zdorrijg2inqt6v@alap3.anarazel.de

I've updated that patch, now it has
- indent-tree, reindents the entire tree
- indent-head, which pgindent --commit=HEAD
- indent-check, fails if the tree isn't correctly indented
- indent-diff, like indent-check, but also shows the diff

If we tought pgindent to emit the list of files it processes to a dependency
file, we could make it cheap to call indent-check repeatedly, by teaching
meson/ninja to not reinvoke it if the input files haven't changed. Personally
that'd make it more bearable to script indentation checks to happen
frequently.

I'll look into writing a command to update typedefs.list with all the local
changes.

Greetings,

Andres Freund

Attachments:

v2-0001-Add-meson-targets-to-run-pgindent.patchtext/x-diff; charset=us-asciiDownload
From 288c109286fda4dbd40fab345fbaf71d91d2db39 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 27 May 2023 11:38:27 -0700
Subject: [PATCH v2] Add meson targets to run pgindent

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 meson.build                 | 36 ++++++++++++++++++++++++++++++++++++
 src/tools/pgindent/pgindent | 12 ++++++++++--
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 862c955453f..7d4d02e7624 100644
--- a/meson.build
+++ b/meson.build
@@ -3013,6 +3013,42 @@ run_target('install-test-files',
 
 
 
+###############################################################
+# Indentation and similar targets
+###############################################################
+
+indent_base = [perl, files('src/tools/pgindent/pgindent'),
+    '--indent', pg_bsd_indent.full_path(),
+    '--sourcetree=@SOURCE_ROOT@']
+
+# reindent the entire tree
+run_target('indent-tree',
+  command: indent_base + ['.'],
+  depends: pg_bsd_indent
+)
+
+# reindent the last commit
+run_target('indent-head',
+  command: indent_base + ['--commit=HEAD'],
+  depends: pg_bsd_indent
+)
+
+# silently check if any file is not corectly indented (fails the build if
+# there are differences)
+run_target('indent-check',
+  command: indent_base + ['--silent-diff', '.'],
+  depends: pg_bsd_indent
+)
+
+# show a diff of all the unindented files (doesn't fail the build if there are
+# differences)
+run_target('indent-diff',
+  command: indent_base + ['--show-diff', '.'],
+  depends: pg_bsd_indent
+)
+
+
+
 ###############################################################
 # Test prep
 ###############################################################
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index bce63d95daf..08b0c95814f 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,7 +23,7 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, @excludes,
 	$indent, $build, $show_diff,
-	$silent_diff, $help, @commits,);
+	$silent_diff, $sourcetree, $help, @commits,);
 
 $help = 0;
 
@@ -35,7 +35,8 @@ my %options = (
 	"excludes=s" => \@excludes,
 	"indent=s" => \$indent,
 	"show-diff" => \$show_diff,
-	"silent-diff" => \$silent_diff,);
+	"silent-diff" => \$silent_diff,
+	"sourcetree=s" => \$sourcetree);
 GetOptions(%options) || usage("bad command line argument");
 
 usage() if $help;
@@ -53,6 +54,13 @@ $typedefs_file ||= $ENV{PGTYPEDEFS};
 # get indent location for environment or default
 $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
+# When invoked from the build directory, change into source tree, otherwise
+# the heuristics in locate_sourcedir() don't work.
+if (defined $sourcetree)
+{
+	chdir $sourcetree;
+}
+
 my $sourcedir = locate_sourcedir();
 
 # if it's the base of a postgres tree, we will exclude the files
-- 
2.38.0

#293Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#292)
3 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-10-18 12:15:51 -0700, Andres Freund wrote:

I still think that one of the more important things we ought to do is to make
it trivial to check if code is correctly indented and reindent it for the
user. I've posted a preliminary patch to add a 'indent-tree' target a few
months back, at
/messages/by-id/20230527184201.2zdorrijg2inqt6v@alap3.anarazel.de

I've updated that patch, now it has
- indent-tree, reindents the entire tree
- indent-head, which pgindent --commit=HEAD
- indent-check, fails if the tree isn't correctly indented
- indent-diff, like indent-check, but also shows the diff

If we tought pgindent to emit the list of files it processes to a dependency
file, we could make it cheap to call indent-check repeatedly, by teaching
meson/ninja to not reinvoke it if the input files haven't changed. Personally
that'd make it more bearable to script indentation checks to happen
frequently.

I'll look into writing a command to update typedefs.list with all the local
changes.

It turns out that updating the in-tree typedefs.list would be very noisy. On
my local linux system I get
1 file changed, 422 insertions(+), 1 deletion(-)

On a mac mini I get
1 file changed, 351 insertions(+), 1 deletion(-)

We could possibly address that by updating the in-tree typedefs.list a bit
more aggressively. Sure looks like the source systems are on the older side.

But in the attached patch I've implemented this slightly differently. If the
tooling to do so is available, the indent-* targets explained above,
use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
which is the combination of a src/tools/pgindent/typedefs.list.local generated
for the local binaries/libraries and the source tree
src/tools/pgindent/typedefs.list.

src/tools/pgindent/typedefs.list.local is generated in fragments, with one
fragment for each build target. That way the whole file doesn't have to be
regenerated all the time, which can save a good bit of time (althoug obviously
less when hacking on the backend).

This makes it quite quick to locally indent, without needing to manually
fiddle around with manually modifying typedefs.list or using a separate
typedefs.list.

In a third commit I added a 'nitpick' configure time option, defaulting to
off, which runs an indentation check. The failure mode of that currently isn't
very helpful though, as it just uses --silent-diff.

All of this currently is meson only, largely because I don't feel like
spending the time messing with the configure build, particularly before there
is any agreement on this being the thing to do.

Greetings,

Andres Freund

Attachments:

v3-0001-Add-meson-targets-to-run-pgindent.patchtext/x-diff; charset=us-asciiDownload
From f7520fd87764cddbde9d6c5d25fdfe5314ec5cbc Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 27 May 2023 11:38:27 -0700
Subject: [PATCH v3 1/3] Add meson targets to run pgindent

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 meson.build                 | 38 +++++++++++++++++++++++++++++++++++++
 src/tools/pgindent/pgindent | 12 ++++++++++--
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 862c955453f..2cfeb60eb07 100644
--- a/meson.build
+++ b/meson.build
@@ -3013,6 +3013,44 @@ run_target('install-test-files',
 
 
 
+###############################################################
+# Indentation and similar targets
+###############################################################
+
+indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
+    '--indent', pg_bsd_indent.full_path(),
+    '--sourcetree=@SOURCE_ROOT@']
+indent_depend = [pg_bsd_indent]
+
+# reindent the entire tree
+# Reindent the entire tree
+run_target('indent-tree',
+  command: indent_base_cmd + ['.'],
+  depends: indent_depend
+)
+
+# Reindent the last commit
+run_target('indent-head',
+  command: indent_base_cmd + ['--commit=HEAD'],
+  depends: indent_depend
+)
+
+# Silently check if any file is not corectly indented (fails the build if
+# there are differences)
+run_target('indent-check',
+  command: indent_base_cmd + ['--silent-diff', '.'],
+  depends: indent_depend
+)
+
+# Show a diff of all the unindented files (doesn't fail the build if there are
+# differences)
+run_target('indent-diff',
+  command: indent_base_cmd + ['--show-diff', '.'],
+  depends: indent_depend
+)
+
+
+
 ###############################################################
 # Test prep
 ###############################################################
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index bce63d95daf..08b0c95814f 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,7 +23,7 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, @excludes,
 	$indent, $build, $show_diff,
-	$silent_diff, $help, @commits,);
+	$silent_diff, $sourcetree, $help, @commits,);
 
 $help = 0;
 
@@ -35,7 +35,8 @@ my %options = (
 	"excludes=s" => \@excludes,
 	"indent=s" => \$indent,
 	"show-diff" => \$show_diff,
-	"silent-diff" => \$silent_diff,);
+	"silent-diff" => \$silent_diff,
+	"sourcetree=s" => \$sourcetree);
 GetOptions(%options) || usage("bad command line argument");
 
 usage() if $help;
@@ -53,6 +54,13 @@ $typedefs_file ||= $ENV{PGTYPEDEFS};
 # get indent location for environment or default
 $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
+# When invoked from the build directory, change into source tree, otherwise
+# the heuristics in locate_sourcedir() don't work.
+if (defined $sourcetree)
+{
+	chdir $sourcetree;
+}
+
 my $sourcedir = locate_sourcedir();
 
 # if it's the base of a postgres tree, we will exclude the files
-- 
2.38.0

v3-0002-heavily-wip-update-typedefs.patchtext/x-diff; charset=us-asciiDownload
From b8d31613acfc9d554640b4e292f366d79084bb8e Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 18 Oct 2023 16:18:58 -0700
Subject: [PATCH v3 2/3] heavily wip: update typedefs

---
 meson.build                           | 14 +++++-
 src/backend/jit/llvm/meson.build      |  2 +-
 src/backend/snowball/meson.build      |  2 +-
 src/timezone/meson.build              |  2 +-
 src/tools/pgindent/merge_typedefs     | 24 +++++++++
 src/tools/pgindent/meson.build        | 70 +++++++++++++++++++++++++++
 src/tools/pgindent/typedefs_from_objs | 42 ++++++++++++++++
 7 files changed, 152 insertions(+), 4 deletions(-)
 create mode 100755 src/tools/pgindent/merge_typedefs
 create mode 100644 src/tools/pgindent/meson.build
 create mode 100755 src/tools/pgindent/typedefs_from_objs

diff --git a/meson.build b/meson.build
index 2cfeb60eb07..70f6b680c2c 100644
--- a/meson.build
+++ b/meson.build
@@ -2579,6 +2579,7 @@ add_project_link_arguments(ldflags, language: ['c', 'cpp'])
 
 # list of targets for various alias targets
 backend_targets = []
+data_targets = []
 bin_targets = []
 pl_targets = []
 contrib_targets = []
@@ -2894,6 +2895,8 @@ subdir('src/interfaces/ecpg/test')
 
 subdir('doc/src/sgml')
 
+subdir('src/tools/pgindent')
+
 generated_sources_ac += {'': ['GNUmakefile']}
 
 # After processing src/test, add test_install_libs to the testprep_targets
@@ -2982,6 +2985,7 @@ endif
 all_built = [
   backend_targets,
   bin_targets,
+  data_targets,
   libpq_st,
   pl_targets,
   contrib_targets,
@@ -3017,12 +3021,20 @@ run_target('install-test-files',
 # Indentation and similar targets
 ###############################################################
 
+# If the dependencies for generating a local typedefs.list are fulfilled, we
+# use a combination of a locally built and the source tree's typededefs.list
+# file (see src/tools/pgindent/meson.build) for reindenting. That ensures
+# newly added typedefs are indented correctly.
 indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
     '--indent', pg_bsd_indent.full_path(),
     '--sourcetree=@SOURCE_ROOT@']
 indent_depend = [pg_bsd_indent]
 
-# reindent the entire tree
+if typedefs_supported
+  indent_base_cmd += ['--typedefs', typedefs_merged.full_path()]
+  indent_depend += typedefs_merged
+endif
+
 # Reindent the entire tree
 run_target('indent-tree',
   command: indent_base_cmd + ['.'],
diff --git a/src/backend/jit/llvm/meson.build b/src/backend/jit/llvm/meson.build
index 8ffaf414609..7e2f6d5bd5c 100644
--- a/src/backend/jit/llvm/meson.build
+++ b/src/backend/jit/llvm/meson.build
@@ -82,4 +82,4 @@ llvmjit_types = custom_target('llvmjit_types.bc',
   install_dir: dir_lib_pkg,
   depfile: '@BASENAME@.c.bc.d',
 )
-backend_targets += llvmjit_types
+data_targets += llvmjit_types
diff --git a/src/backend/snowball/meson.build b/src/backend/snowball/meson.build
index 0f669c0bf3c..69e777399fd 100644
--- a/src/backend/snowball/meson.build
+++ b/src/backend/snowball/meson.build
@@ -94,4 +94,4 @@ install_subdir('stopwords',
 )
 
 backend_targets += dict_snowball
-backend_targets += snowball_create
+data_targets += snowball_create
diff --git a/src/timezone/meson.build b/src/timezone/meson.build
index 7b85a01c6bd..779a51d85a4 100644
--- a/src/timezone/meson.build
+++ b/src/timezone/meson.build
@@ -50,7 +50,7 @@ if get_option('system_tzdata') == ''
     install_dir: dir_data,
   )
 
-  bin_targets += tzdata
+  data_targets += tzdata
 endif
 
 subdir('tznames')
diff --git a/src/tools/pgindent/merge_typedefs b/src/tools/pgindent/merge_typedefs
new file mode 100755
index 00000000000..c2e1f446c15
--- /dev/null
+++ b/src/tools/pgindent/merge_typedefs
@@ -0,0 +1,24 @@
+#!/usr/bin/env python3
+
+# Tool to merge multiple typedefs files into one
+
+import argparse
+import os
+import subprocess
+import sys
+
+parser = argparse.ArgumentParser(
+    description='merge multiple typedef files')
+
+parser.add_argument('--output', type=argparse.FileType('w'), required=True)
+parser.add_argument('input', type=argparse.FileType('r'), nargs='*')
+
+args = parser.parse_args()
+
+typedefs = set()
+
+for input in args.input:
+    for typedef in input.readlines():
+        typedefs.add(typedef.strip())
+
+print('\n'.join(sorted(typedefs)), file=args.output)
diff --git a/src/tools/pgindent/meson.build b/src/tools/pgindent/meson.build
new file mode 100644
index 00000000000..c0801a6c86c
--- /dev/null
+++ b/src/tools/pgindent/meson.build
@@ -0,0 +1,70 @@
+# Currently the code requires meson 0.63 or upwards. This could likely be
+# lifted (in fact, some older versions actually work, but only some), but for
+# now it's not crucial.
+typedefs_supported = meson.version().version_compare('>=0.63')
+if not typedefs_supported
+  subdir_done()
+endif
+
+typedefs_from_objs = files('typedefs_from_objs')
+typedefs_from_objs_cmd = [typedefs_from_objs, '--host', host_system, '--output', '@OUTPUT@', '@INPUT@']
+merge_typedefs = files('merge_typedefs')
+merge_typedefs_cmd = [merge_typedefs, '--output', '@OUTPUT@', '@INPUT@']
+
+# XXX: This list of targets should likely not be maintained here
+typedef_src_tgts = [
+  backend_targets,
+  bin_targets,
+  [libpq_st],
+  pl_targets,
+  contrib_targets,
+  ecpg_targets,
+]
+
+# We generate partial typedefs files for each binary/library. That makes using
+# this during incremental development much faster.
+#
+# The reason we process the object files instead of executables is that that
+# doesn't work on macos. There doesn't seem to be a reason to target
+# executables directly on other platforms.
+typedef_tgts = []
+foreach tgts : typedef_src_tgts
+  foreach tgt : tgts
+    # can't use tgt.name(), as we have have targets that differ just in suffix
+    name = fs.name(tgt.full_path())
+    tdname = 'typedefs.list.local-@0@'.format(name)
+    objs = tgt.extract_all_objects(recursive: true)
+    typedef_tgts += custom_target(tdname,
+                                  output: tdname,
+                                  command: typedefs_from_objs_cmd,
+                                  input: objs,
+                                  build_by_default: false,
+                                  )
+
+  endforeach
+endforeach
+
+# A typedefs.list file for the local build tree. This may not contain typedefs
+# referenced e.g. by platform specific code for other platforms.
+typedefs_local = custom_target('typedefs.list.local', output: 'typedefs.list.local',
+  command: merge_typedefs_cmd,
+  input: typedef_tgts,
+  build_by_default: false,
+)
+
+# The locally generated typedef.list, merge with the source tree version. This
+# is useful for reindenting code, particularly when the local tree has new
+# types added.
+typedefs_merged = custom_target('typedefs.list.merged', output: 'typedefs.list.merged',
+  command: merge_typedefs_cmd,
+  input: [typedefs_local, files('typedefs.list')],
+  build_by_default: false,
+)
+
+# FIXME: As-is this is rarely useful, as the local typedefs file will likely
+# contain too many changes.
+if cp.found()
+  run_target('update-typedefs',
+    command: [cp, typedefs_merged, '@SOURCE_ROOT@/src/tools/pgindent/typedefs.list'],
+  )
+endif
diff --git a/src/tools/pgindent/typedefs_from_objs b/src/tools/pgindent/typedefs_from_objs
new file mode 100755
index 00000000000..821c84cb3d8
--- /dev/null
+++ b/src/tools/pgindent/typedefs_from_objs
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+
+# Tool to extract typedefs from object files.
+
+import argparse
+import re
+import shutil
+import subprocess
+import sys
+
+parser = argparse.ArgumentParser(
+    description='generate typedefs file for a set of object files')
+
+parser.add_argument('--output', type=argparse.FileType('w'), required=True)
+parser.add_argument('--host', type=str, required=True)
+parser.add_argument('input', nargs='*')
+
+args = parser.parse_args()
+
+if args.host == 'linux':
+    find_td_re = re.compile(r'^[^\n]+TAG_typedef\)\n[^\n]+DW_AT_name\s*:\s*\([^\)]+\): ([^\n]+)$', re.MULTILINE)
+    # FIXME: should probably be set by the caller? Except that the same binary
+    # name behaves very differently on different platforms :/
+    cmd = [shutil.which('objdump'), '-Wi']
+elif args.host == 'darwin':
+    find_td_re = re.compile(r'^[^\n]+TAG_typedef\n\s*DW_AT_type[^\n]+\n\s+DW_AT_name\s*\(\"([^\n]+)\"\)$', re.MULTILINE)
+    cmd = [shutil.which('dwarfdump')]
+else:
+    raise f'unsupported platform: {args.host}'
+
+lcmd = cmd + args.input
+sp = subprocess.run(lcmd, stdout=subprocess.PIPE, universal_newlines=True)
+if sp.returncode != 0:
+    print(f'{lcmd} failed with return code {sp.returncode}', file=sys.stderr)
+    sys.exit(sp.returncode)
+
+fa = find_td_re.findall(sp.stdout)
+
+for typedef in fa:
+    print(typedef, file=args.output)
+
+sys.exit(0)
-- 
2.38.0

v3-0003-WIP-meson-nitpick-mode-test-checking-indentation.patchtext/x-diff; charset=us-asciiDownload
From 6688a89700af997b5adaa55f349c8db53a86290b Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 18 Oct 2023 17:51:21 -0700
Subject: [PATCH v3 3/3] WIP: meson: nitpick mode + test checking indentation

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 meson.build       | 17 +++++++++++++++--
 meson_options.txt |  3 +++
 .cirrus.tasks.yml |  1 +
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 70f6b680c2c..86a5d0995a1 100644
--- a/meson.build
+++ b/meson.build
@@ -44,6 +44,7 @@ cc = meson.get_compiler('c')
 not_found_dep = dependency('', required: false)
 thread_dep = dependency('threads')
 auto_features = get_option('auto_features')
+nitpick_mode = get_option('nitpick')
 
 
 
@@ -3025,9 +3026,10 @@ run_target('install-test-files',
 # use a combination of a locally built and the source tree's typededefs.list
 # file (see src/tools/pgindent/meson.build) for reindenting. That ensures
 # newly added typedefs are indented correctly.
-indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
+indent_base_args = [files('src/tools/pgindent/pgindent'),
     '--indent', pg_bsd_indent.full_path(),
-    '--sourcetree=@SOURCE_ROOT@']
+    '--sourcetree=@0@'.format(meson.current_source_dir())]
+indent_base_cmd = [perl, indent_base_args]
 indent_depend = [pg_bsd_indent]
 
 if typedefs_supported
@@ -3364,6 +3366,17 @@ if meson.version().version_compare('>=0.57')
 endif
 
 
+# If we were asked to be nitpicky, add a test that fails if sources aren't
+# properly indented
+if nitpick_mode
+  test('indent-check',
+    perl, args: [indent_base_args, '--silent-diff', '.'],
+    depends: indent_depend,
+    suite: 'source',
+    priority: 10)
+endif
+
+
 
 ###############################################################
 # Pseudo targets
diff --git a/meson_options.txt b/meson_options.txt
index d2f95cfec36..90cb2177a44 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -52,6 +52,9 @@ option('atomics', type: 'boolean', value: true,
 option('spinlocks', type: 'boolean', value: true,
   description: 'Use spinlocks')
 
+option('nitpick', type: 'boolean', value: false,
+  description: 'annoy the hell out of you, committers ought to enable this')
+
 
 # Compilation options
 
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850d..5dd29bed4cb 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -358,6 +358,7 @@ task:
           meson setup \
             --buildtype=debug \
             -Dcassert=true \
+            -Dnitpick=true \
             ${LINUX_MESON_FEATURES} \
             -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
             build
-- 
2.38.0

#294Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#293)
Re: run pgindent on a regular basis / scripted manner

Andres Freund <andres@anarazel.de> writes:

It turns out that updating the in-tree typedefs.list would be very noisy. On
my local linux system I get
1 file changed, 422 insertions(+), 1 deletion(-)
On a mac mini I get
1 file changed, 351 insertions(+), 1 deletion(-)

That seems like it needs a considerably closer look. What exactly
is getting added/deleted?

We could possibly address that by updating the in-tree typedefs.list a bit
more aggressively. Sure looks like the source systems are on the older side.

Really? Per [1]https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?show_list we've currently got contributions from calliphoridae
which is Debian sid, crake which is Fedora 38, indri/sifaka which are
macOS Sonoma. Were you really expecting something newer, and if so what?

But in the attached patch I've implemented this slightly differently. If the
tooling to do so is available, the indent-* targets explained above,
use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
which is the combination of a src/tools/pgindent/typedefs.list.local generated
for the local binaries/libraries and the source tree
src/tools/pgindent/typedefs.list.

Hmm ... that allows indenting your C files, but how do you get from that
to a non-noisy patch to commit to typedefs.list?

regards, tom lane

[1]: https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?show_list

#295Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#294)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-10-18 21:29:37 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

It turns out that updating the in-tree typedefs.list would be very noisy. On
my local linux system I get
1 file changed, 422 insertions(+), 1 deletion(-)
On a mac mini I get
1 file changed, 351 insertions(+), 1 deletion(-)

That seems like it needs a considerably closer look. What exactly
is getting added/deleted?

Types from bison, openssl, libxml, libxslt, icu and libc, at least. If I
enable LLVM, there are even more.

(I think I figured out what's happening further down)

We could possibly address that by updating the in-tree typedefs.list a bit
more aggressively. Sure looks like the source systems are on the older side.

Really? Per [1] we've currently got contributions from calliphoridae
which is Debian sid, crake which is Fedora 38, indri/sifaka which are
macOS Sonoma. Were you really expecting something newer, and if so what?

It's quite odd, I see plenty more types than those. I can't really explain why
they're not being picked up on those animals.

E.g. for me bison generated files contain typedefs like

typedef int_least8_t yytype_int8;
typedef signed char yytype_int8;
typedef yytype_int8 yy_state_t;

yet they don't show up in the buildfarm typedefs output. It's not a thing of
the binary, I checked that the symbols are present.

I have a hard time parsing the buildfarm code for generating the typedefs
file, tbh.

<stare>

Ah, I see. If I interpret that correctly, the code filters out symbols it
doesn't find in in some .[chly] file in the *source* directory. This code is,
uh, barely readable and massively underdocumented.

I guess I need to reimplement that :/. Don't immediately see how this could
be implemented for in-tree autoconf builds...

But in the attached patch I've implemented this slightly differently. If the
tooling to do so is available, the indent-* targets explained above,
use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
which is the combination of a src/tools/pgindent/typedefs.list.local generated
for the local binaries/libraries and the source tree
src/tools/pgindent/typedefs.list.

Hmm ... that allows indenting your C files, but how do you get from that
to a non-noisy patch to commit to typedefs.list?

It doesn't provide that on its own. Being able to painlessly indent the files
seems pretty worthwhile already. But clearly it'd much better if we can
automatically update typedefs.list.

Greetings,

Andres Freund

#296Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#295)
4 attachment(s)
Re: run pgindent on a regular basis / scripted manner

Hi,

On 2023-10-18 19:18:13 -0700, Andres Freund wrote:

On 2023-10-18 21:29:37 -0400, Tom Lane wrote:
Ah, I see. If I interpret that correctly, the code filters out symbols it
doesn't find in in some .[chly] file in the *source* directory. This code is,
uh, barely readable and massively underdocumented.

I guess I need to reimplement that :/. Don't immediately see how this could
be implemented for in-tree autoconf builds...

But in the attached patch I've implemented this slightly differently. If the
tooling to do so is available, the indent-* targets explained above,
use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
which is the combination of a src/tools/pgindent/typedefs.list.local generated
for the local binaries/libraries and the source tree
src/tools/pgindent/typedefs.list.

Hmm ... that allows indenting your C files, but how do you get from that
to a non-noisy patch to commit to typedefs.list?

It doesn't provide that on its own. Being able to painlessly indent the files
seems pretty worthwhile already. But clearly it'd much better if we can
automatically update typedefs.list.

With code for that added, things seem to work quite nicely. I added similar
logic to the buildfarm code that builds a list of all tokens in the source
code.

With that, the in-tree typedefs.list can be updated with new tokens found
locally *and* typdefs that aren't used anymore can be removed from the in-tree
typedefs.list (detected by no matching tokens found in the source code).

The only case this approach can't handle is newly referenced typedefs in code
that isn't built locally - which I think isn't particularly common and seems
somewhat fundamental. In those cases typedefs.list still can be updated
manually (and the sorting will still be "fixed" if necessary).

The code is still in a somewhat rough shape and I'll not finish polishing it
today. I've attached the code anyway, don't be too rough :).

The changes from running "ninja update-typedefs indent-tree" on debian and
macos are attached as 0004 - the set of changes looks quite correct to me.

The buildfarm code filtered out a few typedefs manually:
push(@badsyms, 'date', 'interval', 'timestamp', 'ANY');
but I don't really see why? Possibly that was needed with an older
pg_bsd_indent to prevent odd stuff?

Right now building a new unified typedefs.list and copying it to the source
tree are still separate targets, but that probably makes less sense now? Or
perhaps it should be copied to the source tree when reindenting?

I've only handled linux and macos in the typedefs gathering code. But the
remaining OSs should be "just a bit of work" [TM].

Greetings,

Andres Freund

Attachments:

v4-0001-Add-meson-targets-to-run-pgindent.patchtext/x-diff; charset=us-asciiDownload
From f7520fd87764cddbde9d6c5d25fdfe5314ec5cbc Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Sat, 27 May 2023 11:38:27 -0700
Subject: [PATCH v4 1/4] Add meson targets to run pgindent

Author:
Reviewed-By:
Discussion: https://postgr.es/m/
Backpatch:
---
 meson.build                 | 38 +++++++++++++++++++++++++++++++++++++
 src/tools/pgindent/pgindent | 12 ++++++++++--
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/meson.build b/meson.build
index 862c955453f..2cfeb60eb07 100644
--- a/meson.build
+++ b/meson.build
@@ -3013,6 +3013,44 @@ run_target('install-test-files',
 
 
 
+###############################################################
+# Indentation and similar targets
+###############################################################
+
+indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
+    '--indent', pg_bsd_indent.full_path(),
+    '--sourcetree=@SOURCE_ROOT@']
+indent_depend = [pg_bsd_indent]
+
+# reindent the entire tree
+# Reindent the entire tree
+run_target('indent-tree',
+  command: indent_base_cmd + ['.'],
+  depends: indent_depend
+)
+
+# Reindent the last commit
+run_target('indent-head',
+  command: indent_base_cmd + ['--commit=HEAD'],
+  depends: indent_depend
+)
+
+# Silently check if any file is not corectly indented (fails the build if
+# there are differences)
+run_target('indent-check',
+  command: indent_base_cmd + ['--silent-diff', '.'],
+  depends: indent_depend
+)
+
+# Show a diff of all the unindented files (doesn't fail the build if there are
+# differences)
+run_target('indent-diff',
+  command: indent_base_cmd + ['--show-diff', '.'],
+  depends: indent_depend
+)
+
+
+
 ###############################################################
 # Test prep
 ###############################################################
diff --git a/src/tools/pgindent/pgindent b/src/tools/pgindent/pgindent
index bce63d95daf..08b0c95814f 100755
--- a/src/tools/pgindent/pgindent
+++ b/src/tools/pgindent/pgindent
@@ -23,7 +23,7 @@ my $devnull = File::Spec->devnull;
 
 my ($typedefs_file, $typedef_str, @excludes,
 	$indent, $build, $show_diff,
-	$silent_diff, $help, @commits,);
+	$silent_diff, $sourcetree, $help, @commits,);
 
 $help = 0;
 
@@ -35,7 +35,8 @@ my %options = (
 	"excludes=s" => \@excludes,
 	"indent=s" => \$indent,
 	"show-diff" => \$show_diff,
-	"silent-diff" => \$silent_diff,);
+	"silent-diff" => \$silent_diff,
+	"sourcetree=s" => \$sourcetree);
 GetOptions(%options) || usage("bad command line argument");
 
 usage() if $help;
@@ -53,6 +54,13 @@ $typedefs_file ||= $ENV{PGTYPEDEFS};
 # get indent location for environment or default
 $indent ||= $ENV{PGINDENT} || $ENV{INDENT} || "pg_bsd_indent";
 
+# When invoked from the build directory, change into source tree, otherwise
+# the heuristics in locate_sourcedir() don't work.
+if (defined $sourcetree)
+{
+	chdir $sourcetree;
+}
+
 my $sourcedir = locate_sourcedir();
 
 # if it's the base of a postgres tree, we will exclude the files
-- 
2.38.0

v4-0002-heavily-wip-update-typedefs.patchtext/x-diff; charset=us-asciiDownload
From dfe5ad9229e812d0f805d6a39cb296b9284e30f2 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 18 Oct 2023 16:18:58 -0700
Subject: [PATCH v4 2/4] heavily wip: update typedefs

---
 meson.build                           | 14 ++++-
 src/backend/jit/llvm/meson.build      |  2 +-
 src/backend/snowball/meson.build      |  2 +-
 src/timezone/meson.build              |  2 +-
 src/tools/pgindent/merge_typedefs     | 86 +++++++++++++++++++++++++++
 src/tools/pgindent/meson.build        | 69 +++++++++++++++++++++
 src/tools/pgindent/typedefs_from_objs | 42 +++++++++++++
 7 files changed, 213 insertions(+), 4 deletions(-)
 create mode 100755 src/tools/pgindent/merge_typedefs
 create mode 100644 src/tools/pgindent/meson.build
 create mode 100755 src/tools/pgindent/typedefs_from_objs

diff --git a/meson.build b/meson.build
index 2cfeb60eb07..492ce9a48b1 100644
--- a/meson.build
+++ b/meson.build
@@ -2579,6 +2579,7 @@ add_project_link_arguments(ldflags, language: ['c', 'cpp'])
 
 # list of targets for various alias targets
 backend_targets = []
+data_targets = []
 bin_targets = []
 pl_targets = []
 contrib_targets = []
@@ -2894,6 +2895,8 @@ subdir('src/interfaces/ecpg/test')
 
 subdir('doc/src/sgml')
 
+subdir('src/tools/pgindent')
+
 generated_sources_ac += {'': ['GNUmakefile']}
 
 # After processing src/test, add test_install_libs to the testprep_targets
@@ -2982,6 +2985,7 @@ endif
 all_built = [
   backend_targets,
   bin_targets,
+  data_targets,
   libpq_st,
   pl_targets,
   contrib_targets,
@@ -3017,12 +3021,20 @@ run_target('install-test-files',
 # Indentation and similar targets
 ###############################################################
 
+# If the dependencies for generating a local typedefs.list are fulfilled, we
+# use a combination of a locally built and the source tree's typededefs.list
+# file (see src/tools/pgindent/meson.build) for reindenting. That ensures
+# newly added typedefs are indented correctly.
 indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
     '--indent', pg_bsd_indent.full_path(),
     '--sourcetree=@SOURCE_ROOT@']
 indent_depend = [pg_bsd_indent]
 
-# reindent the entire tree
+if typedefs_supported
+  indent_base_cmd += ['--typedefs', typedefs_merged.full_path()]
+  indent_depend += typedefs
+endif
+
 # Reindent the entire tree
 run_target('indent-tree',
   command: indent_base_cmd + ['.'],
diff --git a/src/backend/jit/llvm/meson.build b/src/backend/jit/llvm/meson.build
index 8ffaf414609..7e2f6d5bd5c 100644
--- a/src/backend/jit/llvm/meson.build
+++ b/src/backend/jit/llvm/meson.build
@@ -82,4 +82,4 @@ llvmjit_types = custom_target('llvmjit_types.bc',
   install_dir: dir_lib_pkg,
   depfile: '@BASENAME@.c.bc.d',
 )
-backend_targets += llvmjit_types
+data_targets += llvmjit_types
diff --git a/src/backend/snowball/meson.build b/src/backend/snowball/meson.build
index 0f669c0bf3c..69e777399fd 100644
--- a/src/backend/snowball/meson.build
+++ b/src/backend/snowball/meson.build
@@ -94,4 +94,4 @@ install_subdir('stopwords',
 )
 
 backend_targets += dict_snowball
-backend_targets += snowball_create
+data_targets += snowball_create
diff --git a/src/timezone/meson.build b/src/timezone/meson.build
index 7b85a01c6bd..779a51d85a4 100644
--- a/src/timezone/meson.build
+++ b/src/timezone/meson.build
@@ -50,7 +50,7 @@ if get_option('system_tzdata') == ''
     install_dir: dir_data,
   )
 
-  bin_targets += tzdata
+  data_targets += tzdata
 endif
 
 subdir('tznames')
diff --git a/src/tools/pgindent/merge_typedefs b/src/tools/pgindent/merge_typedefs
new file mode 100755
index 00000000000..9826fd1a675
--- /dev/null
+++ b/src/tools/pgindent/merge_typedefs
@@ -0,0 +1,86 @@
+#!/usr/bin/env python3
+
+# Tool to build a new typedefs.list file from
+# a) the current typedefs.list file in the source tree
+# b) newly generated typedefs.list fragments for build products
+# c) the source code
+#
+# To avoid littering the typedefs file with typedefs that are not in our source
+# code, the list of typedefs found is filtered with tokens found in source
+# files.
+#
+# If the local build references a typedef that's not present in the source
+# typedefs.list, we can discover such typedefs.
+#
+# If the source typedefs.list references a typedef that is not used anymore, it
+# will not appear in the source tree anymore and thus can be filtered out.
+#
+# However, typedefs in newly written code for a different platform (or a
+# disabled build option), will not be discovered.
+
+import argparse
+import os
+import subprocess
+import sys
+import re
+
+parser = argparse.ArgumentParser(
+    description='build new typedefs files from multiple inputs')
+
+parser.add_argument('--output-local', type=argparse.FileType('w'), required=True)
+parser.add_argument('--output-merged', type=argparse.FileType('w'), required=True)
+parser.add_argument('--current-typedefs', type=argparse.FileType('r'), required=True)
+parser.add_argument('--filter-source', type=str, required=False)
+parser.add_argument('input', type=argparse.FileType('r'), nargs='*')
+
+
+def merge_typedefs(files):
+    typedefs = set()
+    for input in files:
+        for typedef in input.readlines():
+            typedefs.add(typedef.strip())
+    return typedefs
+
+def source_list(srcdir):
+    filelist = []
+    os.chdir(srcdir)
+    accepted_files = re.compile(r'\.(c|h|l|y|.cpp)$')
+    for root, dirs, files in os.walk('.', topdown=True):
+        # don't go into hidden dirs or docs
+        for d in dirs:
+            if d.startswith('.') or d == 'doc':
+                dirs.remove(d)
+        for f in files:
+            if accepted_files.search(f):
+                filelist.append((os.path.join(root, f)))
+    return filelist
+
+def tokenize_sources(files):
+    comment_sub_re = re.compile(r'/\*.*?\*/', flags=re.MULTILINE|re.DOTALL)
+    token_find_re = re.compile(r'\b\w+\b')
+
+    tokens = set()
+    for file in files:
+        with open(file, 'r') as fh:
+            content = fh.read()
+            content = comment_sub_re.sub('', content)
+            tokens.update(token_find_re.findall(content))
+    return tokens
+
+def main(args):
+    local_typedefs = merge_typedefs(args.input)
+    current_typedefs = merge_typedefs([args.current_typedefs])
+
+    if args.filter_source:
+        filelist = source_list(args.filter_source)
+        tokens = tokenize_sources(filelist)
+        local_typedefs.intersection_update(tokens)
+        current_typedefs.intersection_update(tokens)
+
+    current_typedefs.update(local_typedefs)
+    print('\n'.join(sorted(local_typedefs)), file=args.output_local)
+    print('\n'.join(sorted(current_typedefs)), file=args.output_merged)
+
+if __name__ == '__main__':
+    main(parser.parse_args())
+    sys.exit(0)
diff --git a/src/tools/pgindent/meson.build b/src/tools/pgindent/meson.build
new file mode 100644
index 00000000000..de02c676a46
--- /dev/null
+++ b/src/tools/pgindent/meson.build
@@ -0,0 +1,69 @@
+# Currently the code requires meson 0.63 or upwards. This could likely be
+# lifted (in fact, some older versions actually work, but only some), but for
+# now it's not crucial.
+typedefs_supported = meson.version().version_compare('>=0.63')
+if not typedefs_supported
+  subdir_done()
+endif
+
+typedefs_from_objs = files('typedefs_from_objs')
+typedefs_from_objs_cmd = [typedefs_from_objs, '--host', host_system, '--output', '@OUTPUT@', '@INPUT@']
+merge_typedefs = files('merge_typedefs')
+
+# XXX: This list of targets should likely not be maintained here
+typedef_src_tgts = [
+  backend_targets,
+  bin_targets,
+  [libpq_st],
+  pl_targets,
+  contrib_targets,
+  ecpg_targets,
+]
+
+# We generate partial typedefs files for each binary/library. That makes using
+# this during incremental development much faster.
+#
+# The reason we process the object files instead of executables is that that
+# doesn't work on macos. There doesn't seem to be a reason to target
+# executables directly on other platforms.
+typedef_tgts = []
+foreach tgts : typedef_src_tgts
+  foreach tgt : tgts
+    # can't use tgt.name(), as we have have targets that differ just in suffix
+    name = fs.name(tgt.full_path())
+    tdname = 'typedefs.list.local-@0@'.format(name)
+    objs = tgt.extract_all_objects(recursive: true)
+    typedef_tgts += custom_target(tdname,
+                                  output: tdname,
+                                  command: typedefs_from_objs_cmd,
+                                  input: objs,
+                                  build_by_default: false,
+                                  )
+
+  endforeach
+endforeach
+
+
+# Build new typedefs.list files. typedefs.list.local will contain only
+# typedefs in the local build, whereas typedefs.list.merged is the combination
+# of typedefs.list.local and the source tree's typedefs.list. However,
+# typedefs that are not used in the code anymore are removed even from
+# typedefs.list.merged.
+typedefs = custom_target('typedefs.list', output: ['typedefs.list.local', 'typedefs.list.merged'],
+  command: [merge_typedefs,
+            '--output-local', '@OUTPUT0@',
+            '--output-merged', '@OUTPUT1@',
+            '--current-typedefs', files('typedefs.list'),
+            '--filter-source', '@SOURCE_ROOT@',
+            '@INPUT@'],
+  input: typedef_tgts,
+  build_by_default: false,
+)
+typedefs_local = typedefs[0]
+typedefs_merged = typedefs[1]
+
+if cp.found()
+  run_target('update-typedefs',
+    command: [cp, typedefs_merged, '@SOURCE_ROOT@/src/tools/pgindent/typedefs.list'],
+  )
+endif
diff --git a/src/tools/pgindent/typedefs_from_objs b/src/tools/pgindent/typedefs_from_objs
new file mode 100755
index 00000000000..821c84cb3d8
--- /dev/null
+++ b/src/tools/pgindent/typedefs_from_objs
@@ -0,0 +1,42 @@
+#!/usr/bin/env python3
+
+# Tool to extract typedefs from object files.
+
+import argparse
+import re
+import shutil
+import subprocess
+import sys
+
+parser = argparse.ArgumentParser(
+    description='generate typedefs file for a set of object files')
+
+parser.add_argument('--output', type=argparse.FileType('w'), required=True)
+parser.add_argument('--host', type=str, required=True)
+parser.add_argument('input', nargs='*')
+
+args = parser.parse_args()
+
+if args.host == 'linux':
+    find_td_re = re.compile(r'^[^\n]+TAG_typedef\)\n[^\n]+DW_AT_name\s*:\s*\([^\)]+\): ([^\n]+)$', re.MULTILINE)
+    # FIXME: should probably be set by the caller? Except that the same binary
+    # name behaves very differently on different platforms :/
+    cmd = [shutil.which('objdump'), '-Wi']
+elif args.host == 'darwin':
+    find_td_re = re.compile(r'^[^\n]+TAG_typedef\n\s*DW_AT_type[^\n]+\n\s+DW_AT_name\s*\(\"([^\n]+)\"\)$', re.MULTILINE)
+    cmd = [shutil.which('dwarfdump')]
+else:
+    raise f'unsupported platform: {args.host}'
+
+lcmd = cmd + args.input
+sp = subprocess.run(lcmd, stdout=subprocess.PIPE, universal_newlines=True)
+if sp.returncode != 0:
+    print(f'{lcmd} failed with return code {sp.returncode}', file=sys.stderr)
+    sys.exit(sp.returncode)
+
+fa = find_td_re.findall(sp.stdout)
+
+for typedef in fa:
+    print(typedef, file=args.output)
+
+sys.exit(0)
-- 
2.38.0

v4-0003-WIP-meson-nitpick-mode-test-checking-indentation.patchtext/x-diff; charset=us-asciiDownload
From 017183e75c7a633dedabdbf57aa1ac80ee955f93 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 18 Oct 2023 21:40:36 -0700
Subject: [PATCH v4 3/4] WIP: meson: nitpick mode + test checking indentation

Author:
Reviewed-by:
Discussion: https://postgr.es/m/
Backpatch:
---
 meson.build       | 20 +++++++++++++++++---
 meson_options.txt |  3 +++
 .cirrus.tasks.yml |  1 +
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/meson.build b/meson.build
index 492ce9a48b1..854b7ecf56c 100644
--- a/meson.build
+++ b/meson.build
@@ -44,6 +44,7 @@ cc = meson.get_compiler('c')
 not_found_dep = dependency('', required: false)
 thread_dep = dependency('threads')
 auto_features = get_option('auto_features')
+nitpick_mode = get_option('nitpick')
 
 
 
@@ -3025,16 +3026,18 @@ run_target('install-test-files',
 # use a combination of a locally built and the source tree's typededefs.list
 # file (see src/tools/pgindent/meson.build) for reindenting. That ensures
 # newly added typedefs are indented correctly.
-indent_base_cmd = [perl, files('src/tools/pgindent/pgindent'),
+indent_base_args = [files('src/tools/pgindent/pgindent'),
     '--indent', pg_bsd_indent.full_path(),
-    '--sourcetree=@SOURCE_ROOT@']
+    '--sourcetree=@0@'.format(meson.current_source_dir())]
 indent_depend = [pg_bsd_indent]
 
 if typedefs_supported
-  indent_base_cmd += ['--typedefs', typedefs_merged.full_path()]
+  indent_base_args += ['--typedefs', typedefs_merged.full_path()]
   indent_depend += typedefs
 endif
 
+indent_base_cmd = [perl, indent_base_args]
+
 # Reindent the entire tree
 run_target('indent-tree',
   command: indent_base_cmd + ['.'],
@@ -3364,6 +3367,17 @@ if meson.version().version_compare('>=0.57')
 endif
 
 
+# If we were asked to be nitpicky, add a test that fails if sources aren't
+# properly indented
+if nitpick_mode
+  test('indent-check',
+    perl, args: [indent_base_args, '--silent-diff', '.'],
+    depends: indent_depend,
+    suite: 'source',
+    priority: 10)
+endif
+
+
 
 ###############################################################
 # Pseudo targets
diff --git a/meson_options.txt b/meson_options.txt
index d2f95cfec36..90cb2177a44 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -52,6 +52,9 @@ option('atomics', type: 'boolean', value: true,
 option('spinlocks', type: 'boolean', value: true,
   description: 'Use spinlocks')
 
+option('nitpick', type: 'boolean', value: false,
+  description: 'annoy the hell out of you, committers ought to enable this')
+
 
 # Compilation options
 
diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850d..5dd29bed4cb 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -358,6 +358,7 @@ task:
           meson setup \
             --buildtype=debug \
             -Dcassert=true \
+            -Dnitpick=true \
             ${LINUX_MESON_FEATURES} \
             -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \
             build
-- 
2.38.0

v4-0004-Update-typedefs-reindent-using-the-new-build-targ.patchtext/x-diff; charset=us-asciiDownload
From 7e06ca28697b365561371b2843d88c4caaccd347 Mon Sep 17 00:00:00 2001
From: Andres Freund <andres@anarazel.de>
Date: Wed, 18 Oct 2023 21:31:22 -0700
Subject: [PATCH v4 4/4] Update typedefs, reindent using the new build targets

From debian sid and macos ventura.
---
 contrib/sepgsql/hooks.c          |  2 +-
 contrib/sepgsql/label.c          |  2 +-
 contrib/sepgsql/uavc.c           |  2 +-
 src/tools/pgindent/typedefs.list | 14 +++++++++++---
 4 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/contrib/sepgsql/hooks.c b/contrib/sepgsql/hooks.c
index fa734763602..443e30be994 100644
--- a/contrib/sepgsql/hooks.c
+++ b/contrib/sepgsql/hooks.c
@@ -50,7 +50,7 @@ typedef struct
 	 * command. Elsewhere (including the case of default) NULL.
 	 */
 	const char *createdb_dtemplate;
-}			sepgsql_context_info_t;
+} sepgsql_context_info_t;
 
 static sepgsql_context_info_t sepgsql_context_info;
 
diff --git a/contrib/sepgsql/label.c b/contrib/sepgsql/label.c
index 38ff4068ff2..e13640dc4d9 100644
--- a/contrib/sepgsql/label.c
+++ b/contrib/sepgsql/label.c
@@ -67,7 +67,7 @@ typedef struct
 {
 	SubTransactionId subid;
 	char	   *label;
-}			pending_label;
+} pending_label;
 
 /*
  * sepgsql_get_client_label
diff --git a/contrib/sepgsql/uavc.c b/contrib/sepgsql/uavc.c
index 6e3a892da24..6f63a19aa96 100644
--- a/contrib/sepgsql/uavc.c
+++ b/contrib/sepgsql/uavc.c
@@ -44,7 +44,7 @@ typedef struct
 	/* true, if tcontext is valid */
 	char	   *ncontext;		/* temporary scontext on execution of trusted
 								 * procedure, or NULL elsewhere */
-}			avc_cache;
+} avc_cache;
 
 /*
  * Declaration of static variables
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e69bb671bf8..d6178386252 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2,6 +2,7 @@ ACCESS_ALLOWED_ACE
 ACL
 ACL_SIZE_INFORMATION
 AFFIX
+ANY
 ASN1_INTEGER
 ASN1_OBJECT
 ASN1_OCTET_STRING
@@ -1275,9 +1276,9 @@ JsonManifestWALRangeField
 JsonObjectAgg
 JsonObjectConstructor
 JsonOutput
-JsonParseExpr
 JsonParseContext
 JsonParseErrorType
+JsonParseExpr
 JsonPath
 JsonPathBool
 JsonPathExecContext
@@ -1913,7 +1914,6 @@ ParallelHashJoinBatch
 ParallelHashJoinBatchAccessor
 ParallelHashJoinState
 ParallelIndexScanDesc
-ParallelReadyList
 ParallelSlot
 ParallelSlotArray
 ParallelSlotResultHandler
@@ -3135,6 +3135,7 @@ _ino_t
 _locale_t
 _resultmap
 _stringlist
+access_vector_t
 acquireLocksOnSubLinks_context
 add_nulling_relids_context
 adjust_appendrel_attrs_context
@@ -3167,6 +3168,7 @@ assign_collations_context
 auth_password_hook_typ
 autovac_table
 av_relation
+avc_cache
 avl_dbase
 avl_node
 avl_tree
@@ -3247,6 +3249,7 @@ crosstab_HashEnt
 crosstab_cat_desc
 datapagemap_iterator_t
 datapagemap_t
+date
 dateKEY
 datetkn
 dce_uuid_t
@@ -3402,6 +3405,7 @@ indexed_tlist
 inet
 inetKEY
 inet_struct
+initRowMethod
 init_function
 inline_cte_walker_context
 inline_error_callback_arg
@@ -3418,6 +3422,7 @@ int64
 int64KEY
 int8
 internalPQconninfoOption
+interval
 intptr_t
 intset_internal_node
 intset_leaf_node
@@ -3519,6 +3524,7 @@ parse_error_callback_arg
 parser_context
 partition_method_t
 pendingPosition
+pending_label
 pgParameterStatus
 pg_atomic_flag
 pg_atomic_uint32
@@ -3710,7 +3716,9 @@ saophash_hash
 save_buffer
 scram_state
 scram_state_enum
+security_class_t
 sem_t
+sepgsql_context_info_t
 sequence_magic
 set_join_pathlist_hook_type
 set_rel_pathlist_hook_type
@@ -3792,6 +3800,7 @@ time_t
 timeout_handler_proc
 timeout_params
 timerCA
+timestamp
 tlist_vinfo
 toast_compress_header
 tokenize_error_callback_arg
@@ -3869,7 +3878,6 @@ wchar2mb_with_len_converter
 wchar_t
 win32_deadchild_waitinfo
 wint_t
-worker_spi_state
 worker_state
 worktable
 wrap
-- 
2.38.0

#297Jeff Davis
pgsql@j-davis.com
In reply to: David Rowley (#288)
Re: run pgindent on a regular basis / scripted manner

On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:

It would be good to learn how many of the committers out of the ones
you listed that --enable-indent-checks would have saved from breaking
koel.

I'd find that a useful option.

Regards,
Jeff Davis

#298Amit Langote
amitlangote09@gmail.com
In reply to: Jeff Davis (#297)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 24, 2023 at 9:51 Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:

It would be good to learn how many of the committers out of the ones
you listed that --enable-indent-checks would have saved from breaking
koel.

I'd find that a useful option.

+1. While I’ve made it part of routine to keep my local work pgindented
since breaking Joel once, an option like this would still be useful.

Show quoted text
#299Michael Paquier
michael@paquier.xyz
In reply to: Amit Langote (#298)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 24, 2023 at 10:16:55AM +0900, Amit Langote wrote:

+1. While I’ve made it part of routine to keep my local work pgindented
since breaking Joel once, an option like this would still be useful.

I'd be OK with an option like that. It is one of these things to type
once in a script, then forget about it.
--
Michael

#300Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#262)
Re: run pgindent on a regular basis / scripted manner

On 2023-10-17 Tu 09:52, Robert Haas wrote:

On Tue, Oct 17, 2023 at 6:34 AM Jelte Fennema <postgres@jeltef.nl> wrote:

I think *it is* dead easy to comply. If you run the following commands
before committing/after rebasing, then koel should always be happy:

src/tools/pgindent/pgindent src # works always but a bit slow
src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
# much faster, but only works if you DID NOT change typedefs.list

In isolation, that's true, but the list of mistakes that you can make
while committing which will inconvenience everyone working on the
project is very long. Another one that comes up frequently is
forgetting to bump CATALOG_VERSION_NO, but you also need a good commit
message, and good comments, and a good Discussion link in the commit
message, and the right list of authors and reviewers, and to update
the docs (with spaces, not tabs) and the Makefiles (with tabs, not
spaces) and the meson stuff and, as if that weren't enough already,
you actually need the code to work! And that includes not only working
regularly but also with CLOBBER_CACHE_ALWAYS and debug_parallel_query
and so on. It's very easy to miss something somewhere. I put a LOT of
work into polishing my commits before I push them, and it's still not
that uncommon that I screw something up.

Yes, there's a lot to look out for, and you're a damn sight better at it
than I am. But we should try to automate the things that can be
automated, even if that leaves many tasks that can't be. I have three
things in my pre-commit hook: a check for catalog updates, a check for
new typedefs, and an indent check. And every one of them has saved me
from doing things I should not be doing. They aren't perfect but they
are useful.

Slightly off topic, but apropos your message, maybe we should recommend
a standard git commit template.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#301Bruce Momjian
bruce@momjian.us
In reply to: Andrew Dunstan (#300)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 24, 2023 at 09:40:01AM -0400, Andrew Dunstan wrote:

Slightly off topic, but apropos your message, maybe we should recommend a
standard git commit template.

I use this and then automatically remove any sections that are empty.

---------------------------------------------------------------------------

|--- gitweb subject length limit ----------------|-email limit-|

Reported-by:

Diagnosed-by:

Bug:

Discussion:

Author:

Reviewed-by:

Tested-by:

Backpatch-through:

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

#302Andrew Dunstan
andrew@dunslane.net
In reply to: Jelte Fennema (#287)
Re: run pgindent on a regular basis / scripted manner

On 2023-10-18 We 05:07, Jelte Fennema wrote:

I think --enable-indent-checks sounds like a good improvement to the
status quo. But I'm not confident that it will help remove the cases
where only a comment needs to be re-indented. Do commiters really
always run check-world again when only changing a typo in a comment? I
know I probably wouldn't (or at least not always).

Yeah. In fact I'm betting that a lot of the offending commits we've seen
come into this category. You build, you check, then you do some final
polish. That's where a pre-commit hook can save you.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#303Amit Kapila
amit.kapila16@gmail.com
In reply to: Jeff Davis (#297)
Re: run pgindent on a regular basis / scripted manner

On Tue, Oct 24, 2023 at 6:21 AM Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:

It would be good to learn how many of the committers out of the ones
you listed that --enable-indent-checks would have saved from breaking
koel.

I'd find that a useful option.

+1.

--
With Regards,
Amit Kapila.

#304Étienne BERSAC
etienne.bersac@dalibo.com
In reply to: Andrew Dunstan (#300)
Re: run pgindent on a regular basis / scripted manner

Hello,

Yes, there's a lot to look out for, and you're a damn sight better at
it
than I am. But we should try to automate the things that can be
automated, even if that leaves many tasks that can't be. I have three
things in my pre-commit hook: a check for catalog updates, a check
for
new typedefs, and an indent check.

Could you share your configuration ? Could we provide more helper and
integration to help produce consistent code ?

For logfmt extension, I configured clang-formatd so that Emacs format
the buffer on save. Any editor running clangd will use this. This is
ease my mind about formatting. I need to investigate how to use
pgindent instead or at lease ensure clang-format produce same output as
pgindent.

https://gitlab.com/dalibo/logfmt/-/blob/0d808b368e649b23ac06ce2657354b67be398b21/.clang-format

Automate nitpicking in CI is good, but checking locally before sending
the patch will save way more round-trip.

Regards,
Étienne

#305Andrew Dunstan
andrew@dunslane.net
In reply to: Étienne BERSAC (#304)
1 attachment(s)
Re: run pgindent on a regular basis / scripted manner

On 2023-10-27 Fr 03:14, Étienne BERSAC wrote:

Hello,

Yes, there's a lot to look out for, and you're a damn sight better at
it
than I am. But we should try to automate the things that can be
automated, even if that leaves many tasks that can't be. I have three
things in my pre-commit hook: a check for catalog updates, a check
for
new typedefs, and an indent check.

Could you share your configuration ? Could we provide more helper and
integration to help produce consistent code ?

Sure. pre-commit hook file attached. I'm sure this could be improved on.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachments:

pre-committext/plain; charset=UTF-8; name=pre-commitDownload
#306Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#238)
Re: run pgindent on a regular basis / scripted manner

On 2023-08-12 Sa 11:57, Andrew Dunstan wrote:

On 2023-08-11 Fr 19:17, Tom Lane wrote:

Peter Geoghegan<pg@bowt.ie> writes:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

FWIW, I'm much more optimistic than that. I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that. If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.

Yeah, part of the point of creating koel was to give committers a bit
of a nudge in that direction.

With a git pre-commit hook it's pretty painless.

Based on recent experience, where a lot koel's recent complaints seem to
be about comments, I'd like to suggest a modest adjustment.

First, we should provide a mode of pgindent that doesn't reflow
comments. pg_bsd_indent has a flag for this (-nfcb), so this should be
relatively simple.  Second, koel could use that mode, so that it
wouldn't complain about comments it thinks need to be reflowed. Of
course, we'd fix these up with our regular pgindent runs.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#307Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#306)
Re: run pgindent on a regular basis / scripted manner

Andrew Dunstan <andrew@dunslane.net> writes:

Based on recent experience, where a lot koel's recent complaints seem to
be about comments, I'd like to suggest a modest adjustment.

First, we should provide a mode of pgindent that doesn't reflow
comments. pg_bsd_indent has a flag for this (-nfcb), so this should be
relatively simple.  Second, koel could use that mode, so that it
wouldn't complain about comments it thinks need to be reflowed. Of
course, we'd fix these up with our regular pgindent runs.

Seems like a bit of a kluge. Maybe it's the right thing to do, but
I don't think we have enough data points yet to be confident that
it'd meaningfully reduce the number of breakages.

On a more abstract level: the point of trying to maintain indent
cleanliness is so that if you modify a file and then want to run
pgindent on your own changes, you don't get incidental changes
elsewhere in the file. This solution would break that, so I'm
not sure it isn't throwing the baby out with the bathwater.

regards, tom lane

#308Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#307)
Re: run pgindent on a regular basis / scripted manner

On 2023-10-28 Sa 12:09, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

Based on recent experience, where a lot koel's recent complaints seem to
be about comments, I'd like to suggest a modest adjustment.
First, we should provide a mode of pgindent that doesn't reflow
comments. pg_bsd_indent has a flag for this (-nfcb), so this should be
relatively simple.  Second, koel could use that mode, so that it
wouldn't complain about comments it thinks need to be reflowed. Of
course, we'd fix these up with our regular pgindent runs.

Seems like a bit of a kluge. Maybe it's the right thing to do, but
I don't think we have enough data points yet to be confident that
it'd meaningfully reduce the number of breakages.

On a more abstract level: the point of trying to maintain indent
cleanliness is so that if you modify a file and then want to run
pgindent on your own changes, you don't get incidental changes
elsewhere in the file. This solution would break that, so I'm
not sure it isn't throwing the baby out with the bathwater.

Yeah, could be.

cheers

andrew.

--
Andrew Dunstan
EDB: https://www.enterprisedb.com