Preliminary results for proposed new pgindent implementation

Started by Tom Lanealmost 9 years ago50 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Over in this thread:
/messages/by-id/E1dAmxK-0006EE-1r@gemulon.postgresql.org
we've been discussing switching to FreeBSD's version of "indent",
specifically the updated version that Piotr Stefaniak is working on,
as the basis for pgindent. There seem to be a small number of bugs
that Piotr needs to fix, but overall it is looking remarkably good
to me.

To create a diff with not too much noise in it, I applied the hacks
(fixes?) already mentioned in the other thread, and I tweaked things
slightly to suppress changes in alignment of comments on #else and
#endif lines. We might well want to reconsider that later, because it's
quite random right now (basically, #else comments are indented to column
33, #endif comments are indented to column 10, and for no reason at all
the latter do not have standard tab substitution). But again, I'm trying
to drill down to the changes we want rather than the incidental ones.

At this point, the changes look pretty good, although still bulky.
I've attached a diff between current HEAD and re-indented HEAD
for anybody who wants to peruse all the details, but basically
what I'm seeing is:

* Improvements in formatting around sizeof and related constructs,
for example:

-       *phoned_word = palloc(sizeof(char) * strlen(word) +1);
+       *phoned_word = palloc(sizeof(char) * strlen(word) + 1);

* Likewise, operators after casts work better than before:

        res = shm_mq_send_bytes(mqh, sizeof(Size) - mqh->mqh_partial_bytes,
-                               ((char *) &nbytes) +mqh->mqh_partial_bytes,
+                               ((char *) &nbytes) + mqh->mqh_partial_bytes,
                                nowait, &bytes_written);

* Sane formatting of function typedefs, for example:

     * use buf as work area if NULL in-place copy
     */
    int         (*pull) (void *priv, PullFilter *src, int len,
-                                    uint8 **data_p, uint8 *buf, int buflen);
+                        uint8 **data_p, uint8 *buf, int buflen);
    void        (*free) (void *priv);
 };

* Non-typedef struct pointers are now formatted consistently, for example:

-mdcbuf_finish(struct MDCBufData * st)
+mdcbuf_finish(struct MDCBufData *st)

* Better handling of pointers with const/volatile qualifiers, for example:

-static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile * ptr,
+static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC *volatile *ptr,

* Better handling of PG_USED_FOR_ASSERTS_ONLY declarations, for example

    BlockNumber blkno;
    Buffer      buf;
-   Bucket new_bucket PG_USED_FOR_ASSERTS_ONLY = InvalidBucket;
+   Bucket      new_bucket PG_USED_FOR_ASSERTS_ONLY = InvalidBucket;
    bool        bucket_dirty = false;

* Corner cases where no space was left before a comment are fixed:

@@ -149,12 +149,12 @@ typedef struct RewriteStateData
    bool        rs_logical_rewrite;     /* do we need to do logical rewriting */
    TransactionId rs_oldest_xmin;       /* oldest xmin used by caller to
                                         * determine tuple visibility */
-   TransactionId rs_freeze_xid;/* Xid that will be used as freeze cutoff
-                                * point */
+   TransactionId rs_freeze_xid;    /* Xid that will be used as freeze cutoff
+                                    * point */
    TransactionId rs_logical_xmin;      /* Xid that will be used as cutoff
                                         * point for logical rewrites */
-   MultiXactId rs_cutoff_multi;/* MultiXactId that will be used as cutoff
-                                * point for multixacts */
+   MultiXactId rs_cutoff_multi;    /* MultiXactId that will be used as cutoff
+                                    * point for multixacts */
    MemoryContext rs_cxt;       /* for hash tables and entries and tuples in
                                 * them */
    XLogRecPtr  rs_begin_lsn;   /* XLogInsertLsn when starting the rewrite */

There are also changes that are not really wins, just different, but they
do arguably improve consistency. One is that comments following "else"
and "case" are now always indented at least as far as column 33 (or
indent's -c parameter), whereas the old code would sometimes let them
slide to the left of that:

@@ -723,7 +723,7 @@ dblink_record_internal(FunctionCallInfo fcinfo, bool is_async)
                /* shouldn't happen */
                elog(ERROR, "wrong number of arguments");
        }
-       else    /* is_async */
+       else                    /* is_async */
        {
            /* get async result */
            conname = text_to_cstring(PG_GETARG_TEXT_PP(0));

This also happens for comments on a function's "}" close brace:

@@ -722,7 +722,7 @@ _metaphone(char *word, /* IN */
End_Phoned_Word;

    return (META_SUCCESS);
-}  /* END metaphone */
+}                              /* END metaphone */

It's hard to see these as anything except bug fixes, because the
old indent code certainly looks like it intends to always force
same-line columns to at least the -c column. I haven't quite figured
out why it fails to, or where the new version fixed that. Still,
it's a difference that isn't a particular benefit to us.

Another set of changes is slightly different handling of unrecognized
typedef names:

@@ -250,7 +250,7 @@ typedef enum
    PGSS_TRACK_NONE,            /* track no statements */
    PGSS_TRACK_TOP,             /* only top level statements */
    PGSS_TRACK_ALL              /* all statements, including nested ones */
-}  PGSSTrackLevel;
+}          PGSSTrackLevel;

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)
The handling of such names was already slightly wonky, though;
note that the previous version was already differently indented
from what would happen if PGSSTrackLevel were known.

Another issue I'm noticing is that "extern C" declarations are
getting reformatted:

diff --git a/src/interfaces/ecpg/include/ecpgtype.h b/src/interfaces/ecpg/include/ecpgtype.h
index 7cc47e9..236d028 100644
--- a/src/interfaces/ecpg/include/ecpgtype.h
+++ b/src/interfaces/ecpg/include/ecpgtype.h
@@ -34,7 +34,7 @@
 #define _ECPGTYPE_H

#ifdef __cplusplus
-extern "C"
+extern "C"
{
#endif

The pgindent Perl script is fooling around with these already, and
I think what it's doing is probably not quite compatible with what
the new indent version does, but I haven't tracked it down yet.

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

regards, tom lane

Attachments:

new-pgindent-changes-1.patch.gzapplication/x-gzip; name=new-pgindent-changes-1.patch.gzDownload+1-2
#2Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#1)
Re: Preliminary results for proposed new pgindent implementation

Tom, all,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

+1.

Thanks!

Stephen

#3Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#1)
Re: Preliminary results for proposed new pgindent implementation

On Thu, May 18, 2017 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

* Improvements in formatting around sizeof and related constructs,
for example:

* Likewise, operators after casts work better than before:

* Sane formatting of function typedefs, for example:

* Non-typedef struct pointers are now formatted consistently, for example:

* Better handling of pointers with const/volatile qualifiers, for example:

* Better handling of PG_USED_FOR_ASSERTS_ONLY declarations, for example

* Corner cases where no space was left before a comment are fixed:

Those all sound like good things.

Another set of changes is slightly different handling of unrecognized
typedef names:

@@ -250,7 +250,7 @@ typedef enum
PGSS_TRACK_NONE,            /* track no statements */
PGSS_TRACK_TOP,             /* only top level statements */
PGSS_TRACK_ALL              /* all statements, including nested ones */
-}  PGSSTrackLevel;
+}          PGSSTrackLevel;

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)
The handling of such names was already slightly wonky, though;
note that the previous version was already differently indented
from what would happen if PGSSTrackLevel were known.

This, however, doesn't sound so good. Isn't there some way this can be fixed?

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

What does that exactly mean concretely?

We've talked about pulling pgindent into our main repo, or posting a
link to a tarball someplace. An intermediate plan might be to give it
its own repo, but on git.postgresql.org, which seems like it might
give us the best of both worlds. But I really want something that's
going to be easy to set up and configure. It took me years to be
brave enough to get the current pgindent set up.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#3)
Re: Preliminary results for proposed new pgindent implementation

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, May 18, 2017 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)

This, however, doesn't sound so good. Isn't there some way this can be fixed?

I'm intending to look into it, but I think it's mostly independent of
whether we replace pgindent itself. The existing code has the same
problem, really.

One brute-force way we could deal with the problem is to have a "manual"
list of names to be treated as typedefs, in addition to whatever the
buildfarm produces. I see no other way than that to get, for instance,
simplehash.h's SH_TYPE to be formatted as a typedef. There are also
some typedefs that don't get formatted correctly because they are only
used for wonky options that no existing typedef-reporting buildfarm member
builds. Manual addition might be the path of least resistance there too.

Now the other side of this coin is that, by definition, such typedefs
are not getting used in a huge number of places. If we just had to
live with it, it might not be awful.

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

What does that exactly mean concretely?

That means I plan to continue putting effort into it with the goal of
making a switchover sometime pretty darn soon. We do not have a very
wide window for fooling with pgindent rules, IMO --- once v11 development
starts I think we can't touch it again (until this time next year).

We've talked about pulling pgindent into our main repo, or posting a
link to a tarball someplace. An intermediate plan might be to give it
its own repo, but on git.postgresql.org, which seems like it might
give us the best of both worlds. But I really want something that's
going to be easy to set up and configure. It took me years to be
brave enough to get the current pgindent set up.

Yes, moving the goalposts on ease-of-use is an important consideration
here. What that says to me is that we ought to pull FreeBSD indent
into our tree, and provide Makefile support that makes it easy for
any developer to build it and put it into their PATH. (I suppose
that means support in the MSVC scripts too, but somebody else will
have to do that part.)

We should also think hard about getting rid of the entab dependency,
to eliminate the other nonstandard prerequisite program. We could
either accept that that will result in some tab-vs-space changes in
our code, or try to convert those steps in pgindent into pure Perl,
or try to convince Piotr to add an option to indent that will make
it do tabs the way we want (ie use a space not a tab if the tab
would only move one space anyway).

Lastly (and I've said this before, but you pushed back on it at
the time), if we're doing this then we're going all in. That
means reformatting the back branches to match too. That diff
is already big enough to be a disaster for back-patching, and
we haven't even considered whether we want to let pgindent adopt
less-inconsistent rules for comment indentation. So I think that
as soon as the dust has settled in HEAD, we back-patch the addition
of FreeBSD indent, and the changes in pgindent proper, and then
run pgindent in each supported back branch.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#4)
Re: Preliminary results for proposed new pgindent implementation

Tom, all,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, May 18, 2017 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)

This, however, doesn't sound so good. Isn't there some way this can be fixed?

I'm intending to look into it, but I think it's mostly independent of
whether we replace pgindent itself. The existing code has the same
problem, really.

One brute-force way we could deal with the problem is to have a "manual"
list of names to be treated as typedefs, in addition to whatever the
buildfarm produces. I see no other way than that to get, for instance,
simplehash.h's SH_TYPE to be formatted as a typedef. There are also
some typedefs that don't get formatted correctly because they are only
used for wonky options that no existing typedef-reporting buildfarm member
builds. Manual addition might be the path of least resistance there too.

Now the other side of this coin is that, by definition, such typedefs
are not getting used in a huge number of places. If we just had to
live with it, it might not be awful.

Dealing with the typedef lists in general is a bit of a pain to get
right, to make sure that new just-written code gets correctly indented.
Perhaps we could find a way to incorporate the buildfarm typedef lists
and a manual list and a locally generated/provided set in a simpler
fashion in general.

All in all, this looks pretty darn good from here, and I'm thinking
we should push forward on it.

What does that exactly mean concretely?

That means I plan to continue putting effort into it with the goal of
making a switchover sometime pretty darn soon. We do not have a very
wide window for fooling with pgindent rules, IMO --- once v11 development
starts I think we can't touch it again (until this time next year).

Agreed.

We've talked about pulling pgindent into our main repo, or posting a
link to a tarball someplace. An intermediate plan might be to give it
its own repo, but on git.postgresql.org, which seems like it might
give us the best of both worlds. But I really want something that's
going to be easy to set up and configure. It took me years to be
brave enough to get the current pgindent set up.

Yes, moving the goalposts on ease-of-use is an important consideration
here. What that says to me is that we ought to pull FreeBSD indent
into our tree, and provide Makefile support that makes it easy for
any developer to build it and put it into their PATH. (I suppose
that means support in the MSVC scripts too, but somebody else will
have to do that part.)

I'm not a huge fan of this, however. Do we really need to carry around
the FreeBSD indent in our tree? I had been expecting that these changes
would eventually result in a package that's available in the common
distributions (possibly from apt/yum.postgresql.org, at least until it's
in the main Debian-based and RHEL-based package systems). Are you
thinking that we'll always have to have our own modified version?

We should also think hard about getting rid of the entab dependency,
to eliminate the other nonstandard prerequisite program. We could
either accept that that will result in some tab-vs-space changes in
our code, or try to convert those steps in pgindent into pure Perl,
or try to convince Piotr to add an option to indent that will make
it do tabs the way we want (ie use a space not a tab if the tab
would only move one space anyway).

What about perltidy itself..? We don't include that in our tree either.

I do think it'd be good to if Piotr would add such an option, hopefully
that's agreeable.

Lastly (and I've said this before, but you pushed back on it at
the time), if we're doing this then we're going all in. That
means reformatting the back branches to match too. That diff
is already big enough to be a disaster for back-patching, and
we haven't even considered whether we want to let pgindent adopt
less-inconsistent rules for comment indentation. So I think that
as soon as the dust has settled in HEAD, we back-patch the addition
of FreeBSD indent, and the changes in pgindent proper, and then
run pgindent in each supported back branch.

Ugh. This would be pretty painful, but I agree that back-patching
without doing re-indenting the back-branches would also suck, so I'm on
the fence about this.

Thanks!

Stephen

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#5)
Re: Preliminary results for proposed new pgindent implementation

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Yes, moving the goalposts on ease-of-use is an important consideration
here. What that says to me is that we ought to pull FreeBSD indent
into our tree, and provide Makefile support that makes it easy for
any developer to build it and put it into their PATH. (I suppose
that means support in the MSVC scripts too, but somebody else will
have to do that part.)

I'm not a huge fan of this, however. Do we really need to carry around
the FreeBSD indent in our tree? I had been expecting that these changes
would eventually result in a package that's available in the common
distributions (possibly from apt/yum.postgresql.org, at least until it's
in the main Debian-based and RHEL-based package systems). Are you
thinking that we'll always have to have our own modified version?

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

* easy accessibility for PG developers

* at any given time we need to be using a specific "blessed" version,
so that all developers can get equivalent results. There's pretty much
no chance of that happening if we depend on distro-provided packages,
even if those share a common upstream.

We've had reasonably decent luck with tracking the tzcode/tzdata packages
as local copies, so I feel like we're not taking on anything unreasonable
if our model is that we'll occasionally (not oftener than once per year)
update our copy to recent upstream and then re-indent using that.

What about perltidy itself..? We don't include that in our tree either.

Not being much of a Perl guy, I don't care one way or the other about
perltidy. Somebody else can work on that if it needs work.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Stephen Frost (#5)
Re: Preliminary results for proposed new pgindent implementation

On 05/19/2017 06:05 PM, Stephen Frost wrote:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, May 18, 2017 at 11:00 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The reason PGSSTrackLevel is "unrecognized" is that it's not in
typedefs.list, which is a deficiency in our typedef-collection
technology not in indent. (I believe the problem is that there
are no variables declared with that typename, causing there to
not be any of the kind of symbol table entries we are looking for.)

This, however, doesn't sound so good. Isn't there some way this can be fixed?

I'm intending to look into it, but I think it's mostly independent of
whether we replace pgindent itself. The existing code has the same
problem, really.

One brute-force way we could deal with the problem is to have a "manual"
list of names to be treated as typedefs, in addition to whatever the
buildfarm produces. I see no other way than that to get, for instance,
simplehash.h's SH_TYPE to be formatted as a typedef. There are also
some typedefs that don't get formatted correctly because they are only
used for wonky options that no existing typedef-reporting buildfarm member
builds. Manual addition might be the path of least resistance there too.

Now the other side of this coin is that, by definition, such typedefs
are not getting used in a huge number of places. If we just had to
live with it, it might not be awful.

Dealing with the typedef lists in general is a bit of a pain to get
right, to make sure that new just-written code gets correctly indented.
Perhaps we could find a way to incorporate the buildfarm typedef lists
and a manual list and a locally generated/provided set in a simpler
fashion in general.

You can get a pretty good typedefs list just by looking for the pattern
"} <type name>;". Something like this:

grep -o -h -I --perl-regexp -r "}\W(\w+);" src/ contrib/ | perl -pe
's/}\W(.*);/\1/' | sort | uniq > typedefs.list

It won't cover system headers and non-struct typedefs, but it catches
those simplehash typedefs and PGSSTrackLevel. Maybe we should run that
and merge the result with the typedef lists we collect in the buildfarm.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#7)
Re: Preliminary results for proposed new pgindent implementation

Heikki Linnakangas <hlinnaka@iki.fi> writes:

You can get a pretty good typedefs list just by looking for the pattern
"} <type name>;".

That's going to catch a lot of things that are just variables, though.
It might be all right as long as there was manual filtering after it.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#6)
Re: Preliminary results for proposed new pgindent implementation

On Fri, May 19, 2017 at 11:22:54AM -0400, Tom Lane wrote:

We've had reasonably decent luck with tracking the tzcode/tzdata packages
as local copies, so I feel like we're not taking on anything unreasonable
if our model is that we'll occasionally (not oftener than once per year)
update our copy to recent upstream and then re-indent using that.

I guess by having a copy in our tree we would overtly update our
version, rather than downloading whatever happens to be the most recent
version and finding changes by accident.

If we could download a specific version that had everything we need,
that would work too.

What about perltidy itself..? We don't include that in our tree either.

Not being much of a Perl guy, I don't care one way or the other about
perltidy. Somebody else can work on that if it needs work.

We have agreed on a fixed version of perltidy and I added a download
link to pgindent/README.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#8)
Re: Preliminary results for proposed new pgindent implementation

On 05/19/2017 06:48 PM, Tom Lane wrote:

Heikki Linnakangas <hlinnaka@iki.fi> writes:

You can get a pretty good typedefs list just by looking for the pattern
"} <type name>;".

That's going to catch a lot of things that are just variables, though.
It might be all right as long as there was manual filtering after it.

At a quick glance, there are only a couple of them. This two cases
caught my eye. In twophase.c:

static struct xllist
{
StateFileChunk *head; /* first data block in the chain */
StateFileChunk *tail; /* last block in chain */
uint32 num_chunks;
uint32 bytes_free; /* free bytes left in
tail block */
uint32 total_len; /* total data bytes in
chain */
} records;

And this in informix.c:

static struct
{
long val;
int maxdigits;
int digits;
int remaining;
char sign;
char *val_string;
} value;

IMHO it would actually be an improvement if there was a space rather
than a tab there. But I'm not sure what else it would mess up to
consider those typedef names. And those are awfully generic names;
wouldn't hurt to rename them, anyway.

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#6)
Re: Preliminary results for proposed new pgindent implementation

On Fri, May 19, 2017 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

* easy accessibility for PG developers

* at any given time we need to be using a specific "blessed" version,
so that all developers can get equivalent results. There's pretty much
no chance of that happening if we depend on distro-provided packages,
even if those share a common upstream.

Yeah, but those advantages could also be gained by putting the
pgindent tree on git.postgresql.org in a separate repository. Having
it in the same repository as the actual PostgreSQL code is not
required nor, in my opinion, particularly desirable.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#11)
Re: Preliminary results for proposed new pgindent implementation

On 2017-05-19 12:21:52 -0400, Robert Haas wrote:

On Fri, May 19, 2017 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

* easy accessibility for PG developers

* at any given time we need to be using a specific "blessed" version,
so that all developers can get equivalent results. There's pretty much
no chance of that happening if we depend on distro-provided packages,
even if those share a common upstream.

Yeah, but those advantages could also be gained by putting the
pgindent tree on git.postgresql.org in a separate repository. Having
it in the same repository as the actual PostgreSQL code is not
required nor, in my opinion, particularly desirable.

I'm of the contrary opinion. A lot of the regular churn due to pgindent
right now is because it's inconvenient to run. Having to clone a
separate repository, compile that project, put it into PATH (fun if
there's multiple versions), run pgindent, discover typedefs.list is out
of date, update, run, ... is pretty much a guarantee that'll continue.
If we had a make indent that computed local typedefs list, *added* new
but not removed old ones, we could get much closer to just always being
properly indented.

The cost of putting it somewhere blow src/tools/pgindent seems fairly
minor.

- Andres

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#11)
Re: Preliminary results for proposed new pgindent implementation

Robert Haas <robertmhaas@gmail.com> writes:

On Fri, May 19, 2017 at 11:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

* easy accessibility for PG developers

* at any given time we need to be using a specific "blessed" version,
so that all developers can get equivalent results. There's pretty much
no chance of that happening if we depend on distro-provided packages,
even if those share a common upstream.

Yeah, but those advantages could also be gained by putting the
pgindent tree on git.postgresql.org in a separate repository. Having
it in the same repository as the actual PostgreSQL code is not
required nor, in my opinion, particularly desirable.

It adds an extra step to what a developer has to do to get pgindent
up and running, so it doesn't seem to me like it's helping the goal
of reducing the setup overhead.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#10)
Re: Preliminary results for proposed new pgindent implementation

Heikki Linnakangas <hlinnaka@iki.fi> writes:

On 05/19/2017 06:48 PM, Tom Lane wrote:

That's going to catch a lot of things that are just variables, though.
It might be all right as long as there was manual filtering after it.

At a quick glance, there are only a couple of them. This two cases
caught my eye. In twophase.c:

static struct xllist
{
...
} records;

IMHO it would actually be an improvement if there was a space rather
than a tab there.

Agreed, but if "records" were considered a typedef name, that would
likely screw up the formatting of code referencing it. Maybe less
badly with this version of indent than our old one, not sure.

What I was just looking at is the possibility of absorbing struct
tags ("xllist" in the above) as if they were typedef names. In
at least 95% of our usages, if a struct has a tag then the tag is
also the struct's typedef name. The reason this is interesting
is that it looks like (on at least Linux and macOS) the debug info
captures struct tags even when it misses the corresponding typedef.
We could certainly create a coding rule that struct tags *must*
match struct typedef names for our own code, but I'm not sure what
violations of that convention might appear in system headers.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#12)
Re: Preliminary results for proposed new pgindent implementation

Andres Freund <andres@anarazel.de> writes:

On 2017-05-19 12:21:52 -0400, Robert Haas wrote:

Yeah, but those advantages could also be gained by putting the
pgindent tree on git.postgresql.org in a separate repository. Having
it in the same repository as the actual PostgreSQL code is not
required nor, in my opinion, particularly desirable.

I'm of the contrary opinion. A lot of the regular churn due to pgindent
right now is because it's inconvenient to run. Having to clone a
separate repository, compile that project, put it into PATH (fun if
there's multiple versions), run pgindent, discover typedefs.list is out
of date, update, run, ... is pretty much a guarantee that'll continue.
If we had a make indent that computed local typedefs list, *added* new
but not removed old ones, we could get much closer to just always being
properly indented.

I hadn't really thought of automating it to that extent, but yeah,
that seems like an interesting prospect.

The cost of putting it somewhere blow src/tools/pgindent seems fairly
minor.

I think the main cost would be bloating distribution tarballs. Although
we're talking about adding ~50K to tarballs that are already pushing 20MB,
so realistically who's going to notice? If you want to cut the tarball
size, let's reopen the discussion about keeping release notes since the
dawn of time.

Also, having the support in distributed tarballs is not all bad, because
it would allow someone working from a tarball rather than a git pull
to have pgindent support. Dunno if there are any such someones anymore,
but maybe they're out there.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#13)
Re: Preliminary results for proposed new pgindent implementation

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

Yeah, but those advantages could also be gained by putting the
pgindent tree on git.postgresql.org in a separate repository. Having
it in the same repository as the actual PostgreSQL code is not
required nor, in my opinion, particularly desirable.

It adds an extra step to what a developer has to do to get pgindent
up and running, so it doesn't seem to me like it's helping the goal
of reducing the setup overhead.

I favor having indent in a separate repository in our Git server, for
these reasons

0. it's under our control (so we can change rules as we see fit)
1. we can have Piotr as a committer there
2. we can use the same pgindent version for all Pg branches

I'm thinking that whenever we change the indent rules, we would
re-indent supported back-branches, just as Tom's proposing we'd do now
(which I endorse).

We wouldn't change the rules often, but say if we leave some typedef
wonky behavior alone for now, and somebody happens to fix it in the
future, then the fix would apply not only to the current tree at the
time but also to all older trees, which makes sense.

Now, there is a process that can be followed to update a *patch* from an
"pre-indent upstream PG" to a "post-indent upstream PG", to avoid manual
work in rebasing the patch past pgindent. This can easily be used on
branches that can be rebased, also (since they are essentially just a
collection of patches). One problem that remains is that this doesn't
apply easily to branches that get merged without rebase. I *think* it
should be possible to come up with a process that creates a merge commit
using pgindent, but I haven't tried.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#14)
Re: Preliminary results for proposed new pgindent implementation

I wrote:

What I was just looking at is the possibility of absorbing struct
tags ("xllist" in the above) as if they were typedef names. In
at least 95% of our usages, if a struct has a tag then the tag is
also the struct's typedef name. The reason this is interesting
is that it looks like (on at least Linux and macOS) the debug info
captures struct tags even when it misses the corresponding typedef.
We could certainly create a coding rule that struct tags *must*
match struct typedef names for our own code, but I'm not sure what
violations of that convention might appear in system headers.

I did an experiment with seeing what would happen to the typedef list
if we included struct tags. On my Linux box, that adds about 10%
more names (3343 instead of 3028). A lot of them would be good to
have, but there are a lot of others that maybe not so much. See
attached diff output.

I hesitate to suggest any rule as grotty as "take struct tags only
if they begin with an upper-case letter", but that would actually
work really well, looks like.

regards, tom lane

Attachments:

typedefs.addedtext/x-diff; charset=us-ascii; name=typedefs.addedDownload+315-0
#18Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#16)
Re: Preliminary results for proposed new pgindent implementation

On 5/19/17 13:31, Alvaro Herrera wrote:

I favor having indent in a separate repository in our Git server, for
these reasons

I am also in favor of that.

0. it's under our control (so we can change rules as we see fit)
1. we can have Piotr as a committer there
2. we can use the same pgindent version for all Pg branches

3. We can use pgindent for external code.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: Preliminary results for proposed new pgindent implementation

On 5/19/17 11:22, Tom Lane wrote:

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

Is pgindent going to be indented by pgindent?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#19)
Re: Preliminary results for proposed new pgindent implementation

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 5/19/17 11:22, Tom Lane wrote:

I certainly would rather that our version matched something that's under
active maintenance someplace. But it seems like there are two good
arguments for having a copy in our tree:

Is pgindent going to be indented by pgindent?

If we were going to keep it in our tree, I'd plan to add an exclusion
rule to keep pgindent from touching it, as we already have for assorted
other files that are copied from external projects. However, it seems
like "keep it in a separate repo" is winning, so it's moot.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#18)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#21)
#23Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#22)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#23)
#25Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#24)
#26Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#24)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#26)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#27)
#30Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#29)
#31Andres Freund
andres@anarazel.de
In reply to: Bruce Momjian (#30)
#32Bruce Momjian
bruce@momjian.us
In reply to: Andres Freund (#31)
#33Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#32)
#34Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#33)
#35Piotr Stefaniak
postgres@piotr-stefaniak.me
In reply to: Tom Lane (#33)
#36Piotr Stefaniak
postgres@piotr-stefaniak.me
In reply to: Tom Lane (#28)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Piotr Stefaniak (#36)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Piotr Stefaniak (#35)
#39Piotr Stefaniak
postgres@piotr-stefaniak.me
In reply to: Tom Lane (#38)
#40Tom Lane
tgl@sss.pgh.pa.us
In reply to: Piotr Stefaniak (#39)
#41Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#21)
#42Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#41)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#40)
#44Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#21)
#45Piotr Stefaniak
postgres@piotr-stefaniak.me
In reply to: Tom Lane (#43)
#46Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#44)
#47Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#44)
#48Stephen Frost
sfrost@snowman.net
In reply to: Robert Haas (#47)
#49Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephen Frost (#48)
#50Stephen Frost
sfrost@snowman.net
In reply to: Tom Lane (#49)