Thread: run pgindent on a regular basis / scripted manner

run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

12 August 2020, 22:34:09

Hi,

When developing patches I find it fairly painful that I cannot re-indent
patches with pgindent without also seeing a lot of indentation changes
in unmodified parts of files.  It is easy enough ([1]) to only re-indent
files that I have modified, but there's often a lot of independent
indentation changes in the files that I did modified.

I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
most of the hunks were entirely unrelated. Despite the development
window for 14 having only relatively recently opened. Based on my
experience it tends to get worse over time.


Is there any reason we don't just automatically run pgindent regularly?
Like once a week? And also update typedefs.list automatically, while
we're at it?

Currently the yearly pgindent runs are somewhat painful for patches that
didn't make it into the release, but if we were to reindent on a more
regular basis, that should be much less the case.  It'd also help newer
developers who we sometimes tell to use pgindent - which doesn't really
work.

Greetings,

Andres Freund

[1] ./src/tools/pgindent/pgindent $(git diff-tree --no-commit-id --name-only -r upstream/master..HEAD|grep -v
src/test|grep-v READ ME|grep -v typedefs.list)

Re: run pgindent on a regular basis / scripted manner

From

Alvaro Herrera

Date:

12 August 2020, 22:53:25

On 2020-Aug-12, Andres Freund wrote:

> Is there any reason we don't just automatically run pgindent regularly?
> Like once a week? And also update typedefs.list automatically, while
> we're at it?

Seconded.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: run pgindent on a regular basis / scripted manner

From

Jesse Zhang

Date:

12 August 2020, 23:08:50

Hi Andres,

On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
>
> Hi,
>
> When developing patches I find it fairly painful that I cannot re-indent
> patches with pgindent without also seeing a lot of indentation changes
> in unmodified parts of files.  It is easy enough ([1]) to only re-indent
> files that I have modified, but there's often a lot of independent
> indentation changes in the files that I did modified.
>
> I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> most of the hunks were entirely unrelated. Despite the development
> window for 14 having only relatively recently opened. Based on my
> experience it tends to get worse over time.

How bad was it right after branching 13? I wonder if we have any
empirical measure of badness over time -- assuming there was a point in
the recent past where everything was good, and the bad just crept in.

>
>
> Is there any reason we don't just automatically run pgindent regularly?
> Like once a week? And also update typedefs.list automatically, while
> we're at it?

You know what's better than weekly? Every check-in. I for one would love
it if we can just format the entire codebase, and ensure that new
check-ins are also formatted. We _do_ need some form of continuous
integration to catch us when we have fallen short (again, once HEAD
reaches a "known good" state, it's conceivably cheap to keep it in the
good state.

Cheers,
Jesse

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

12 August 2020, 23:23:20

Hi,

On 2020-08-12 16:08:50 -0700, Jesse Zhang wrote:
> On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
> >
> > Hi,
> >
> > When developing patches I find it fairly painful that I cannot re-indent
> > patches with pgindent without also seeing a lot of indentation changes
> > in unmodified parts of files.  It is easy enough ([1]) to only re-indent
> > files that I have modified, but there's often a lot of independent
> > indentation changes in the files that I did modified.
> >
> > I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> > most of the hunks were entirely unrelated. Despite the development
> > window for 14 having only relatively recently opened. Based on my
> > experience it tends to get worse over time.
> 
> How bad was it right after branching 13? I wonder if we have any
> empirical measure of badness over time -- assuming there was a point in
> the recent past where everything was good, and the bad just crept in.

Well, just after branching it was perfect, because pgindent was
customarily is run just before branching. After that it incrementally
gets worse.


> > Is there any reason we don't just automatically run pgindent regularly?
> > Like once a week? And also update typedefs.list automatically, while
> > we're at it?
> 
> You know what's better than weekly? Every check-in. I for one would love
> it if we can just format the entire codebase, and ensure that new
> check-ins are also formatted. We _do_ need some form of continuous
> integration to catch us when we have fallen short (again, once HEAD
> reaches a "known good" state, it's conceivably cheap to keep it in the
> good state.

Unfortunately that is, with the current tooling, not entirely trivial to
do so completely. The way we generate the list of known typedefs
unfortunately depends on the platform a build is run on. Therefore the
buildfarm collects a number of the generated list of typedefs from
different platforms, and then we use that combined list to run pgindent.

We surely can improve further, but I think having any automation around
this already would be a huge step.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 August 2020, 23:47:01

Jesse Zhang <sbjesse@gmail.com> writes:
> On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
>> Is there any reason we don't just automatically run pgindent regularly?
>> Like once a week? And also update typedefs.list automatically, while
>> we're at it?

> You know what's better than weekly? Every check-in.

I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
of damage to code that was written without thinking about it --- in
particular, it'll make a hash of comment blocks that were manually
formatted and not protected with dashes.

If the workflow is commit first and re-indent later, then we can always
revert the pgindent commit and clean things up manually; but an auto
re-indent during commit wouldn't provide that history.

I do like the idea of more frequent, smaller pgindent runs instead of
doing just one giant run per year.  With the giant runs it's necessary
to invest a fair amount of time eyeballing all the changes; if we did it
every couple weeks then the pain would be a lot less.

Another idea would be to have a bot that runs pgindent *without*
committing the results, and emails the people who have made commits
into files that changed, saying "if you don't like these diffs then
you'd better clean up your commit before it happens for real".  With
some warning like that, it might be okay to do automatic reindenting
a little bit later.  Plus, nagging committers who habitually commit
improperly-indented code might persuade them to clean up their acts ;-)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 August 2020, 23:57:29

Andres Freund <andres@anarazel.de> writes:
> Unfortunately that is, with the current tooling, not entirely trivial to
> do so completely. The way we generate the list of known typedefs
> unfortunately depends on the platform a build is run on. Therefore the
> buildfarm collects a number of the generated list of typedefs from
> different platforms, and then we use that combined list to run pgindent.

Yeah, it's hard to see how to avoid that given that the set of typedefs
provided by system headers is not fixed.  (Windows vs not Windows is the
worst case of course, but Unixen aren't all alike either.)

Another gotcha that we have to keep our eyes on is that sometimes the
process finds spurious names that we don't want to treat as typedefs
because it results in messing up too much code.  There's a reject list
in pgindent that takes care of those, but it has to be maintained
manually.  So I'm not sure how that could work in conjunction with
unsupervised reindents --- by the time you notice a problem, the git
history will already be cluttered with bogus reindentations.

Maybe the secret is to not allow automated adoption of new typedefs.list
entries, but to require somebody to add entries to that file by hand,
even if they're basing it on the buildfarm results.  (This would
encourage the habit some people have adopted of updating typedefs.list
along with commits that add typedefs.  I've never done that, but would
be willing to change if there's good motivation to.)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

13 August 2020, 01:29:42

On Wed, Aug 12, 2020 at 06:53:25PM -0400, Alvaro Herrera wrote:
> On 2020-Aug-12, Andres Freund wrote:
>> Is there any reason we don't just automatically run pgindent regularly?
>> Like once a week? And also update typedefs.list automatically, while
>> we're at it?
>
> Seconded.

Thirded.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

13 August 2020, 03:53:08

On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
> Jesse Zhang <sbjesse@gmail.com> writes:
> > On Wed, Aug 12, 2020 at 3:34 PM Andres Freund wrote:
> >> Is there any reason we don't just automatically run pgindent regularly?
> >> Like once a week? And also update typedefs.list automatically, while
> >> we're at it?
> 
> > You know what's better than weekly? Every check-in.
> 
> I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
> of damage to code that was written without thinking about it --- in
> particular, it'll make a hash of comment blocks that were manually
> formatted and not protected with dashes.
> 
> If the workflow is commit first and re-indent later, then we can always
> revert the pgindent commit and clean things up manually; but an auto
> re-indent during commit wouldn't provide that history.

There are competing implementations of assuring pgindent-cleanliness at every
check-in:

1. After each push, an automated followup commit appears, restoring
   pgindent-cleanliness.
2. "git push" results in a commit that melds pgindent changes into what the
   committer tried to push.
3. "git push" fails, for the master branch, if the pushed commit disrupts
   pgindent-cleanliness.

Were you thinking of (2)?  (1) doesn't have the lack-of-history problem, but
it does have the unexpected-damage problem, and it makes gitweb noisier.  (3)
has neither problem, and I'd prefer it over (1), (2), or $SUBJECT.

Regarding typedefs.list, I would use the in-tree one, like you discussed here:

On Wed, Aug 12, 2020 at 07:57:29PM -0400, Tom Lane wrote:
> Maybe the secret is to not allow automated adoption of new typedefs.list
> entries, but to require somebody to add entries to that file by hand,
> even if they're basing it on the buildfarm results.  (This would
> encourage the habit some people have adopted of updating typedefs.list
> along with commits that add typedefs.  I've never done that, but would
> be willing to change if there's good motivation to.)

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

13 August 2020, 04:08:36

Noah Misch <noah@leadboat.com> writes:
> On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
>> If the workflow is commit first and re-indent later, then we can always
>> revert the pgindent commit and clean things up manually; but an auto
>> re-indent during commit wouldn't provide that history.

> There are competing implementations of assuring pgindent-cleanliness at every
> check-in:

> 1. After each push, an automated followup commit appears, restoring
>    pgindent-cleanliness.
> 2. "git push" results in a commit that melds pgindent changes into what the
>    committer tried to push.
> 3. "git push" fails, for the master branch, if the pushed commit disrupts
>    pgindent-cleanliness.

> Were you thinking of (2)?

I was objecting to (2).  (1) would perhaps work.  (3) could be pretty
darn annoying, especially if it blocks a time-critical security patch.

> Regarding typedefs.list, I would use the in-tree one, like you discussed here:

Yeah, after thinking about that more, it seems like automated
typedefs.list updates would be far riskier than automated reindent
based on the existing typedefs.list.  The latter could at least be
expected not to change code unrelated to the immediate commit.
typedefs.list updates need some amount of adult supervision.

(I'd still vote for nag-mail to the committer whose work got reindented,
in case the bot made things a lot worse.)

I hadn't thought about the angle of HEAD versus back-branch patches,
but that does seem like something to ponder.  The back branches don't
have the same pgindent rules necessarily, plus the patch versions
might be different in more than just whitespace.  My own habit when
back-patching has been to indent the HEAD patch per-current-rules and
then preserve that layout as much as possible in the back branches,
but I doubt we could get a tool to do that with any reliability.

Of course, there's also the possibility of forcibly reindenting
all the active back branches to current rules.  But I think we've
rejected that idea already, because it would cause so much pain
for forks that are following a back branch.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

13 August 2020, 04:26:55

On Thu, Aug 13, 2020 at 12:08:36AM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
> >> If the workflow is commit first and re-indent later, then we can always
> >> revert the pgindent commit and clean things up manually; but an auto
> >> re-indent during commit wouldn't provide that history.
> 
> > There are competing implementations of assuring pgindent-cleanliness at every
> > check-in:
> 
> > 1. After each push, an automated followup commit appears, restoring
> >    pgindent-cleanliness.
> > 2. "git push" results in a commit that melds pgindent changes into what the
> >    committer tried to push.
> > 3. "git push" fails, for the master branch, if the pushed commit disrupts
> >    pgindent-cleanliness.
> 
> > Were you thinking of (2)?
> 
> I was objecting to (2).  (1) would perhaps work.  (3) could be pretty
> darn annoying,

Right.  I think of three use cases here:

a) I'm a committer who wants to push clean code.  I want (3).
b) I'm a committer who wants to ignore pgindent.  I get some email complaints
   under (1), which I ignore.  Under (3), I'm forced to become (a).
c) I'm reading the history.  I want (3).

> I hadn't thought about the angle of HEAD versus back-branch patches,
> but that does seem like something to ponder.  The back branches don't
> have the same pgindent rules necessarily, plus the patch versions
> might be different in more than just whitespace.  My own habit when
> back-patching has been to indent the HEAD patch per-current-rules and
> then preserve that layout as much as possible in the back branches,
> but I doubt we could get a tool to do that with any reliability.

Similar habit here.  Another advantage of master-only is a guarantee against
disrupting time-critical patches.  (It would be ugly to push back branches and
sort out the master push later, but it doesn't obstruct the mission.)

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

13 August 2020, 05:14:33

Noah Misch <noah@leadboat.com> writes:
> ... Another advantage of master-only is a guarantee against
> disrupting time-critical patches.  (It would be ugly to push back branches and
> sort out the master push later, but it doesn't obstruct the mission.)

Hm, doesn't it?  I had the idea that "git push" is atomic --- either all
the per-branch commits succeed, or they all fail.  I might be wrong.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

13 August 2020, 05:21:37

On Thu, Aug 13, 2020 at 01:14:33AM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > ... Another advantage of master-only is a guarantee against
> > disrupting time-critical patches.  (It would be ugly to push back branches and
> > sort out the master push later, but it doesn't obstruct the mission.)
> 
> Hm, doesn't it?  I had the idea that "git push" is atomic --- either all
> the per-branch commits succeed, or they all fail.  I might be wrong.

Atomicity is good.  I just meant that you could issue something like "git push
origin $(cd .git/refs/heads && ls REL*)" to defer the complaint about master.

Re: run pgindent on a regular basis / scripted manner

From

Jesse Zhang

Date:

13 August 2020, 05:24:41

On Wed, Aug 12, 2020 at 10:14 PM Tom Lane wrote:
>
> Noah Misch <noah@leadboat.com> writes:
> > ... Another advantage of master-only is a guarantee against
> > disrupting time-critical patches.  (It would be ugly to push back branches and
> > sort out the master push later, but it doesn't obstruct the mission.)
>
> Hm, doesn't it?  I had the idea that "git push" is atomic --- either all
> the per-branch commits succeed, or they all fail.  I might be wrong.

As of Git 2.28, atomic pushes are not turned on by default. That means
"git push my-remote foo bar" _can_ result in partial success. I'm that
paranoid who does "git push --atomic my-remote foo bar fizz".

Cheers,
Jesse

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

13 August 2020, 07:47:48

On Thu, Aug 13, 2020 at 6:08 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Noah Misch <noah@leadboat.com> writes:
> On Wed, Aug 12, 2020 at 07:47:01PM -0400, Tom Lane wrote:
>> If the workflow is commit first and re-indent later, then we can always
>> revert the pgindent commit and clean things up manually; but an auto
>> re-indent during commit wouldn't provide that history.

> There are competing implementations of assuring pgindent-cleanliness at every
> check-in:

> 1. After each push, an automated followup commit appears, restoring
> pgindent-cleanliness.
> 2. "git push" results in a commit that melds pgindent changes into what the
> committer tried to push.
> 3. "git push" fails, for the master branch, if the pushed commit disrupts
> pgindent-cleanliness.

There's another option here as well, that is a bit "softer", is to use a pre-commit hook.

That is, it's a hook that runs on the committers machine prior to the commit. This hook can then yell "hey, you need to run pgindent before committing this", but it gives the committer the ability to do --no-verify and commit anyway (thus won't block things that are urgent).

Since it allows a simple bypass, and very much relies on the committer to remember to install the hook in their local repository, this is not a guarantee in any way. So it might need to be done together with something else in the background doing like a daily job or so, but it might make that background work be smaller and fewer changes.

This obviously only works in the case where we can rely on the committers to remember to install such a hook, but given the few committers that we do have, I think we can certainly get that up to an "acceptable rate" fairly easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few other repositories, to ensure python styleguides are followed.

> Were you thinking of (2)?

I was objecting to (2). (1) would perhaps work. (3) could be pretty
darn annoying, especially if it blocks a time-critical security patch.

FWIW, I agree that (2) seems like a really bad option. In that suddenly a committer has committed something they were not aware of.

> Regarding typedefs.list, I would use the in-tree one, like you discussed here:

Yeah, after thinking about that more, it seems like automated
typedefs.list updates would be far riskier than automated reindent
based on the existing typedefs.list. The latter could at least be
expected not to change code unrelated to the immediate commit.
typedefs.list updates need some amount of adult supervision.

(I'd still vote for nag-mail to the committer whose work got reindented,
in case the bot made things a lot worse.)

Yeah, I'm definitely not a big fan of automated commits, regardless of if it's just indent or both indent+typedef. It's happened at least once, and I think more than once, that we've had to basically hard reset the upstream repository and clean things up after automated commits have gone bonkers (hi, Bruce!). Having an automated system do the whole flow of change->commit->push definitely invites this type of problem.

There are many solutions that do such things but that do it in the "github workflow" way, which is they do change -> commit -> create pull request, and then somebody eyeballs that pullrequest and approves it. We don't have PRs, but we could either have a script that simply sends out a patch to a mailinglist, or we could have a script that maintains a separate branch which is auto-pgindented, and then have a committer squash-merge that branch after having reviewed it.

Maybe something like that in combination with a pre-commit hook per above.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Stephen Frost

Date:

13 August 2020, 16:30:44

Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> There's another option here as well, that is a bit "softer", is to use a
> pre-commit hook.

Yeah, +1 on a pre-commit idea to help address this.  I certainly agree
with Andres that it's quite annoying to deal with commits coming in that
aren't indented properly but are in some file that I'm working on.

> This obviously only works in the case where we can rely on the committers
> to remember to install such a hook, but given the few committers that we do
> have, I think we can certainly get that up to an "acceptable rate" fairly
> easily. FWIW, this is similar to what we do in the pgweb, pgeu and a few
> other repositories, to ensure python styleguides are followed.

Yeah, no guarantee, but definitely seems like it'd be a good
improvement.

> > I was objecting to (2).  (1) would perhaps work.  (3) could be pretty
> > darn annoying, especially if it blocks a time-critical security patch.
>
> FWIW, I agree that (2) seems like a really bad option. In that suddenly a
> committer has committed something they were not aware of.

Yeah, I dislike (2) a lot too.

> Yeah, I'm definitely not a big fan of automated commits, regardless of if
> it's just indent or both indent+typedef. It's happened at least once, and I
> think more than once, that we've had to basically hard reset the upstream
> repository and clean things up after automated commits have gone bonkers
> (hi, Bruce!). Having an automated system do the whole flow of
> change->commit->push definitely invites this type of problem.

Agreed, automated commits seems terribly risky.

> There are many solutions that do such things but that do it in the "github
> workflow" way, which is they do change -> commit -> create pull request,
> and then somebody eyeballs that pullrequest and approves it. We don't have
> PRs, but we could either have a script that simply sends out a patch to a
> mailinglist, or we could have a script that maintains a separate branch
> which is auto-pgindented, and then have a committer squash-merge that
> branch after having reviewed it.
>
> Maybe something like that in combination with a pre-commit hook per above.

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Thanks,

Stephen

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

13 August 2020, 16:50:16

Stephen Frost <sfrost@snowman.net> writes:
> So, in our world, wouldn't this translate to 'make cfbot complain'?

> I'm definitely a fan of the idea of having cfbot flag these and then we
> maybe get to a point where it's not the committers dealing with fixing
> patches that weren't pgindent'd properly, it's the actual patch
> submitters being nagged about it...

Meh.  Asking all submitters to install pgindent is a bit of a burden.
Moreover, sometimes it's better to provide a patch that deliberately
hasn't reindented existing code, for ease of review (say, when you're
adding if() { ... } around some big hunk of code).  I think getting
committers to do this as part of commit is a better workflow.

(Admittedly, since I've been doing that for a long time, I don't
find it to be a burden.  I suppose some committers do.)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

13 August 2020, 16:58:50

On Thu, Aug 13, 2020 at 6:30 PM Stephen Frost <sfrost@snowman.net> wrote:

* Magnus Hagander (magnus@hagander.net) wrote:

> There are many solutions that do such things but that do it in the "github
> workflow" way, which is they do change -> commit -> create pull request,
> and then somebody eyeballs that pullrequest and approves it. We don't have
> PRs, but we could either have a script that simply sends out a patch to a
> mailinglist, or we could have a script that maintains a separate branch
> which is auto-pgindented, and then have a committer squash-merge that
> branch after having reviewed it.
>
> Maybe something like that in combination with a pre-commit hook per above.

So, in our world, wouldn't this translate to 'make cfbot complain'?

I'm definitely a fan of the idea of having cfbot flag these and then we
maybe get to a point where it's not the committers dealing with fixing
patches that weren't pgindent'd properly, it's the actual patch
submitters being nagged about it...

Werll, that's one thing, but what I was thinking of here was more of an automated branch maintained for the committers, not for the individual patch owners.

That is:
1. Whenever a patch is pushed on master on the main repo a process kicked off (or maybe wait 5 minutes to coalesce multiple patches if there are)

2. This process checks out master, and runs pgindent on it

3. When done, this gets committed to a new branch (or just overwrites an existing branch of course, we don't need to maintain history here) like "master-indented". This branch can be in a different repo, but one that starts out as a clone of the main one

4. A committer (any committer) can then on regular basis examine the differences by fetch + diff. If they're happy with it, cherry pick it in. If not, figure out what needs to be done to adjust it.

Step 4 can be done at whatever interval we prefer, and we can have something to nag us if head has been "off-indent" for too long.

This would be the backup for things that aren't indented during patch commit, not other things.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

13 August 2020, 19:36:01

On Wed, Aug 12, 2020 at 7:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not in favor of unsupervised pgindent runs, really.  It can do a lot
> of damage to code that was written without thinking about it --- in
> particular, it'll make a hash of comment blocks that were manually
> formatted and not protected with dashes.

No committer should be committing code without thinking about
pgindent. If some are, they need to up their game.

I am not sure whether weekly or after-every-commit pgindent runs is a
good idea, but I think we should try to do it once a month or so. It's
too annoying otherwise. I could go either way on the question of
automation.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

13 August 2020, 19:39:29

On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:
> So, in our world, wouldn't this translate to 'make cfbot complain'?

This seems like it would be useful, but we'd have to figure out what
to do about typedefs.list. If the patch is indented with the current
one (which is auto-generated by the entire build farm, remember) it's
likely to mess up a patch that's otherwise properly formatted. We'd
either need to insist that people include updates to typedefs.list in
the patch, or else have the cfbot take a stab at doing those updates
itself.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

13 August 2020, 19:47:28

Robert Haas <robertmhaas@gmail.com> writes:
> I am not sure whether weekly or after-every-commit pgindent runs is a
> good idea, but I think we should try to do it once a month or so. It's
> too annoying otherwise. I could go either way on the question of
> automation.

The traditional reason for not doing pgindent too often has been that
it'd cause more work for people who have to rebase their patches over
pgindent's results.  If we want to do it more often, then in order to
respond to that concern, I think we need to do it really often ---
not necessarily quite continuously, but often enough that pgindent
is only changing recently-committed code.  In this way, it'd be likely
that anyone with a patch touching that same code would only need to
rebase once not twice.  The approaches involving an automated run
give a guarantee of that, otherwise we don't have a guarantee; but
as long as it's not many days delay I think it wouldn't be bad.

Intervals on the order of a month seem likely to be the worst of
both worlds from this standpoint --- too long for people to wait
before rebasing their patch, yet short enough that they'd have
to do so repeatedly.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

13 August 2020, 19:50:01

On Thu, Aug 13, 2020 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The traditional reason for not doing pgindent too often has been that
> it'd cause more work for people who have to rebase their patches over
> pgindent's results.  If we want to do it more often, then in order to
> respond to that concern, I think we need to do it really often ---
> not necessarily quite continuously, but often enough that pgindent
> is only changing recently-committed code.  In this way, it'd be likely
> that anyone with a patch touching that same code would only need to
> rebase once not twice.  The approaches involving an automated run
> give a guarantee of that, otherwise we don't have a guarantee; but
> as long as it's not many days delay I think it wouldn't be bad.
>
> Intervals on the order of a month seem likely to be the worst of
> both worlds from this standpoint --- too long for people to wait
> before rebasing their patch, yet short enough that they'd have
> to do so repeatedly.

Yeah, I get the point. It depends somewhat on how often you think
people will rebase. The main thing against more frequent pgindent runs
is that it clutters the history. If done manually, it's also a lot of
work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

13 August 2020, 20:06:02

Hi,

On 2020-08-13 12:50:16 -0400, Tom Lane wrote:
> Stephen Frost <sfrost@snowman.net> writes:
> > So, in our world, wouldn't this translate to 'make cfbot complain'?
> 
> > I'm definitely a fan of the idea of having cfbot flag these and then we
> > maybe get to a point where it's not the committers dealing with fixing
> > patches that weren't pgindent'd properly, it's the actual patch
> > submitters being nagged about it...
> 
> Meh.  Asking all submitters to install pgindent is a bit of a burden.

+1. We could improve on that by slurping it into src/tools though. If
there were a 'make patchprep' target, it'd be a lot more realistic.


But even so, it'd robably further inrease the rate of needing to
constantly rebase lingering patches if cfbot considered indentation
issues failures. E.g. because of typedefs.list updates etc. So I'm
against that, even if we had a patchprep target that didn't need
external tools.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Stephen Frost

Date:

13 August 2020, 20:16:28

Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Aug 13, 2020 at 12:30 PM Stephen Frost <sfrost@snowman.net> wrote:
> > So, in our world, wouldn't this translate to 'make cfbot complain'?
>
> This seems like it would be useful, but we'd have to figure out what
> to do about typedefs.list. If the patch is indented with the current
> one (which is auto-generated by the entire build farm, remember) it's
> likely to mess up a patch that's otherwise properly formatted. We'd
> either need to insist that people include updates to typedefs.list in
> the patch, or else have the cfbot take a stab at doing those updates
> itself.

For my 2c, anyway, I like the idea of having folks update the typedefs
themselves when they've got a patch that needs a new typedef to be
indented correctly.  Having cfbot try to do that seems unlikely to work
well.

I also didn't mean to imply that we'd push back and ask for a rebase due
to indentation changes, but at the same time, I question if it's really
that realistic a concern- either whomever posted the patch ran pgindent
on it, or they didn't, and I doubt cfbot's check of that would change
without there being a conflict between the patch and something that got
committed anyway.

I also disagree that it's that much of a burden to ask people who are
already hacking on PG to install pgindent.

All that said, seems that others feel differently and while I still
think it's a pretty reasonable idea to have cfbot check, if no one
agrees with me, that's fine too.  Having the pre-commit hook would help
with the downstream issue of pgindent pain from unrelated incorrect
indentation, so at least dealing with the patch author not properly
indenting to start with would be just on the bits the patch is already
modifying, which is a lot better.

Thanks,

Stephen

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Alvaro Herrera

Date:

14 August 2020, 20:22:35

On 2020-Aug-13, Stephen Frost wrote:

> For my 2c, anyway, I like the idea of having folks update the typedefs
> themselves when they've got a patch that needs a new typedef to be
> indented correctly.

Well, let's for starters encourage committers to update typedefs.
Personally I've stayed away from it for each commit just because we've
historically not done it, but I can easily change that.  Plus, it cannot
harm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: run pgindent on a regular basis / scripted manner

From

Alvaro Herrera

Date:

14 August 2020, 20:26:51

On 2020-Aug-13, Magnus Hagander wrote:

> That is:
> 1. Whenever a patch is pushed on master on the main repo a process kicked
> off (or maybe wait 5 minutes to coalesce multiple patches if there are)
> 2. This process checks out master, and runs pgindent on it
> 3. When done, this gets committed to a new branch (or just overwrites an
> existing branch of course, we don't need to maintain history here) like
> "master-indented". This branch can be in a different repo, but one that
> starts out as a clone of the main one
> 4. A committer (any committer) can then on regular basis examine the
> differences by fetch + diff. If they're happy with it, cherry pick it in.
> If not, figure out what needs to be done to adjust it.

Sounds good -- for branch master.

Yesterday I tried to indent some patch across all branches, only to
discover that I'm lacking the pg_bsd_indent necessary for the older
ones.  I already have two, but apparently I'd need *four* different
versions with current branches (1.3, 2.0, 2.1, 2.1.1)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: run pgindent on a regular basis / scripted manner

From

Amit Kapila

Date:

15 August 2020, 03:34:23

On Sat, Aug 15, 2020 at 1:57 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> On 2020-Aug-13, Magnus Hagander wrote:
>
> > That is:
> > 1. Whenever a patch is pushed on master on the main repo a process kicked
> > off (or maybe wait 5 minutes to coalesce multiple patches if there are)
> > 2. This process checks out master, and runs pgindent on it
> > 3. When done, this gets committed to a new branch (or just overwrites an
> > existing branch of course, we don't need to maintain history here) like
> > "master-indented". This branch can be in a different repo, but one that
> > starts out as a clone of the main one
> > 4. A committer (any committer) can then on regular basis examine the
> > differences by fetch + diff. If they're happy with it, cherry pick it in.
> > If not, figure out what needs to be done to adjust it.
>
> Sounds good -- for branch master.
>
> Yesterday I tried to indent some patch across all branches, only to
> discover that I'm lacking the pg_bsd_indent necessary for the older
> ones.  I already have two, but apparently I'd need *four* different
> versions with current branches (1.3, 2.0, 2.1, 2.1.1)
>

FWIW, for back-branches, I just do similar to what Tom said above  [1]
("My own habit when back-patching has been to indent the HEAD patch
per-current-rules and then preserve that layout as much as possible in
the back branches"). If we want we can maintain all the required
versions of pg_bsd_indent but as of now, I am not doing so and thought
that following some approximation rule (do it for HEAD and try my best
to maintain the layout for back-branches) is good enough.

[1] - https://www.postgresql.org/message-id/397020.1597291716%40sss.pgh.pa.us

-- 
With Regards,
Amit Kapila.

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

15 August 2020, 11:47:41

On 2020-08-13 00:34, Andres Freund wrote:
> I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> most of the hunks were entirely unrelated. Despite the development
> window for 14 having only relatively recently opened. Based on my
> experience it tends to get worse over time.

Do we have a sense of why poorly-indented code gets committed?  I think 
some of the indentation rules are hard to follow manually.  (pgperltidy 
is worse.)

Also, since pgindent gets run eventually anyway, it's not really that 
important to get the indentation right the first time.  I suspect the 
goal of most authors and committers is to write readable code rather 
than to divine the exact pgindent output.

I think as a start, we could just issue a guidelines that all committed 
code should follow pgindent.  That has never really been a guideline, so 
it's not surprising that it's not followed.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

15 August 2020, 16:59:13

Hi,

On 2020-08-15 13:47:41 +0200, Peter Eisentraut wrote:
> On 2020-08-13 00:34, Andres Freund wrote:
> > I e.g. just re-indented patch 0001 of my GetSnapshotData() series and
> > most of the hunks were entirely unrelated. Despite the development
> > window for 14 having only relatively recently opened. Based on my
> > experience it tends to get worse over time.
> 
> Do we have a sense of why poorly-indented code gets committed?  I think some
> of the indentation rules are hard to follow manually.  (pgperltidy is
> worse.)
> 
> Also, since pgindent gets run eventually anyway, it's not really that
> important to get the indentation right the first time.  I suspect the goal
> of most authors and committers is to write readable code rather than to
> divine the exact pgindent output.

One thing is that some here are actively against manually adding entries
to typedefs.list. Which then means that code gets oddly indented if you
use pgindent.  I personally try to make the predictable updates to
typedefs.list, which then at least allows halfway sensibly indenting my
own changes.

> I think as a start, we could just issue a guidelines that all committed code
> should follow pgindent.  That has never really been a guideline, so it's not
> surprising that it's not followed.

Without a properly indented baseline that's hard to do, because it'll
cause damage all over. So I don't think we easily can start just there -
we'd first need to re-indent everything.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

15 August 2020, 17:44:34

Andres Freund <andres@anarazel.de> writes:
> One thing is that some here are actively against manually adding entries
> to typedefs.list.

I've been of the opinion that it's pointless to do so under the current
regime where (a) only a few people do that and (b) we only officially
re-indent once a year anyway.  When I want to manually run pgindent,
I always pull down a fresh typedefs.list from the buildfarm, which is
reasonably up-to-date regardless of what people added or didn't add,
and then add any new typedefs from my current patch to that out-of-tree
copy.

Now, if we switch to a regime where we're trying to keep the tree in
more nearly correctly-indented shape all the time, it would make sense
to revisit that.  I'm not saying that it's unreasonable to want to have
the in-tree typedefs.list track reality more closely --- only that doing
so in a half-baked way won't be very helpful.

>> I think as a start, we could just issue a guidelines that all committed code
>> should follow pgindent.  That has never really been a guideline, so it's not
>> surprising that it's not followed.

> Without a properly indented baseline that's hard to do, because it'll
> cause damage all over. So I don't think we easily can start just there -
> we'd first need to re-indent everything.

Well, we can certainly do a tree-wide re-indent anytime we're ready.
I doubt it would be very painful right now, with so little new work
since the last run.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 August 2020, 18:05:24

Bruce Momjian <bruce@momjian.us> writes:
> On Sat, Aug 15, 2020 at 01:44:34PM -0400, Tom Lane wrote:
>> Well, we can certainly do a tree-wide re-indent anytime we're ready.
>> I doubt it would be very painful right now, with so little new work
>> since the last run.

> Uh, I thought Tom was saying we need to reindent all branches, which
> would be a big change for those tracking forks.

No, I'm not for reindenting the back branches in general.  There was
some discussion about whether to try to indent back-patched patches
to meet the conventions of the specific back branch, but I doubt
that that's worth troubling over.  I think we really only care
about whether HEAD is fully consistently indented.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 August 2020, 19:25:52

Andres Freund <andres@anarazel.de> writes:
> On 2020-08-15 13:44:34 -0400, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> One thing is that some here are actively against manually adding entries
>>> to typedefs.list.

>> I've been of the opinion that it's pointless to do so under the current
>> regime where (a) only a few people do that and (b) we only officially
>> re-indent once a year anyway.  When I want to manually run pgindent,
>> I always pull down a fresh typedefs.list from the buildfarm, which is
>> reasonably up-to-date regardless of what people added or didn't add,
>> and then add any new typedefs from my current patch to that out-of-tree
>> copy.

> Well, properly indenting new code still is worthwhile. And once you go
> through the trouble of adding the typedefs locally, I don't really see
> the reason not to also include them in the commit.

Yeah, I'm quite religious about making sure my commits have been through
pgindent already (mainly to reduce subsequent "git blame" noise).
However, relying on manual updates to the in-tree typedefs.list only
works if every committer is equally religious about it.  They're not;
else we'd not be having this discussion.  The workflow I describe above
is not dependent on how careful everybody else is, which is why I
prefer it.

I think that the main new idea that's come out of this thread so far
is that very frequent reindents would be as good, maybe better, as
once-a-year reindents in terms of the amount of rebasing pain they
cause for not-yet-committed patches.  If we can fix it so that any
mis-indented commit is quickly corrected, then rebasing would only be
needed in places that were changed anyway.  So it seems like that
would be OK as a new project policy if we can make it happen.

However, I don't see any way to make it happen like that without
more-or-less automated reindents and typedefs.list updates,
and that remains a bit scary.

I did just have an idea that seems to ameliorate the scariness
a good bit: allow the reindent bot to auto-commit its results
only if the only lines it's changing are ones that were touched
by commits since the last auto-reindent.  Otherwise punt and ask
a human to review the results.  Not sure how hard that is to
implement, though.

Another good safety check would be to not proceed unless the latest
typedefs list available from the buildfarm is newer than the last
commit --- then we won't mess up recent commits whose typedefs are
not in the list yet.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

21 August 2020, 16:50:40

On Fri, Aug 14, 2020 at 10:26 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2020-Aug-13, Magnus Hagander wrote:

> That is:
> 1. Whenever a patch is pushed on master on the main repo a process kicked
> off (or maybe wait 5 minutes to coalesce multiple patches if there are)
> 2. This process checks out master, and runs pgindent on it
> 3. When done, this gets committed to a new branch (or just overwrites an
> existing branch of course, we don't need to maintain history here) like
> "master-indented". This branch can be in a different repo, but one that
> starts out as a clone of the main one
> 4. A committer (any committer) can then on regular basis examine the
> differences by fetch + diff. If they're happy with it, cherry pick it in.
> If not, figure out what needs to be done to adjust it.

Sounds good -- for branch master.

So mostly for testing, I've set up a job that does this.

Basically it runs every 15 minutes and if there is a new commit on master it will rebase onto the latest master and run pgindent on it. This then gets pushed up to a separate repo (postgresql-pgindent.git on git.postgresql.org), and can be viewed there.

To see the current state of pgindent, view:

https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=commitdiff;h=master-pgindent

(or decorate as wanted to see for example a raw patch format)

If a committer wants to use it directly, just "git remote add" the postgresql-pgindent.git and then cherry-pick the branch tip into your own repository, and push. Well, actually that will right now fail because the commit is made by "Automatic pgindent" which is not an approved committer, but if we want to do this as a more regular thing, we can certainly fix that.

Note that the only thing the job does is run pgindent. It does not attempt to do anything with the typedef list at this point.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

20 January 2023, 09:43:50

As a not very frequent submitter, on all the patches that I submit I
keep running into this problem. I have two open patchsets for
libpq[1][2] both of which currently include the same initial "run
pgindent" patch in addition to the actual patch, just so I can
actually run it on my own patch because a9e9a9f caused formatting to
totally be off in the libpq directory. And there's lots of other
changes that pgindent wants to make, which are visible on the job that
Magnus has set up.

To me a master branch that pgindent never complains about sounds
amazing! And I personally think rejection of unindented pushes and
cfbot complaining about unindented patches would be a very good thing,
because that seems to be the only solution that could achieve that.

Having cfbot complain also doesn't sound like a crazy burden for
submitters. Many open-source projects have CI complaining if code
formatting does not pass automatic formatting tools. As long as there
is good documentation on how to install and run pgindent I don't think
it should be a big problem. A link to those docs could even be
included in the failing CI job its error message. A pre-commit hook
that submitters/committers could install would be super useful too.
Since right now I sometimes forget to run pgindent, especially since
there's no editor integration (that I know of) for pgindent.

Side-question: What's the reason why pgindent is used instead of some
more "modern" code formatter that doesn't require keeping
typedefs.list up to date for good looking output? (e.g. uncrustify or
clang-format) Because that would also allow for easy editor
integration.

[1]: https://commitfest.postgresql.org/41/3511/
[2]: https://commitfest.postgresql.org/41/3679/

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

20 January 2023, 16:42:33

On Fri, Jan 20, 2023 at 10:43:50AM +0100, Jelte Fennema wrote:
> Side-question: What's the reason why pgindent is used instead of some
> more "modern" code formatter that doesn't require keeping
> typedefs.list up to date for good looking output? (e.g. uncrustify or
> clang-format) Because that would also allow for easy editor
> integration.

Good question.  Our last big pgindent dicussion was in 2017, where I
said:

    https://www.postgresql.org/message-id/flat/20170612213525.GA4074%40momjian.us#a96eac96c147ebcc1de86fe2356a160d
    
    Understood.  You would think that with the number of open-source
    programs written in C that there would be more interest in C formatting
    tools.  Is the Postgres community the only ones with specific
    requirements, or is it just that we settled on an older tool and can't
    easily change?  I have reviewed the C formatting options a few times
    over the years and every time the other options were worse than what we
    had.

We also discussed it in 2011, and this email was key for me:

    https://www.postgresql.org/message-id/flat/201106220218.p5M2InB08144%40momjian.us#096cbcf02cb58c7d6c49bc79d2c79317
    
    I am excited Andrew has done this.  It has been on my TODO list for a
    while --- I was hoping someday we could switch to GNU indent but gave up
    after the GNU indent report from Greg Stark that exactly matched my
    experience years ago:
    
        http://archives.postgresql.org/pgsql-hackers/2011-04/msg01436.php
    
    Basically, GNU indent has new bugs, but bugs that are harder to work
    around than the BSD indent bugs.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

20 January 2023, 17:09:05

Jelte Fennema <postgres@jeltef.nl> writes:
> To me a master branch that pgindent never complains about sounds
> amazing! And I personally think rejection of unindented pushes and
> cfbot complaining about unindented patches would be a very good thing,
> because that seems to be the only solution that could achieve that.

The core problem here is that requiring that would translate to
requiring every code contributor to have a working copy of pg_bsd_indent.
Maybe that's not a huge lift, but it'd be YA obstacle to new contributors,
and we don't need any more of those.

Yeah, if we switched to some other tool maybe we could reduce the size
of that problem.  But as Bruce replied, we've not found another one that
(a) can be coaxed to make output comparable to what we're accustomed to
and (b) seems decently well maintained.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

20 January 2023, 17:58:06

Hi,

On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
> Jelte Fennema <postgres@jeltef.nl> writes:
> > To me a master branch that pgindent never complains about sounds
> > amazing! And I personally think rejection of unindented pushes and
> > cfbot complaining about unindented patches would be a very good thing,
> > because that seems to be the only solution that could achieve that.
> 
> The core problem here is that requiring that would translate to
> requiring every code contributor to have a working copy of pg_bsd_indent.

Wouldn't just every committer suffice?


> Maybe that's not a huge lift, but it'd be YA obstacle to new contributors,
> and we don't need any more of those.
> 
> Yeah, if we switched to some other tool maybe we could reduce the size
> of that problem.  But as Bruce replied, we've not found another one that
> (a) can be coaxed to make output comparable to what we're accustomed to
> and (b) seems decently well maintained.

One question around this is how much change we'd accept. clang-format for
example is well maintained and can get somewhat close to our style - but
there are things that can't easily be approximated.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

20 January 2023, 18:19:47

Andres Freund <andres@anarazel.de> writes:
> On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
>> The core problem here is that requiring that would translate to
>> requiring every code contributor to have a working copy of pg_bsd_indent.

> Wouldn't just every committer suffice?

Not if we have cfbot complaining about it.

(Another problem here is that there's a sizable subset of committers
who clearly just don't care, and I'm not sure we can convince them to.)

>> Yeah, if we switched to some other tool maybe we could reduce the size
>> of that problem.  But as Bruce replied, we've not found another one that
>> (a) can be coaxed to make output comparable to what we're accustomed to
>> and (b) seems decently well maintained.

> One question around this is how much change we'd accept. clang-format for
> example is well maintained and can get somewhat close to our style - but
> there are things that can't easily be approximated.

If somebody wants to invest the effort in seeing how close we can get
and what the remaining delta would look like, we could have a discussion
about whether it's an acceptable change.  I don't think anyone has
tried with clang-format.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

21 January 2023, 13:26:05

On 2023-01-20 Fr 13:19, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
>>> The core problem here is that requiring that would translate to
>>> requiring every code contributor to have a working copy of pg_bsd_indent.
>> Wouldn't just every committer suffice?
> Not if we have cfbot complaining about it.
>
> (Another problem here is that there's a sizable subset of committers
> who clearly just don't care, and I'm not sure we can convince them to.)

I think we could do better with some automation tooling for committers
here. One low-risk and simple change would be to provide a
non-destructive mode for pgindent that would show you the changes if any
it would make. That could be worked into a git pre-commit hook that
committers could deploy. I can testify to the usefulness of such hooks -
I have one that while not perfect has saved me on at least two occasions
from forgetting to bump the catalog version.

I'll take a look at fleshing this out, for my own if no-one else's use.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

21 January 2023, 15:00:31

On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:
> On 2023-01-20 Fr 13:19, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
>>>> The core problem here is that requiring that would translate to
>>>> requiring every code contributor to have a working copy of pg_bsd_indent.
>>> Wouldn't just every committer suffice?
>> Not if we have cfbot complaining about it.
>>
>> (Another problem here is that there's a sizable subset of committers
>> who clearly just don't care, and I'm not sure we can convince them to.)
>
> I think we could do better with some automation tooling for committers
> here. One low-risk and simple change would be to provide a
> non-destructive mode for pgindent that would show you the changes if any
> it would make. That could be worked into a git pre-commit hook that
> committers could deploy. I can testify to the usefulness of such hooks -
> I have one that while not perfect has saved me on at least two occasions
> from forgetting to bump the catalog version.
>
> I'll take a look at fleshing this out, for my own if no-one else's use.
>
>

Here's a quick patch for this. I have it in mind to use like this in a
pre-commit hook:

    # only do this on master
    test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0

    src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \

      { echo "Need a pgindent run" >&2 ; exit 1; }


The committer could then run

    src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`

to see what changes it thinks are needed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

21 January 2023, 15:02:15

On 2023-01-21 Sa 10:00, Andrew Dunstan wrote:
> On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:
>> On 2023-01-20 Fr 13:19, Tom Lane wrote:
>>> Andres Freund <andres@anarazel.de> writes:
>>>> On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
>>>>> The core problem here is that requiring that would translate to
>>>>> requiring every code contributor to have a working copy of pg_bsd_indent.
>>>> Wouldn't just every committer suffice?
>>> Not if we have cfbot complaining about it.
>>>
>>> (Another problem here is that there's a sizable subset of committers
>>> who clearly just don't care, and I'm not sure we can convince them to.)
>> I think we could do better with some automation tooling for committers
>> here. One low-risk and simple change would be to provide a
>> non-destructive mode for pgindent that would show you the changes if any
>> it would make. That could be worked into a git pre-commit hook that
>> committers could deploy. I can testify to the usefulness of such hooks -
>> I have one that while not perfect has saved me on at least two occasions
>> from forgetting to bump the catalog version.
>>
>> I'll take a look at fleshing this out, for my own if no-one else's use.
>>
>>
> Here's a quick patch for this. I have it in mind to use like this in a
> pre-commit hook:
>
>     # only do this on master
>     test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0
>
>     src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \
>
>       { echo "Need a pgindent run" >&2 ; exit 1; }
>
>
> The committer could then run
>
>     src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`
>
> to see what changes it thinks are needed.
>
>
This time with patch.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent_silent_plus_showdiff.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

21 January 2023, 15:24:02

On 2023-01-21 Sa 10:02, Andrew Dunstan wrote:
> On 2023-01-21 Sa 10:00, Andrew Dunstan wrote:
>> On 2023-01-21 Sa 08:26, Andrew Dunstan wrote:
>>> On 2023-01-20 Fr 13:19, Tom Lane wrote:
>>>> Andres Freund <andres@anarazel.de> writes:
>>>>> On 2023-01-20 12:09:05 -0500, Tom Lane wrote:
>>>>>> The core problem here is that requiring that would translate to
>>>>>> requiring every code contributor to have a working copy of pg_bsd_indent.
>>>>> Wouldn't just every committer suffice?
>>>> Not if we have cfbot complaining about it.
>>>>
>>>> (Another problem here is that there's a sizable subset of committers
>>>> who clearly just don't care, and I'm not sure we can convince them to.)
>>> I think we could do better with some automation tooling for committers
>>> here. One low-risk and simple change would be to provide a
>>> non-destructive mode for pgindent that would show you the changes if any
>>> it would make. That could be worked into a git pre-commit hook that
>>> committers could deploy. I can testify to the usefulness of such hooks -
>>> I have one that while not perfect has saved me on at least two occasions
>>> from forgetting to bump the catalog version.
>>>
>>> I'll take a look at fleshing this out, for my own if no-one else's use.
>>>
>>>
>> Here's a quick patch for this. I have it in mind to use like this in a
>> pre-commit hook:
>>
>>     # only do this on master
>>     test `git rev-parse --abbrev-ref HEAD` = "master" || exit 0
>>
>>     src/tools/pgindent/pg_indent --silent `git diff --cached --name-only` || \
>>
>>       { echo "Need a pgindent run" >&2 ; exit 1; }
>>
>>
>> The committer could then run
>>
>>     src/tools/pgindent/pg_indent --show-diff `git diff --cached --name-only`
>>
>> to see what changes it thinks are needed.
>>
>>
> This time with patch.
>
>
... with typo fixed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent_silent_plus_showdiff-2.patch

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

21 January 2023, 16:10:08

Andrew Dunstan <andrew@dunslane.net> writes:
>>> I think we could do better with some automation tooling for committers
>>> here. One low-risk and simple change would be to provide a
>>> non-destructive mode for pgindent that would show you the changes if any
>>> it would make. That could be worked into a git pre-commit hook that
>>> committers could deploy. I can testify to the usefulness of such hooks -
>>> I have one that while not perfect has saved me on at least two occasions
>>> from forgetting to bump the catalog version.

That sounds like a good idea from here.  I do not think we want a
mandatory commit filter, but if individual committers care to work
this into their process in some optional way, great!  I can think
of ways I'd use it during patch development, too.

(One reason not to want a mandatory filter is that you might wish
to apply pgindent as a separate commit, so that you can then
put that commit into .git-blame-ignore-revs.  This could be handy
for example when a patch needs to change the nesting level of a lot
of pre-existing code, without making changes in it otherwise.)

>> This time with patch.

> ... with typo fixed.

Looks reasonable, but you should also update
src/tools/pgindent/pgindent.man, which AFAICT is our only
documentation for pgindent switches.  (Is it time for a
--help option in pgindent?)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

21 January 2023, 16:24:01

... btw, can we get away with making the diff run be "diff -upd"
not just "diff -u"?  I find diff output for C files noticeably
more useful with those options, but I'm unsure about their
portability.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

21 January 2023, 17:30:32

On Fri, Jan 20, 2023 at 10:43:50AM +0100, Jelte Fennema wrote:
> Side-question: What's the reason why pgindent is used instead of some
> more "modern" code formatter that doesn't require keeping
> typedefs.list up to date for good looking output? (e.g. uncrustify or
> clang-format) Because that would also allow for easy editor
> integration.

One reason the typedef list is required is a quirk of the C syntax. 
Most languages have a lexer/scanner, which tokenizes, and a parser,
which parses.  The communication is usually one-way, lexer to parser. 
For C, typedefs require the parser to feed new typedefs back into the
lexer:

    http://calculist.blogspot.com/2009/02/c-typedef-parsing-problem.html

BSD indent doesn't have that feedback mechanism, probably because it
doesn't fully parse the C file.  Therefore, we have to supply typedefs
manually, and for Postgres we pull them from debug-enabled binaries in
our buildfarm.  The problem with that is you often import typedefs from
system headers, and the typedefs apply to all C files, not just the ones
were the typdefs are visible.

I don't see uncrustify or clang-format supporting typedef lists so maybe
they implemented this feedback loop.  It would be good to see if we can
get either of these tools to match our formatting.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

21 January 2023, 20:21:03

On Sat, Jan 21, 2023 at 9:30 AM Bruce Momjian <bruce@momjian.us> wrote:
> I don't see uncrustify or clang-format supporting typedef lists so maybe
> they implemented this feedback loop.  It would be good to see if we can
> get either of these tools to match our formatting.

I personally use clang-format for Postgres development work, since it
integrates nicely with my text editor, and can be configured to
produce approximately the same result as pgindent (certainly good
enough when working on a patch that's still far from a committable
state). I'm fairly sure that clang-format has access to a full AST
from the clang compiler, which is the ideal approach - at least in
theory.

In practice this approach tends to run into problems when the relevant
AST isn't available. For example, if there's code that only builds on
Windows, maybe it won't work at all (at least on my Linux system).
This doesn't really bother me currently, since I only rely on
clang-format as a first pass sort of thing. Maybe I could figure out a
better way to deal with such issues, but right now I don't have much
incentive to.

Another advantage of clang-format is that it's a known quantity. For
example there is direct support for it built into meson, with bells
and whistles such as CI support:

https://mesonbuild.com/Code-formatting.html

My guess is that moving to clang-format would require giving up some
flexibility, but getting much better integration with text editors and
tools like meson in return. It would probably make it practical to
have much stronger rules about how committed code must be indented --
rules that are practical, and can actually be enforced. That trade-off
seems likely to be worth it in my view, though it's not something that
I feel too strongly about.

-- 
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

21 January 2023, 20:59:45

Peter Geoghegan <pg@bowt.ie> writes:
> In practice this approach tends to run into problems when the relevant
> AST isn't available. For example, if there's code that only builds on
> Windows, maybe it won't work at all (at least on my Linux system).

Hmm, that could be a deal-breaker.  It's not going to be acceptable
to have to pgindent different parts of the system on different platforms
... at least not unless we can segregate them on the file level, and
even that would have a large PITA factor.

Still, we won't know unless someone makes a serious experiment with it.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

21 January 2023, 22:05:41

On Sat, Jan 21, 2023 at 12:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, that could be a deal-breaker.  It's not going to be acceptable
> to have to pgindent different parts of the system on different platforms
> ... at least not unless we can segregate them on the file level, and
> even that would have a large PITA factor.

It's probably something that could be worked around. My remarks are
based on some dim memories of dealing with the tool before I arrived
at a configuration that works well enough for me. Importantly,
clang-format doesn't require you to futz around with Makefiles or
objdump or anything like that -- that's a huge plus. It doesn't seem
to impose any requirements on how I build Postgres at all (I generally
use gcc, not clang).

Even if these kinds of issues proved to be a problem for the person
tasked with running clang-format against the whole tree periodically,
they still likely wouldn't affect most of us. It's quite convenient to
use clang-format from an editor -- it can be invoked very
incrementally, against a small range of lines at a time. It's pretty
much something that I can treat like the built-in indent for my
editor. It's vastly different to the typical pgindent workflow.

> Still, we won't know unless someone makes a serious experiment with it.

There is one thing about clang-format that I find mildly infuriating:
it can indent function declarations in the way that I want it to, and
it can indent variable declarations in the way that I want it to. It
just can't do both at the same time, because they're both controlled
by AlignConsecutiveDeclarations.

Of course the way that I want to do things is (almost by definition)
the pgindent way, at least right now -- it's not necessarily about my
fixed preferences (though it can be hard to tell!). It's really not
surprising that clang-format cannot quite perfectly simulate pgindent.
How flexible can we be about stuff like that? Obviously there is no
clear answer right now.

-- 
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

21 January 2023, 22:20:35

Peter Geoghegan <pg@bowt.ie> writes:
> Of course the way that I want to do things is (almost by definition)
> the pgindent way, at least right now -- it's not necessarily about my
> fixed preferences (though it can be hard to tell!). It's really not
> surprising that clang-format cannot quite perfectly simulate pgindent.
> How flexible can we be about stuff like that? Obviously there is no
> clear answer right now.

I don't feel wedded to every last detail of what pgindent does (and
especially not the bugs).  But I think if the new tool is not a pretty
close match we'll be in for years of back-patching pain.  We have made
changes in pgindent itself in the past, and the patching consequences
weren't *too* awful, but the changes weren't very big either.

As I said upthread, this is really impossible to answer without a
concrete proposal of how to configure clang-format and a survey of
what diffs we'd wind up with.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

21 January 2023, 22:43:12

Hi,

On 2023-01-21 14:05:41 -0800, Peter Geoghegan wrote:
> On Sat, Jan 21, 2023 at 12:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Hmm, that could be a deal-breaker.  It's not going to be acceptable
> > to have to pgindent different parts of the system on different platforms
> > ... at least not unless we can segregate them on the file level, and
> > even that would have a large PITA factor.

Unless I miss something, I don't think clang-format actually does that level
of C parsing - you can't pass include paths etc, so it really can't.


> It's probably something that could be worked around. My remarks are
> based on some dim memories of dealing with the tool before I arrived
> at a configuration that works well enough for me.

Could you share your .clang-format?



> > Still, we won't know unless someone makes a serious experiment with it.
> 
> There is one thing about clang-format that I find mildly infuriating:
> it can indent function declarations in the way that I want it to, and
> it can indent variable declarations in the way that I want it to. It
> just can't do both at the same time, because they're both controlled
> by AlignConsecutiveDeclarations.
> 
> Of course the way that I want to do things is (almost by definition)
> the pgindent way, at least right now -- it's not necessarily about my
> fixed preferences (though it can be hard to tell!). It's really not
> surprising that clang-format cannot quite perfectly simulate pgindent.
> How flexible can we be about stuff like that? Obviously there is no
> clear answer right now.

I personally find the current indentation of variables assignment deeply
unhelpful - but changing it would be a very noisy change.


Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

21 January 2023, 22:47:00

Hi,

On 2023-01-21 17:20:35 -0500, Tom Lane wrote:
> I don't feel wedded to every last detail of what pgindent does (and
> especially not the bugs).  But I think if the new tool is not a pretty
> close match we'll be in for years of back-patching pain.  We have made
> changes in pgindent itself in the past, and the patching consequences
> weren't *too* awful, but the changes weren't very big either.

Perhaps we could backpatch formatting changes in a way that doesn't
inconvenience forks too much. One way to deal with such changes is to

1) revert the re-indent commits in $backbranch
2) merge $backbranch-with-revert into $forkbranch
3) re-indent $forkbranch

After that future changes should be mergable again.

Certainly doesn't do away with the pain entirely, but it does make it perhaps
bearable

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

21 January 2023, 23:32:45

On Sat, Jan 21, 2023 at 2:43 PM Andres Freund <andres@anarazel.de> wrote:
> Unless I miss something, I don't think clang-format actually does that level
> of C parsing - you can't pass include paths etc, so it really can't.

It's hard to keep track of, since I also use clangd, which is
influenced by .clang-format for certain completions. It clearly does
plenty of stuff that requires an AST, since it requires a
compile_commands.json. You're the LLVM committer, not me.

Attached is my .clang-format, since you asked for it. It was
originally based on stuff that both you and Peter E posted several
years back, I believe. Plus the timescaledb one in one or two places.
I worked a couple of things out through trial and error. It's
relatively hard to follow the documentation, and there have been
features added to newer LLVM versions.

-- 
Peter Geoghegan

Attachment

clang-format

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

21 January 2023, 23:39:25

Attached are a .clang-format file and an uncrustify.cfg file that I
whipped up to match the current style at least somewhat.

I was unable to get either of them to produce the weird alignment of
declarations that pgindent outputs. Maybe that's just because I don't
understand what this alignment is supposed to do. Because to me the
current alignment seems completely random most of the time (and like
Andres said also not very unhelpful). For clang-format you should use
at least clang-format 15, otherwise it has some bugs in the alignment
logic.

One thing that clang-format really really wants to change in the
codebase, is making it consistent on what to do when
arguments/parameters don't fit on a single line: You can either choose
to minimally pack everything on the minimum number of lines, or to put
all arguments on their own separate line. Uncrustify is a lot less
strict about that and will leave most things as they currently are.

Overall it seems that uncrustify is a lot more customizable, whether
that's a good thing is debatable. Personally I think the fewer
possible debates I have about codestyle the better, so I usually like
the tools with fewer options better. But if the customizability allows
for closer matching of existing codestyle then it might be worth the
extra debates and effort in customization in this case.

> 1) revert the re-indent commits in $backbranch
> 2) merge $backbranch-with-revert into $forkbranch
> 3) re-indent $forkbranch

The git commands to achieve this are something like the following.
I've used such git commands in the past to make big automatic
refactors much easier to get in and it has worked quite well so far.

git checkout forkbranch
git rebase {commit-before-re-indent}
# now, get many merge conflicts
git rebase {re-indent-commit}
# keep your own changes (its --theirs instead of --ours because rebase
flips it around)
git checkout --theirs .
run-new-reindent
git add .
git rebase --continue

On Sat, 21 Jan 2023 at 23:47, Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-01-21 17:20:35 -0500, Tom Lane wrote:
> > I don't feel wedded to every last detail of what pgindent does (and
> > especially not the bugs).  But I think if the new tool is not a pretty
> > close match we'll be in for years of back-patching pain.  We have made
> > changes in pgindent itself in the past, and the patching consequences
> > weren't *too* awful, but the changes weren't very big either.
>
> Perhaps we could backpatch formatting changes in a way that doesn't
> inconvenience forks too much. One way to deal with such changes is to
>
> 1) revert the re-indent commits in $backbranch
> 2) merge $backbranch-with-revert into $forkbranch
> 3) re-indent $forkbranch
>
> After that future changes should be mergable again.
>
> Certainly doesn't do away with the pain entirely, but it does make it perhaps
> bearable
>
> Greetings,
>
> Andres Freund

Attachment

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

22 January 2023, 03:19:15

On Sat, Jan 21, 2023 at 3:39 PM Jelte Fennema <postgres@jeltef.nl> wrote:
> I was unable to get either of them to produce the weird alignment of
> declarations that pgindent outputs. Maybe that's just because I don't
> understand what this alignment is supposed to do. Because to me the
> current alignment seems completely random most of the time (and like
> Andres said also not very unhelpful). For clang-format you should use
> at least clang-format 15, otherwise it has some bugs in the alignment
> logic.

Really? I have been using 14, which is quite recent. Did you just
figure this out recently? If this is true, then it's certainly
discouraging.

I don't have a problem with the current pgindent alignment of function
parameters, so not sure what you mean about that. It *was* terrible
prior to commit e3860ffa, but that was back in 2017 (pg_bsd_indent 2.0
fixed that problem).

-- 
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

22 January 2023, 04:08:34

On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
> There is one thing about clang-format that I find mildly infuriating:
> it can indent function declarations in the way that I want it to, and
> it can indent variable declarations in the way that I want it to. It
> just can't do both at the same time, because they're both controlled
> by AlignConsecutiveDeclarations.

Looks like I'm not the only one that doesn't like this behavior:

https://github.com/llvm/llvm-project/issues/55605

-- 
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

22 January 2023, 04:21:22

On Sat, Jan 21, 2023 at 08:08:34PM -0800, Peter Geoghegan wrote:
> On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > There is one thing about clang-format that I find mildly infuriating:
> > it can indent function declarations in the way that I want it to, and
> > it can indent variable declarations in the way that I want it to. It
> > just can't do both at the same time, because they're both controlled
> > by AlignConsecutiveDeclarations.
> 
> Looks like I'm not the only one that doesn't like this behavior:
> 
> https://github.com/llvm/llvm-project/issues/55605

Wow, that is very weird.  When I work with other open source projects, I
am regularly surprised by their low quality requirements.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

22 January 2023, 07:05:45

Hi,

On 2023-01-21 23:21:22 -0500, Bruce Momjian wrote:
> On Sat, Jan 21, 2023 at 08:08:34PM -0800, Peter Geoghegan wrote:
> > On Sat, Jan 21, 2023 at 2:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > > There is one thing about clang-format that I find mildly infuriating:
> > > it can indent function declarations in the way that I want it to, and
> > > it can indent variable declarations in the way that I want it to. It
> > > just can't do both at the same time, because they're both controlled
> > > by AlignConsecutiveDeclarations.
> > 
> > Looks like I'm not the only one that doesn't like this behavior:
> > 
> > https://github.com/llvm/llvm-project/issues/55605
> 
> Wow, that is very weird.  When I work with other open source projects, I
> am regularly surprised by their low quality requirements.

I don't think not fulfulling precisely our needs is the same as low quality
requirements.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

22 January 2023, 09:49:57

Hi,

On 2023-01-21 15:32:45 -0800, Peter Geoghegan wrote:
> Attached is my .clang-format, since you asked for it. It was
> originally based on stuff that both you and Peter E posted several
> years back, I believe. Plus the timescaledb one in one or two places.
> I worked a couple of things out through trial and error. It's
> relatively hard to follow the documentation, and there have been
> features added to newer LLVM versions.

Reformatting with your clang-format end up with something like:

Peter's:
 2234 files changed, 334753 insertions(+), 289772 deletions(-)

Jelte's:
 2236 files changed, 357500 insertions(+), 306815 deletions(-)

Mine (modified to reduce this):
 2226 files changed, 261538 insertions(+), 256039 deletions(-)

Which is all at least an order of magnitude too much.

Jelte's uncrustify:
 1773 files changed, 121722 insertions(+), 125369 deletions(-)

better, but still not great. Also had to prevent a file files it choked on
from getting reindented.

I think the main issue with either is that our variable definition indentation
just can't be emulated by the tools as-is.

Some tools can indent variable definitions so that the variable name starts on
the same column. Some can limit that for too long type names. But so far I
haven't seen one that cn make that column be column +12. They all look to
other surrounding types.

I hate that variable name indentation with a fiery passion. But switching away
from that intermixed with a lot of other changes isn't going to be fun.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

22 January 2023, 15:01:13

On 2023-01-21 Sa 11:10, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>>>> I think we could do better with some automation tooling for committers
>>>> here. One low-risk and simple change would be to provide a
>>>> non-destructive mode for pgindent that would show you the changes if any
>>>> it would make. That could be worked into a git pre-commit hook that
>>>> committers could deploy. I can testify to the usefulness of such hooks -
>>>> I have one that while not perfect has saved me on at least two occasions
>>>> from forgetting to bump the catalog version.
> That sounds like a good idea from here.  I do not think we want a
> mandatory commit filter, but if individual committers care to work
> this into their process in some optional way, great!  I can think
> of ways I'd use it during patch development, too.


Yes, it's intended for use at committers' discretion. We have no way of
forcing use of a git hook on committers, although we could reject pushes
that offend against certain rules. For the reasons you give below that's
not a good idea. A pre-commit hook can be avoided by using `git commit
-n` and there's are similar option/hook for `git merge`.


>
> (One reason not to want a mandatory filter is that you might wish
> to apply pgindent as a separate commit, so that you can then
> put that commit into .git-blame-ignore-revs.  This could be handy
> for example when a patch needs to change the nesting level of a lot
> of pre-existing code, without making changes in it otherwise.)


Agreed.


> Looks reasonable, but you should also update
> src/tools/pgindent/pgindent.man, which AFAICT is our only
> documentation for pgindent switches.  (Is it time for a
> --help option in pgindent?)
>
>             


Yes, see revised patch.


> ... btw, can we get away with making the diff run be "diff -upd"
> not just "diff -u"?  I find diff output for C files noticeably
> more useful with those options, but I'm unsure about their
> portability.


I think they are available on Linux, MacOS and FBSD, and on Windows (if
anyone's actually using it for this) it's likely to be Gnu diff. So I
think that's probably enough coverage.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent_silent_plus_showdiff-3.patch

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 16:18:29

Andrew Dunstan <andrew@dunslane.net> writes:
>> ... btw, can we get away with making the diff run be "diff -upd"
>> not just "diff -u"?  I find diff output for C files noticeably
>> more useful with those options, but I'm unsure about their
>> portability.

> I think they are available on Linux, MacOS and FBSD, and on Windows (if
> anyone's actually using it for this) it's likely to be Gnu diff. So I
> think that's probably enough coverage.

I checked NetBSD as well, and it has all three too.

Patch looks good to me.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 16:35:17

Peter Geoghegan <pg@bowt.ie> writes:
> On Sat, Jan 21, 2023 at 3:39 PM Jelte Fennema <postgres@jeltef.nl> wrote:
>> ...  For clang-format you should use
>> at least clang-format 15, otherwise it has some bugs in the alignment
>> logic.

> Really? I have been using 14, which is quite recent. Did you just
> figure this out recently? If this is true, then it's certainly
> discouraging.

Indeed.  What that points to is a future where different contributors
get different results depending on what clang version they have
installed --- and it's not going to be practical to insist that
everybody have the same version, because AFAICT clang-format is tied
to clang itself.  So that sounds a bit unappetizing.

One of the few advantages of the current tool situation is that at any
time there's just one agreed-on version of pgindent and pgperltidy.
I've not heard push-back about our policy that you should use
perltidy version 20170521, because that's not especially connected
to any other part of one's system.  Maybe the same would hold for
uncrustify, but it's never going to work for pieces of the clang
ecosystem.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

22 January 2023, 17:20:49

> But so far I haven't seen one that can make that
> column be column +12.

Thanks for clarifying what the current variable declaration indention
rule is. Indeed neither uncrustify or clang-format seem to support
that. Getting uncrustify to support it might not be too difficult, but
the question remains if we even want that.

> But switching away from that intermixed with a lot of other changes isn't going to be fun.

I don't think the amount of pain is really much lower if we reformat
10,000 or 300,000 lines of code, without automation both would be
quite painful. But the git commands I shared in my previous email
should alleviate most of that pain.

> I don't have a problem with the current pgindent alignment of function
> parameters, so not sure what you mean about that.

Function parameter alignment is fine with pgindent imho, but this +12
column variable declaration thing I personally think is quite weird.

> Really? I have been using 14, which is quite recent. Did you just
> figure this out recently? If this is true, then it's certainly
> discouraging.

It seems this was due to my Ubuntu 22.04 install having clang-format
14.0.0. After
updating it to 14.0.6 by using the official llvm provided packages, I
don't have this
issue on clang-format-14 anymore. To be clear this was an issue in alignment of
variable declarations not function parameters.

But I agree with Tom Lane that this makes clear that whatever tool we
pick we'll need
to pick a specific version, just like we do now with perltidy. And
indeed I'm not sure
how easy that is with clang. Installing a specific uncrustify version
is pretty easy btw,
the compilation from source is quite quick.

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

22 January 2023, 18:14:24

But I do think this discussion about other formatting tools
is distracting from the main pain point I wanted to discuss:
our current formatting tool is not run consistently enough.
The only thing that another tool will change in this
regard is that there is no need to update typedefs.list.
It doesn't seem like that's such a significant difference
that it would change the solution to the first problem.

When reading the emails in this discussion from 2 years ago
it seems like the respondents wouldn't mind updating the
typedefs.list manually. And proposed approach number 3
seemed to have support overall, i.e. fail a push to master
when pgindent was not run on the new commit. Would
it make sense to simply try that approach and see if
there's any big issues with it?

> (Another problem here is that there's a sizable subset of committers
> who clearly just don't care, and I'm not sure we can convince them to.)

My guess would be that the main reason is simply
because committers forget it sometimes because
there's no automation complaining about it.

On Sun, 22 Jan 2023 at 18:20, Jelte Fennema <postgres@jeltef.nl> wrote:
>
> > But so far I haven't seen one that can make that
> > column be column +12.
>
> Thanks for clarifying what the current variable declaration indention
> rule is. Indeed neither uncrustify or clang-format seem to support
> that. Getting uncrustify to support it might not be too difficult, but
> the question remains if we even want that.
>
> > But switching away from that intermixed with a lot of other changes isn't going to be fun.
>
> I don't think the amount of pain is really much lower if we reformat
> 10,000 or 300,000 lines of code, without automation both would be
> quite painful. But the git commands I shared in my previous email
> should alleviate most of that pain.
>
> > I don't have a problem with the current pgindent alignment of function
> > parameters, so not sure what you mean about that.
>
> Function parameter alignment is fine with pgindent imho, but this +12
> column variable declaration thing I personally think is quite weird.
>
> > Really? I have been using 14, which is quite recent. Did you just
> > figure this out recently? If this is true, then it's certainly
> > discouraging.
>
> It seems this was due to my Ubuntu 22.04 install having clang-format
> 14.0.0. After
> updating it to 14.0.6 by using the official llvm provided packages, I
> don't have this
> issue on clang-format-14 anymore. To be clear this was an issue in alignment of
> variable declarations not function parameters.
>
> But I agree with Tom Lane that this makes clear that whatever tool we
> pick we'll need
> to pick a specific version, just like we do now with perltidy. And
> indeed I'm not sure
> how easy that is with clang. Installing a specific uncrustify version
> is pretty easy btw,
> the compilation from source is quite quick.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 19:20:02

Jelte Fennema <postgres@jeltef.nl> writes:
> When reading the emails in this discussion from 2 years ago
> it seems like the respondents wouldn't mind updating the
> typedefs.list manually. And proposed approach number 3
> seemed to have support overall, i.e. fail a push to master
> when pgindent was not run on the new commit. Would
> it make sense to simply try that approach and see if
> there's any big issues with it?

I will absolutely not accept putting in something that fails pushes
on this basis.  There are too many cases where pgindent purity is
not an overriding issue.  I mentioned a counterexample just upthread:
even if you are as anal as you could be about indentation, you might
prefer to separate a logic-changing patch from the ensuing mechanical
reindentation.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

22 January 2023, 21:19:10

Maybe I'm not understanding your issue correctly, but for such
a case you could push two commits at the same time. Apart
from that "git diff -w" will hide any whitespace changes so I'm
not I personally wouldn't consider it important to split such
changes across commits.

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

22 January 2023, 22:02:35

Hi,

On 2023-01-22 22:19:10 +0100, Jelte Fennema wrote:
> Maybe I'm not understanding your issue correctly, but for such
> a case you could push two commits at the same time.

Right.

> Apart from that "git diff -w" will hide any whitespace changes so I'm not I
> personally wouldn't consider it important to split such changes across
> commits.

I do think it's important. For one, the changes made by pgindent et al aren't
just whitespace ones. But I think it's also important to be able to see the
actual changes made in a patch precisely - lots of spurious whitespace changes
could indicate a problem.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

22 January 2023, 22:38:06

Hi,

On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:
> > But switching away from that intermixed with a lot of other changes isn't going to be fun.
> 
> I don't think the amount of pain is really much lower if we reformat
> 10,000 or 300,000 lines of code, without automation both would be
> quite painful. But the git commands I shared in my previous email
> should alleviate most of that pain.

It's practically not possible to review a 300k line change. And perhaps I'm
paranoid, but I would have a problem with a commit in the history that's
practically not reviewable.

> > I don't have a problem with the current pgindent alignment of function
> > parameters, so not sure what you mean about that.
> 
> Function parameter alignment is fine with pgindent imho, but this +12
> column variable declaration thing I personally think is quite weird.

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Personally I think the cost of trying to align local variable names is way way
higher than the benefit.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 22:47:37

Andres Freund <andres@anarazel.de> writes:
> I strongly dislike it, I rarely get it right by hand - but it does have some
> benefit over aligning variable names based on the length of the type names as
> uncrustify/clang-format: In their approach an added local variable can cause
> all the other variables to be re-indented (and their initial value possibly
> wrapped). The fixed alignment doesn't have that issue.

Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up.  That's just awful.  You can either
run pgperltidy on new code before committing, and accept that the feature
patch will touch a lot of lines it's not making real changes to (thereby
dirtying the "git blame" history) or not do so and thereby commit code
that's not passing tidiness checks.  Let's *not* adopt any style that
causes similar things to start happening in our C code.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 23:14:03

Jelte Fennema <postgres@jeltef.nl> writes:
> Maybe I'm not understanding your issue correctly, but for such
> a case you could push two commits at the same time.

I don't know that much about git commit hooks, but do they really
only check the final state of a series of commits?

In any case, I'm still down on the idea of checking this in a
commit hook because of the complexity and lack of transparency
of such a check.  If you think your commit is correctly indented,
but the hook (running on somebody else's machine) disagrees,
how are you going to debug that?  I don't want to get into such
a situation, especially since Murphy's law guarantees that it
would mainly bite people under time pressure, like when pushing
a security fix.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 January 2023, 23:28:27

Andres Freund <andres@anarazel.de> writes:
> On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:
>> I don't think the amount of pain is really much lower if we reformat
>> 10,000 or 300,000 lines of code, without automation both would be
>> quite painful. But the git commands I shared in my previous email
>> should alleviate most of that pain.

> It's practically not possible to review a 300k line change. And perhaps I'm
> paranoid, but I would have a problem with a commit in the history that's
> practically not reviewable.

As far as that goes, if you had concern then you could run the indentation
tool locally and confirm you got matching results.  But this does point up
that the processes Jelte suggested all depend critically on indentation
results being 100% reproducible by anybody.

So the more I think about it the less excited I am about depending on
clang-format, because version skew in peoples' clang installations seems
inevitable, and there's good reason to fear that that would show up
as varying indentation results.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

22 January 2023, 23:52:18

On Sun, Jan 22, 2023 at 3:28 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So the more I think about it the less excited I am about depending on
> clang-format, because version skew in peoples' clang installations seems
> inevitable, and there's good reason to fear that that would show up
> as varying indentation results.

I have to admit that the way that I was thinking about this was
colored by the way that I use clang-format today. I only now realize
how different my requirements are to the requirements that we'd have
for any tool that gets run against the tree in bulk. In particular, I
didn't realize how annoying the non-additive nature of certain
variable alignment rules might be until you pointed it out today
(seems obvious now!).

In my experience clang-format really shines when you need to quickly
indent code that is indented in some way that looks completely wrong
-- it does quite a lot better than pgindent when that's your starting
point. It has a reasonable way of balancing competing goals like
maximum number of columns (a soft maximum) and how function parameters
are displayed, which pgindent can't do. It also avoids allowing a
function parameter from a function declaration with its type name on
its own line, before the variable name -- also beyond the capabilities
of pgindent IIRC.

Features like that make it very useful as a first pass thing, where
all the bells and whistles have little real downside.  Running
clang-format and then running pgindent tends to produce better results
than just running pgindent, at least when working on a new patch.

-- 
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

23 January 2023, 00:15:18

Hi,

On 2023-01-22 18:28:27 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-01-22 18:20:49 +0100, Jelte Fennema wrote:
> >> I don't think the amount of pain is really much lower if we reformat
> >> 10,000 or 300,000 lines of code, without automation both would be
> >> quite painful. But the git commands I shared in my previous email
> >> should alleviate most of that pain.
> 
> > It's practically not possible to review a 300k line change. And perhaps I'm
> > paranoid, but I would have a problem with a commit in the history that's
> > practically not reviewable.
> 
> As far as that goes, if you had concern then you could run the indentation
> tool locally and confirm you got matching results.

Of course, but I somehow feel a change of formatting should be reviewable to
at least some degree. Even if it's just to make sure that the tool didn't have
a bug and cause some subtle behavioural change.

> So the more I think about it the less excited I am about depending on
> clang-format, because version skew in peoples' clang installations seems
> inevitable, and there's good reason to fear that that would show up
> as varying indentation results.

One thing that I like about clang-format is that it's possible to treat it
about our include order rules (which does find some "irregularities). But of
course that's not enough.

If we decide to move to another tool, I think it might be worth to remove a
few of the pg_bsd_indent options, that other tools won't be able to emulate,
first. E.g. -di12 -> -di4 would remove a *lot* of the noise from a move to
another tool.  And be much easier to write manually, but ... :)

I think I've proposed this before, but I still think that as long as we rely
on pg_bsd_indent, we should have it be part of our source tree and
automatically built.  It's no wonder that barely anybody indents their
patches, given that it requires building pg_bsd_ident in a separate repo (but
referencing our sourc etree), putting the binary in path, etc.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

23 January 2023, 00:28:42

Andres Freund <andres@anarazel.de> writes:
> I think I've proposed this before, but I still think that as long as we rely
> on pg_bsd_indent, we should have it be part of our source tree and
> automatically built.  It's no wonder that barely anybody indents their
> patches, given that it requires building pg_bsd_ident in a separate repo (but
> referencing our sourc etree), putting the binary in path, etc.

Hmm ... right offhand, the only objection I can see is that the
pg_bsd_indent files use the BSD 4-clause license, which is not ours.
However, didn't UCB grant a blanket exception years ago that said
that people could treat that as the 3-clause license?  If we could
get past the license question, I agree that doing what you suggest
would be superior to the current situation.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

23 January 2023, 00:37:24

On Sun, Jan 22, 2023 at 07:28:42PM -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> I think I've proposed this before, but I still think that as long as we rely
>> on pg_bsd_indent, we should have it be part of our source tree and
>> automatically built.  It's no wonder that barely anybody indents their
>> patches, given that it requires building pg_bsd_ident in a separate repo (but
>> referencing our sourc etree), putting the binary in path, etc.
>
> Hmm ... right offhand, the only objection I can see is that the
> pg_bsd_indent files use the BSD 4-clause license, which is not ours.
> However, didn't UCB grant a blanket exception years ago that said
> that people could treat that as the 3-clause license?  If we could
> get past the license question, I agree that doing what you suggest
> would be superior to the current situation.

+1.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 January 2023, 00:50:10

On 2023-01-22 Su 18:14, Tom Lane wrote:
> Jelte Fennema <postgres@jeltef.nl> writes:
>> Maybe I'm not understanding your issue correctly, but for such
>> a case you could push two commits at the same time.
> I don't know that much about git commit hooks, but do they really
> only check the final state of a series of commits?


The pre-commit hook is literally run every time you do `git commit`. But
it's only run on your local instance and only if you have enabled it.
It's not project-wide.



cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

23 January 2023, 01:03:02

Hi,

On 2023-01-22 19:50:10 -0500, Andrew Dunstan wrote:
> On 2023-01-22 Su 18:14, Tom Lane wrote:
> > Jelte Fennema <postgres@jeltef.nl> writes:
> >> Maybe I'm not understanding your issue correctly, but for such
> >> a case you could push two commits at the same time.
> > I don't know that much about git commit hooks, but do they really
> > only check the final state of a series of commits?
> 
> 
> The pre-commit hook is literally run every time you do `git commit`. But
> it's only run on your local instance and only if you have enabled it.
> It's not project-wide.

There's different hooks. Locally, I think pre-push would be better suited to
this than pre-commit (I often save WIP work in local branches, it'd be pretty
annoying if some indentation thing swore at me).

But there's also hooks like pre-receive, that allow doing validation on the
server side. Which obviously would be project wide...

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

23 January 2023, 01:10:02

Hi,

On 2023-01-22 19:28:42 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I think I've proposed this before, but I still think that as long as we rely
> > on pg_bsd_indent, we should have it be part of our source tree and
> > automatically built.  It's no wonder that barely anybody indents their
> > patches, given that it requires building pg_bsd_ident in a separate repo (but
> > referencing our sourc etree), putting the binary in path, etc.
> 
> Hmm ... right offhand, the only objection I can see is that the
> pg_bsd_indent files use the BSD 4-clause license, which is not ours.
> However, didn't UCB grant a blanket exception years ago that said
> that people could treat that as the 3-clause license?

Yep:
https://www.freebsd.org/copyright/license/


NOTE: The copyright of UC Berkeley’s Berkeley Software Distribution ("BSD") source has been updated. The copyright
addendummay be found at ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change and is included below.
 

    July 22, 1999

    To All Licensees, Distributors of Any Version of BSD:

    As you know, certain of the Berkeley Software Distribution ("BSD") source code files require that further
distributionsof products containing all or portions of the software, acknowledge within their advertising materials
thatsuch products contain software developed by UC Berkeley and its contributors.
 

    Specifically, the provision reads:

          * 3. All advertising materials mentioning features or use of this software
          *    must display the following acknowledgement:
          *    This product includes software developed by the University of
          *    California, Berkeley and its contributors."

    Effective immediately, licensees and distributors are no longer required to include the acknowledgement within
advertisingmaterials. Accordingly, the foregoing paragraph of those BSD Unix files containing it is hereby deleted in
itsentirety.
 

    William Hoskins
    Director, Office of Technology Licensing
    University of California, Berkeley


I did check, and the FTP bit is still downloadable. A bit awkward though, now
that browsers have/are removing ftp support.


Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

23 January 2023, 01:29:36

Andres Freund <andres@anarazel.de> writes:
> On 2023-01-22 19:28:42 -0500, Tom Lane wrote:
>> Hmm ... right offhand, the only objection I can see is that the
>> pg_bsd_indent files use the BSD 4-clause license, which is not ours.
>> However, didn't UCB grant a blanket exception years ago that said
>> that people could treat that as the 3-clause license?

> Yep:
> https://www.freebsd.org/copyright/license/

Cool.  I'll take a look at doing this later (probably after the current
CF) unless somebody beats me to it.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

23 January 2023, 10:44:11

I whipped up a pre-commit hook which automatically runs pgindent on the
changed files in the commit. It won't add any changes automatically, but
instead it fails the commit if it made any changes. That way you can add
them manually if you want. Or if you don't, you can simply run git commit
again without adding the changes. (or you can use the --no-verify flag of
git commit to skip the hook completely)

It did require adding some extra flags to pgindent. While it only required
the --staged-only and --fail-on-changed flags, the --changed-only flag
was easy to add and seemed generally useful.

I also attached a patch which adds the rules for formatting pgindent
itself to the .editorconfig file.

> Locally, I think pre-push would be better suited to
> this than pre-commit (I often save WIP work in local branches, it'd be pretty
> annoying if some indentation thing swore at me).

I personally prefer pre-commit hooks, since then I don't have to
go back and change some commit I made some time ago. And I
think with the easy opt-out that this hook has it would work fine
for my own workflow. But it shouldn't be hard to also include a
pre-push hook too if you want that after all. Then people can
choose to install the hook that they prefer.

On Mon, 23 Jan 2023 at 02:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-01-22 19:28:42 -0500, Tom Lane wrote:
> >> Hmm ... right offhand, the only objection I can see is that the
> >> pg_bsd_indent files use the BSD 4-clause license, which is not ours.
> >> However, didn't UCB grant a blanket exception years ago that said
> >> that people could treat that as the 3-clause license?
>
> > Yep:
> > https://www.freebsd.org/copyright/license/
>
> Cool.  I'll take a look at doing this later (probably after the current
> CF) unless somebody beats me to it.
>
>                         regards, tom lane

Attachment

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 January 2023, 10:56:20

On 2023-01-22 Su 20:03, Andres Freund wrote:
> Hi,
>
> On 2023-01-22 19:50:10 -0500, Andrew Dunstan wrote:
>> On 2023-01-22 Su 18:14, Tom Lane wrote:
>>> Jelte Fennema <postgres@jeltef.nl> writes:
>>>> Maybe I'm not understanding your issue correctly, but for such
>>>> a case you could push two commits at the same time.
>>> I don't know that much about git commit hooks, but do they really
>>> only check the final state of a series of commits?
>>
>> The pre-commit hook is literally run every time you do `git commit`. But
>> it's only run on your local instance and only if you have enabled it.
>> It's not project-wide.
> There's different hooks. Locally, I think pre-push would be better suited to
> this than pre-commit (I often save WIP work in local branches, it'd be pretty
> annoying if some indentation thing swore at me).

Yes, me too, so I currently have a filter in my hook that ignores local
WIP branches. The problem with pre-push is that by the time you're
pushing you have already committed and you would have to go back and
undo some stuff to fix it. Probably 99 times out of 100 I'd prefer to
commit indented code off the bat rather than make a separate indentation
commit. But this really illustrates my point: how you do it is up to you.

>
> But there's also hooks like pre-receive, that allow doing validation on the
> server side. Which obviously would be project wide...
>

Yes, but I think it's been demonstrated (again) that there's no
consensus in using those for this purpose.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 January 2023, 12:11:53

On 2023-01-22 Su 11:18, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>>> ... btw, can we get away with making the diff run be "diff -upd"
>>> not just "diff -u"?  I find diff output for C files noticeably
>>> more useful with those options, but I'm unsure about their
>>> portability.
>> I think they are available on Linux, MacOS and FBSD, and on Windows (if
>> anyone's actually using it for this) it's likely to be Gnu diff. So I
>> think that's probably enough coverage.
> I checked NetBSD as well, and it has all three too.
>
> Patch looks good to me.
>
>             


Thanks, pushed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 January 2023, 14:07:03

On 2023-01-23 Mo 05:44, Jelte Fennema wrote:
> I whipped up a pre-commit hook which automatically runs pgindent on the
> changed files in the commit. It won't add any changes automatically, but
> instead it fails the commit if it made any changes. That way you can add
> them manually if you want. Or if you don't, you can simply run git commit
> again without adding the changes. (or you can use the --no-verify flag of
> git commit to skip the hook completely)
>
> It did require adding some extra flags to pgindent. While it only required
> the --staged-only and --fail-on-changed flags, the --changed-only flag
> was easy to add and seemed generally useful.

Please see the changes to pgindent I committed about the same time I got
your email. I don't think we need your new flags, as it's possible (and
always has been) to provide pgindent with a list of files to be
indented. Instead of having pgindent run `git diff --name-only ...` the
git hook can do it and pass the results to pgindent in its command line.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

23 January 2023, 14:49:15

Indeed the flags you added are enough. Attached is a patch
that adds an updated pre-commit hook with the same behaviour
as the one before. I definitely think having a pre-commit hook
in the repo is beneficial, since writing one that works in all
cases definitely takes some time.

> as it's possible (and
> always has been) to provide pgindent with a list of files to be
> indented.

I guess I didn't realise this was a feature that existed, because
none of the documentation mentioned it.

On Mon, 23 Jan 2023 at 15:07, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2023-01-23 Mo 05:44, Jelte Fennema wrote:
> > I whipped up a pre-commit hook which automatically runs pgindent on the
> > changed files in the commit. It won't add any changes automatically, but
> > instead it fails the commit if it made any changes. That way you can add
> > them manually if you want. Or if you don't, you can simply run git commit
> > again without adding the changes. (or you can use the --no-verify flag of
> > git commit to skip the hook completely)
> >
> > It did require adding some extra flags to pgindent. While it only required
> > the --staged-only and --fail-on-changed flags, the --changed-only flag
> > was easy to add and seemed generally useful.
>
>
> Please see the changes to pgindent I committed about the same time I got
> your email. I don't think we need your new flags, as it's possible (and
> always has been) to provide pgindent with a list of files to be
> indented. Instead of having pgindent run `git diff --name-only ...` the
> git hook can do it and pass the results to pgindent in its command line.
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>

Attachment

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

23 January 2023, 15:09:06

I wrote:
> Cool.  I'll take a look at doing this later (probably after the current
> CF) unless somebody beats me to it.

Thinking about that (importing pg_bsd_indent into our main source
tree) a bit harder:

1. I'd originally thought vaguely that we could teach pgindent
how to build pg_bsd_indent automatically.  But with a little
more consideration, I doubt that would work transparently.
It's common (at least for me) to run pgindent in a distclean'd
tree, where configure results wouldn't be available.  It's even
worse if you habitually use VPATH builds, so that those files
*never* exist in your source tree.  So now I think that we should
stick to the convention that it's on the user to install
pg_bsd_indent somewhere in their PATH; all we'll be doing with
this change is eliminating the step of fetching pg_bsd_indent's
source files from somewhere else.

2. Given #1, it'll be prudent to continue having pgindent
double-check that pg_bsd_indent reports a specific version
number.  We could imagine starting to use the main Postgres
version number for that, but I'm inclined to continue with
its existing numbering series.  One argument for that is
that we generally change pg_bsd_indent less often than annually,
so having it track the main version would end up forcing
make-work builds of your installed pg_bsd_indent at least
once a year.  Also, when we do change pg_bsd_indent, it's
typically right before a mass reindentation commit, and those
do not happen at the same time as forking a new PG version.

3. If we do nothing special, the first mass reindentation is
going to reformat the pg_bsd_indent sources per PG style,
which is ... er ... not the way they look now.  Do we want
to accept that outcome, or take steps to prevent pgindent
from processing pg_bsd_indent?  I have a feeling that manual
cleanup would be necessary if we let such reindentation
happen, but I haven't experimented.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

23 January 2023, 17:31:36

Hi,

On 2023-01-23 10:09:06 -0500, Tom Lane wrote:
> 1. I'd originally thought vaguely that we could teach pgindent
> how to build pg_bsd_indent automatically.  But with a little
> more consideration, I doubt that would work transparently.
> It's common (at least for me) to run pgindent in a distclean'd
> tree, where configure results wouldn't be available.  It's even
> worse if you habitually use VPATH builds, so that those files
> *never* exist in your source tree.  So now I think that we should
> stick to the convention that it's on the user to install
> pg_bsd_indent somewhere in their PATH; all we'll be doing with
> this change is eliminating the step of fetching pg_bsd_indent's
> source files from somewhere else.

I think it'd be better to build pg_bsd_indent automatically as you planned
earlier - most others don't run pgindent from a distcleaned source tree. And
it shouldn't be hard to teach pgindent to run from a vpath build directory.

I'd like to get to the point where we can have simple build target for
a) re-indenting the whole tree
b) re-indenting the files touched in changes compared to master

If we add that to the list of things to do before sending a patch upstream,
we're a heck of a lot more likely to get decently formatted patches compared
to today.

As long as we need typedefs.list, I think it'd be good for such a target to
add new typedefs found in the local build to typedefs.list (but *not* remove
old ones, due to platform dependent code). But that's a separate enough
topic...

> 2. Given #1, it'll be prudent to continue having pgindent
> double-check that pg_bsd_indent reports a specific version
> number.

+1

> 3. If we do nothing special, the first mass reindentation is
> going to reformat the pg_bsd_indent sources per PG style,
> which is ... er ... not the way they look now.  Do we want
> to accept that outcome, or take steps to prevent pgindent
> from processing pg_bsd_indent?  I have a feeling that manual
> cleanup would be necessary if we let such reindentation
> happen, but I haven't experimented.

I think we should exempt it, initially at least. If somebody decides to invest
a substantial amount of time in pgindent, let's change it, but I'm somewhat
doubtful that'll happen anytime soon.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Alvaro Herrera

Date:

23 January 2023, 17:38:20

On 2023-Jan-23, Tom Lane wrote:

> 1. [...] So now I think that we should
> stick to the convention that it's on the user to install
> pg_bsd_indent somewhere in their PATH; all we'll be doing with
> this change is eliminating the step of fetching pg_bsd_indent's
> source files from somewhere else.

+1

> 2. Given #1, it'll be prudent to continue having pgindent
> double-check that pg_bsd_indent reports a specific version
> number.  We could imagine starting to use the main Postgres
> version number for that, but I'm inclined to continue with
> its existing numbering series.

+1

> 3. If we do nothing special, the first mass reindentation is
> going to reformat the pg_bsd_indent sources per PG style,
> which is ... er ... not the way they look now.  Do we want
> to accept that outcome, or take steps to prevent pgindent
> from processing pg_bsd_indent?  I have a feeling that manual
> cleanup would be necessary if we let such reindentation
> happen, but I haven't experimented.

Hmm, initially it must just be easier to have an exception so that
pg_bsd_indent itself isn't indented.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
#error "Operator lives in the wrong universe"
  ("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 January 2023, 18:52:46

On 2023-01-23 Mo 09:49, Jelte Fennema wrote:
> Attached is a patch
> that adds an updated pre-commit hook with the same behaviour
> as the one before. I definitely think having a pre-commit hook
> in the repo is beneficial, since writing one that works in all
> cases definitely takes some time.


Not sure if this should go in the git repo or in the developer wiki.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Nikita Malakhov

Date:

23 January 2023, 18:54:15

Hi!

I've ran pdindent on the whole Postgres and it'd changed

an awful lot of source files. Won't it create a lot of merge conflicts?

On Mon, Jan 23, 2023 at 8:48 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2023-Jan-23, Tom Lane wrote:

> 1. [...] So now I think that we should
> stick to the convention that it's on the user to install
> pg_bsd_indent somewhere in their PATH; all we'll be doing with
> this change is eliminating the step of fetching pg_bsd_indent's
> source files from somewhere else.

+1

> 2. Given #1, it'll be prudent to continue having pgindent
> double-check that pg_bsd_indent reports a specific version
> number. We could imagine starting to use the main Postgres
> version number for that, but I'm inclined to continue with
> its existing numbering series.

+1

> 3. If we do nothing special, the first mass reindentation is
> going to reformat the pg_bsd_indent sources per PG style,
> which is ... er ... not the way they look now. Do we want
> to accept that outcome, or take steps to prevent pgindent
> from processing pg_bsd_indent? I have a feeling that manual
> cleanup would be necessary if we let such reindentation
> happen, but I haven't experimented.

Hmm, initially it must just be easier to have an exception so that
pg_bsd_indent itself isn't indented.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
#error "Operator lives in the wrong universe"
("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)

Regards,

Nikita Malakhov

Postgres Professional

https://postgrespro.ru/

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

23 January 2023, 20:09:10

Nikita Malakhov <hukutoc@gmail.com> writes:
> I've ran pdindent on the whole Postgres and it'd changed
> an awful lot of source files. Won't it create a lot of merge conflicts?

Well, yeah, you've rediscovered the fact that a lot of commits are sloppy
about this, and it's been awhile since our last tree-wide pgindent.

Our normal process has been to do a cleanup pgindent run annually or so,
usually after the end of the last commitfest of a cycle when there's
plenty of time for people to deal with the ensuing merge conflicts.

If we could get to "commits are pgindent clean to begin with", we
could avoid the merge problems from one-big-reindent.  I'd still
be inclined to do an annual run as a backup, but hopefully it would
find few problems.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

23 January 2023, 20:53:00

> Not sure if this should go in the git repo or in the developer wiki.

I would say the git repo is currently the most fitting place, since it
has all the existing docs for pgindent. The wiki even links to the
pgindent source directory:
https://wiki.postgresql.org/wiki/Developer_FAQ#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

23 January 2023, 21:11:50

On Mon, Jan 23, 2023 at 09:31:36AM -0800, Andres Freund wrote:
> As long as we need typedefs.list, I think it'd be good for such a target to
> add new typedefs found in the local build to typedefs.list (but *not* remove
> old ones, due to platform dependent code). But that's a separate enough
> topic...

One issue on requiring patches to have run pgindent previously is
actually the typedef list.  If someone adds a typedef in a commit, they
will see different pgident output in the committed files, and perhaps
others, and the new typedefs might only appear after the commit, causing
later commits to not match.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

24 January 2023, 11:44:00

> One issue on requiring patches to have run pgindent previously is
> actually the typedef list.  If someone adds a typedef in a commit, they
> will see different pgident output in the committed files, and perhaps
> others, and the new typedefs might only appear after the commit, causing
> later commits to not match.

I'm not sure I understand the issue you're pointing out. If someone
changes the typedef list, imho they want the formatting to change
because of that. So requiring an addition to the typedef list to also
commit reindentation to all files that this typedef indirectly impacts
seems pretty reasonable to me.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

24 January 2023, 14:54:57

Jelte Fennema <postgres@jeltef.nl> writes:
>> One issue on requiring patches to have run pgindent previously is
>> actually the typedef list.  If someone adds a typedef in a commit, they
>> will see different pgident output in the committed files, and perhaps
>> others, and the new typedefs might only appear after the commit, causing
>> later commits to not match.

> I'm not sure I understand the issue you're pointing out. If someone
> changes the typedef list, imho they want the formatting to change
> because of that. So requiring an addition to the typedef list to also
> commit reindentation to all files that this typedef indirectly impacts
> seems pretty reasonable to me.

I think the issue Bruce is pointing out is that this is another mechanism
whereby different people could get different indentation results.
I fear any policy that is based on an assumption that indentation has
One True Result is going to fail.

As a concrete example, suppose Alice commits some code that uses "foo"
as a variable name, and more or less concurrently, Bob commits something
that defines "foo" as a typedef.  Bob's change is likely to have
side-effects on the formatting of Alice's code.  If they're working in
well-separated parts of the source tree, nobody is likely to notice
that for awhile --- but whoever next touches the files Alice touched
will be in for a surprise, which will be more or less painful depending
on whether we've installed brittle processes.

As another example, the mechanisms we use to create the typedefs list
in the first place are pretty squishy/leaky: they depend on which
buildfarm animals are running the typedef-generation step, and on
whether anything's broken lately in that code --- which happens on
a fairly regular basis (eg [1]).  Maybe that could be improved,
but I don't see an easy way to capture the set of system-defined
typedefs that are in use on platforms other than your own.  I
definitely do not want to go over to hand maintenance of that list.

I think we need to be content with a "soft", it's more-or-less-right
approach to indentation.  As I explained to somebody upthread, the
main benefit of this for most people is avoiding the need for a massive
once-a-year reindent run that causes merge failures for many pending
patches.  But we don't need to completely eliminate such runs to get
99.9% of that benefit; we only need to reduce the number of places
they're likely to touch.

            regards, tom lane

[1] https://github.com/PGBuildFarm/client-code/commit/dcca861554e90d6395c3c153317b0b0e3841f103

Re: run pgindent on a regular basis / scripted manner

From

Vladimir Sitnikov

Date:

24 January 2023, 15:03:51

Andres> Of course, but I somehow feel a change of formatting should be reviewable to

Andres> at least some degree

One way of reviewing the formatting changes is to compare the compiled binaries.

If the binaries before and after formatting are the same, then there's a high chance the behaviour is the same.

Vladimir

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

24 January 2023, 16:03:25

> As a concrete example, suppose Alice commits some code that uses "foo"
> as a variable name, and more or less concurrently, Bob commits something
> that defines "foo" as a typedef.  Bob's change is likely to have
> side-effects on the formatting of Alice's code.  If they're working in
> well-separated parts of the source tree, nobody is likely to notice
> that for awhile --- but whoever next touches the files Alice touched
> will be in for a surprise, which will be more or less painful depending
> on whether we've installed brittle processes.

Sounds like this conflict could be handled fairly easily by
having a local git hook rerunning pgindent whenever
you rebase a commit:
1. if you changed typedefs.list the hook would format all files
2. if you didn't it only formats the files that you changed

> As another example, the mechanisms we use to create the typedefs list
> in the first place are pretty squishy/leaky: they depend on which
> buildfarm animals are running the typedef-generation step, and on
> whether anything's broken lately in that code --- which happens on
> a fairly regular basis (eg [1]).  Maybe that could be improved,
> but I don't see an easy way to capture the set of system-defined
> typedefs that are in use on platforms other than your own.  I
> definitely do not want to go over to hand maintenance of that list.

Wouldn't the automatic addition-only solution that Andres suggested
solve this issue? Build farms could still remove unused typedefs
on a regular basis, but commits would at least add typedefs for the
platform that the comitter uses.

> I think we need to be content with a "soft", it's more-or-less-right
> approach to indentation.

I think that this would already be a significant improvement over the
current situation. My experience with the current situation is that
indentation is more-or-less-wrong.

> As I explained to somebody upthread, the
> main benefit of this for most people is avoiding the need for a massive
> once-a-year reindent run that causes merge failures for many pending
> patches.

Merge failures are one issue. But personally the main benefit that
I would be getting is being able to run pgindent on the files
I'm editing and get this weird +12 columns formatting correct
without having to manually type it. Without pgindent also
changing random parts of the files that someone else touched
a few commits before me.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

24 January 2023, 16:43:25

Jelte Fennema <postgres@jeltef.nl> writes:
> Sounds like this conflict could be handled fairly easily by
> having a local git hook rerunning pgindent whenever
> you rebase a commit:
> 1. if you changed typedefs.list the hook would format all files
> 2. if you didn't it only formats the files that you changed

I think that would be undesirable, because then reindentation noise
in completely-unrelated files would get baked into feature commits,
complicating review and messing up "git blame" history.
The approach we currently have allows reindent effects to be
separated into ignorable labeled commits, which is a nice property.

> Merge failures are one issue. But personally the main benefit that
> I would be getting is being able to run pgindent on the files
> I'm editing and get this weird +12 columns formatting correct
> without having to manually type it. Without pgindent also
> changing random parts of the files that someone else touched
> a few commits before me.

Yeah, that always annoys me too, but I've always considered that
it's my problem not something I can externalize onto other people.
The real bottom line here is that AFAICT, there are fewer committers
who care about indent cleanliness than committers who do not, so
I do not think that the former group get to impose strict rules
on the latter, much as I might wish otherwise.

FWIW, Andrew's recent --show-diff feature for pgindent has
already improved my workflow for that.  I can do
"pgindent --show-diff >fixindent.patch", manually remove any hunks
in fixindent.patch that don't pertain to the code I'm working on,
and apply what remains to fix up my new code.  (I had been doing
something basically like this, but with more file-copying steps
to undo pgindent's edit-in-place behavior.)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

24 January 2023, 17:00:12

On 2023-01-24 Tu 11:43, Tom Lane wrote:
> Jelte Fennema <postgres@jeltef.nl> writes:
>> Sounds like this conflict could be handled fairly easily by
>> having a local git hook rerunning pgindent whenever
>> you rebase a commit:
>> 1. if you changed typedefs.list the hook would format all files
>> 2. if you didn't it only formats the files that you changed
> I think that would be undesirable, because then reindentation noise
> in completely-unrelated files would get baked into feature commits,
> complicating review and messing up "git blame" history.
> The approach we currently have allows reindent effects to be
> separated into ignorable labeled commits, which is a nice property.
>
>> Merge failures are one issue. But personally the main benefit that
>> I would be getting is being able to run pgindent on the files
>> I'm editing and get this weird +12 columns formatting correct
>> without having to manually type it. Without pgindent also
>> changing random parts of the files that someone else touched
>> a few commits before me.
> Yeah, that always annoys me too, but I've always considered that
> it's my problem not something I can externalize onto other people.
> The real bottom line here is that AFAICT, there are fewer committers
> who care about indent cleanliness than committers who do not, so
> I do not think that the former group get to impose strict rules
> on the latter, much as I might wish otherwise.
>
> FWIW, Andrew's recent --show-diff feature for pgindent has
> already improved my workflow for that.  I can do
> "pgindent --show-diff >fixindent.patch", manually remove any hunks
> in fixindent.patch that don't pertain to the code I'm working on,
> and apply what remains to fix up my new code.  (I had been doing
> something basically like this, but with more file-copying steps
> to undo pgindent's edit-in-place behavior.)
>
>             


I'm glad it's helpful.

Here's another improvement I think will be useful when the new gadgets
are used in a git hook: first, look for the excludes file under the
current directory if we aren't setting $code_base (e.g if we have files
given on the command line), and second apply the exclude patterns to the
command line files as well as to files found using File::Find.

I propose to apply this fairly soon.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-exclude-fix.patch

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

24 January 2023, 18:42:18

> I think that would be undesirable, because then reindentation noise
> in completely-unrelated files would get baked into feature commits,
> complicating review and messing up "git blame" history.

With a rebase hook similar to the the pre-commit hook that I shared
upthread, your files will be changed accordingly, but you don't need
to commit those changes in the same commit as the one that you're
rebasing. You could append another commit after it. Another option
would be to move the typedefs.list change to a separate commit
together with all project wide indentation changes.

> The real bottom line here is that AFAICT, there are fewer committers
> who care about indent cleanliness than committers who do not, so
> I do not think that the former group get to impose strict rules
> on the latter, much as I might wish otherwise.

Is this actually the case? I haven't seen anyone in this thread say
they don't care. From my perspective it seems like the unclean
indents simply come from forgetting to run pgindent once in
a while. And those few forgetful moments add up over a year.
of commits. That's why to me tooling seems the answer here.
If the tooling makes it easy not to forget then the problem
goes away.

> FWIW, Andrew's recent --show-diff feature for pgindent has
> already improved my workflow for that.  I can do
> "pgindent --show-diff >fixindent.patch", manually remove any hunks
> in fixindent.patch that don't pertain to the code I'm working on,
> and apply what remains to fix up my new code.  (I had been doing
> something basically like this, but with more file-copying steps
> to undo pgindent's edit-in-place behavior.)

Yeah, I have a similar workflow with the pre-commit hook that
I shared. By using "git checkout -p" I can remove hunks that
don't pertain to my code. Still it would be really nice not
to have to go through that effort (which is significant for the
libpq code that I've been workin on, since there's ~50
incorrectly indented hunks).

> Here's another improvement I think will be useful when the new gadgets
> are used in a git hook: first, look for the excludes file under the
> current directory if we aren't setting $code_base (e.g if we have files
> given on the command line), and second apply the exclude patterns to the
> command line files as well as to files found using File::Find.

Change looks good to me.

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

24 January 2023, 19:04:02

On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:
> As another example, the mechanisms we use to create the typedefs list
> in the first place are pretty squishy/leaky: they depend on which
> buildfarm animals are running the typedef-generation step, and on
> whether anything's broken lately in that code --- which happens on
> a fairly regular basis (eg [1]).  Maybe that could be improved,
> but I don't see an easy way to capture the set of system-defined
> typedefs that are in use on platforms other than your own.  I
> definitely do not want to go over to hand maintenance of that list.

As I now understand it, we would need to standardize on a typedef list
at the beginning of each major development cycle, and then only allow
additions, and the addition would have to include any pgindent affects
of the addition.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

24 January 2023, 21:11:20

On 2023-01-24 Tu 13:42, Jelte Fennema wrote:
>> Here's another improvement I think will be useful when the new gadgets
>> are used in a git hook: first, look for the excludes file under the
>> current directory if we aren't setting $code_base (e.g if we have files
>> given on the command line), and second apply the exclude patterns to the
>> command line files as well as to files found using File::Find.
> Change looks good to me.


Thanks, pushed


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

26 January 2023, 14:40:39

On 2023-01-23 Mo 09:49, Jelte Fennema wrote:
> Indeed the flags you added are enough. Attached is a patch
> that adds an updated pre-commit hook with the same behaviour
> as the one before. I definitely think having a pre-commit hook
> in the repo is beneficial, since writing one that works in all
> cases definitely takes some time.

I didn't really like your hook, as it forces a reindent, and many people
won't want that (for reasons given elsewhere in this thread).

Here's an extract from my pre-commit hook that does that if PGAUTOINDENT
is set to "yes", and otherwise just warns you that you need to run pgindent.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pre-commit-sample

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

26 January 2023, 16:16:52

On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:
> I didn't really like your hook, as it forces a reindent, and many people
> won't want that (for reasons given elsewhere in this thread).

I'm not sure what you mean by "forces a reindent". Like I explained
you can simply run "git commit" again to ignore the changes and
commit anyway. As long as the files are indented on your filesystem
the hook doesn't care if you actually included the indentation changes
in the changes that you're currently committing.

So to be completely clear you can do the following with my hook:
git commit # runs pgindent and fails
git commit # commits changes anyway
git commit -am 'Run pgindent' # commit indentation changes separately

Or what I usually do:
git commit # runs pgindent and fails
git add --patch # choose relevant changes to add to commit
git commit # commit the changes
git checkout -- . # undo irrelevant changes on filesystem

Honestly PGAUTOINDENT=no seems stricter, since the only
way to bypass the failure is now to run manually run pgindent
or git commit with the --no-verify flag.

> files=$(git diff --cached --name-only --diff-filter=ACMR)
> src/tools/pgindent/pgindent $files

That seems like it would fail if there's any files or directories with
spaces in them. Maybe this isn't something we care about though.

> # no need to filter files - pgindent ignores everything that isn't a
> # .c or .h file

If the first argument is a non .c or .h file, then pgindent interprets
it as the typedefs file. So it's definitely important to filter non .c
and .h files out. Because now if you commit a single
non .c or .h file this hook messes up the indentation in all of
your files. You can reproduce by running:
src/tools/pgindent/pgindent README

> # only do this on master
> test  "$branch" = "master" || return 0

I would definitely want a way to disable this check. As a normal
submitter I never work directly on master.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

26 January 2023, 16:54:11

On 2023-01-26 Th 11:16, Jelte Fennema wrote:
> On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:
>> I didn't really like your hook, as it forces a reindent, and many people
>> won't want that (for reasons given elsewhere in this thread).
> I'm not sure what you mean by "forces a reindent". Like I explained
> you can simply run "git commit" again to ignore the changes and
> commit anyway. As long as the files are indented on your filesystem
> the hook doesn't care if you actually included the indentation changes
> in the changes that you're currently committing.


Your hook does this:


+git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' |\
+    xargs src/tools/pgindent/pgindent --silent-diff \
+    || {
+        echo ERROR: Aborting commit because pgindent was not run
+        git diff --cached --name-only --diff-filter=ACMR | grep
'\.[ch]$' | xargs src/tools/pgindent/pgindent
+        exit 1
+    }


At this stage the files are now indented, so if it failed and you run
`git commit` again it will commit with the indention changes.


>
> So to be completely clear you can do the following with my hook:
> git commit # runs pgindent and fails
> git commit # commits changes anyway
> git commit -am 'Run pgindent' # commit indentation changes separately
>
> Or what I usually do:
> git commit # runs pgindent and fails
> git add --patch # choose relevant changes to add to commit
> git commit # commit the changes
> git checkout -- . # undo irrelevant changes on filesystem
>
> Honestly PGAUTOINDENT=no seems stricter, since the only
> way to bypass the failure is now to run manually run pgindent
> or git commit with the --no-verify flag.
>
>> files=$(git diff --cached --name-only --diff-filter=ACMR)
>> src/tools/pgindent/pgindent $files
> That seems like it would fail if there's any files or directories with
> spaces in them. Maybe this isn't something we care about though.


We don't have any, and the filenames git produces are relative to the
git root. I don't think this is an issue.


>
>> # no need to filter files - pgindent ignores everything that isn't a
>> # .c or .h file
> If the first argument is a non .c or .h file, then pgindent interprets
> it as the typedefs file. So it's definitely important to filter non .c
> and .h files out. Because now if you commit a single
> non .c or .h file this hook messes up the indentation in all of
> your files. You can reproduce by running:
> src/tools/pgindent/pgindent README



I have a patch at [1] to remove this misfeature.


>
>> # only do this on master
>> test  "$branch" = "master" || return 0
> I would definitely want a way to disable this check. As a normal
> submitter I never work directly on master.


Sure, that's your choice. My intended audience here is committers, who
of course do work on master.


cheers


andrew


[1]  https://postgr.es/m/21bb8573-9e56-812b-84cf-1e4f3c4c2a7b@dunslane.net

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

26 January 2023, 17:05:53

> At this stage the files are now indented, so if it failed and you run
> `git commit` again it will commit with the indention changes.

No, because at no point a "git add" is happening, so the changes
made by pgindent are not staged. As long as you don't run the
second "git commit" with the -a flag the commit will be exactly
the same as you prepared it before.

> Sure, that's your choice. My intended audience here is committers, who
> of course do work on master.

Yes I understand, I meant it would be nice if the script had a environment
variable like PG_COMMIT_HOOK_ALL_BRANCHES (bad name)
for this purpose.

On Thu, 26 Jan 2023 at 17:54, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2023-01-26 Th 11:16, Jelte Fennema wrote:
> > On Thu, 26 Jan 2023 at 15:40, Andrew Dunstan <andrew@dunslane.net> wrote:
> >> I didn't really like your hook, as it forces a reindent, and many people
> >> won't want that (for reasons given elsewhere in this thread).
> > I'm not sure what you mean by "forces a reindent". Like I explained
> > you can simply run "git commit" again to ignore the changes and
> > commit anyway. As long as the files are indented on your filesystem
> > the hook doesn't care if you actually included the indentation changes
> > in the changes that you're currently committing.
>
>
> Your hook does this:
>
>
> +git diff --cached --name-only --diff-filter=ACMR | grep '\.[ch]$' |\
> +    xargs src/tools/pgindent/pgindent --silent-diff \
> +    || {
> +        echo ERROR: Aborting commit because pgindent was not run
> +        git diff --cached --name-only --diff-filter=ACMR | grep
> '\.[ch]$' | xargs src/tools/pgindent/pgindent
> +        exit 1
> +    }
>
>
> At this stage the files are now indented, so if it failed and you run
> `git commit` again it will commit with the indention changes.
>
>
> >
> > So to be completely clear you can do the following with my hook:
> > git commit # runs pgindent and fails
> > git commit # commits changes anyway
> > git commit -am 'Run pgindent' # commit indentation changes separately
> >
> > Or what I usually do:
> > git commit # runs pgindent and fails
> > git add --patch # choose relevant changes to add to commit
> > git commit # commit the changes
> > git checkout -- . # undo irrelevant changes on filesystem
> >
> > Honestly PGAUTOINDENT=no seems stricter, since the only
> > way to bypass the failure is now to run manually run pgindent
> > or git commit with the --no-verify flag.
> >
> >> files=$(git diff --cached --name-only --diff-filter=ACMR)
> >> src/tools/pgindent/pgindent $files
> > That seems like it would fail if there's any files or directories with
> > spaces in them. Maybe this isn't something we care about though.
>
>
> We don't have any, and the filenames git produces are relative to the
> git root. I don't think this is an issue.
>
>
> >
> >> # no need to filter files - pgindent ignores everything that isn't a
> >> # .c or .h file
> > If the first argument is a non .c or .h file, then pgindent interprets
> > it as the typedefs file. So it's definitely important to filter non .c
> > and .h files out. Because now if you commit a single
> > non .c or .h file this hook messes up the indentation in all of
> > your files. You can reproduce by running:
> > src/tools/pgindent/pgindent README
>
>
>
> I have a patch at [1] to remove this misfeature.
>
>
> >
> >> # only do this on master
> >> test  "$branch" = "master" || return 0
> > I would definitely want a way to disable this check. As a normal
> > submitter I never work directly on master.
>
>
> Sure, that's your choice. My intended audience here is committers, who
> of course do work on master.
>
>
> cheers
>
>
> andrew
>
>
> [1]  https://postgr.es/m/21bb8573-9e56-812b-84cf-1e4f3c4c2a7b@dunslane.net
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

26 January 2023, 21:46:46

On 2023-01-26 Th 12:05, Jelte Fennema wrote:
>> At this stage the files are now indented, so if it failed and you run
>> `git commit` again it will commit with the indention changes.
> No, because at no point a "git add" is happening, so the changes
> made by pgindent are not staged. As long as you don't run the
> second "git commit" with the -a flag the commit will be exactly
> the same as you prepared it before.

Hmm, but I usually run with -a, I even have a git alias for it. I guess
what this discussion illustrates is that there are various patters for
using git, and we shouldn't assume that everyone else is using the same
patterns we are.

I'm still  mildly inclined to say this material would be better placed
in the developer wiki. After all, this isn't the only thing a postgres
developer might use a git hook for (mine has more material in it than in
what I posted).

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

26 January 2023, 22:54:41

On Thu, 26 Jan 2023 at 22:46, Andrew Dunstan <andrew@dunslane.net> wrote:
> Hmm, but I usually run with -a, I even have a git alias for it. I guess
> what this discussion illustrates is that there are various patters for
> using git, and we shouldn't assume that everyone else is using the same
> patterns we are.

I definitely agree that there are lots of ways to use git. And I now
understand why my hook didn't work well for your existing workflow.

I've pretty much unlearned the -a flag. Because the easiest way I've
been able to split up changes into different commits is using "git add
-p", which adds partial pieces of files to the staging area. And that
workflow combines terribly with "git commit -a" because -a adds all
the things that I specifically didn't put in the staging area into the
final commit anyway.

> I'm still  mildly inclined to say this material would be better placed
> in the developer wiki. After all, this isn't the only thing a postgres
> developer might use a git hook for

I think it should definitely be somewhere. I have a preference for the
repo, since I think the docs on codestyle are already in too many
different places. But the wiki is already much better than having no
shared hook at all. I mainly think we should try to make it as easy as
possible for people to commit well indented code.

> (mine has more material in it than in what I posted).

Anything that is useful for the wider community and could be part of
this example/template git hook? (e.g. some perltidy automation)

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

27 January 2023, 14:57:45

On 2023-01-26 Th 17:54, Jelte Fennema wrote:
>
>> I'm still  mildly inclined to say this material would be better placed
>> in the developer wiki. After all, this isn't the only thing a postgres
>> developer might use a git hook for
> I think it should definitely be somewhere. I have a preference for the
> repo, since I think the docs on codestyle are already in too many
> different places. But the wiki is already much better than having no
> shared hook at all. I mainly think we should try to make it as easy as
> possible for people to commit well indented code.
>
>

I've added a section to the wiki at
<https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks> and
put a reference to that in the pgindent docco. You can of course add
some more info to the wiki if you feel it's necessary.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

29 January 2023, 01:06:03

On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:
> On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:
> > As another example, the mechanisms we use to create the typedefs list
> > in the first place are pretty squishy/leaky: they depend on which
> > buildfarm animals are running the typedef-generation step, and on
> > whether anything's broken lately in that code --- which happens on
> > a fairly regular basis (eg [1]).  Maybe that could be improved,
> > but I don't see an easy way to capture the set of system-defined
> > typedefs that are in use on platforms other than your own.  I
> > definitely do not want to go over to hand maintenance of that list.
> 
> As I now understand it, we would need to standardize on a typedef list
> at the beginning of each major development cycle, and then only allow
> additions,

Not to my knowledge.  There's no particular obstacle to updating the list more
frequently or removing entries.

> and the addition would have to include any pgindent affects
> of the addition.

Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
pgindent when the given commit(s) change the typedef list.  typedef list
changes essentially become another kind of refactoring that can yield merge
conflicts.  If your commit passed the pgindent check, rebasing it onto a new
typedefs list may require further indentation changes.  New typedefs don't
tend to change a lot of old code, so I would expect this sort of conflict to
be minor, compared to all the other sources of conflicts.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

29 January 2023, 04:22:19

Noah Misch <noah@leadboat.com> writes:
> Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
> pgindent when the given commit(s) change the typedef list.  typedef list
> changes essentially become another kind of refactoring that can yield merge
> conflicts.  If your commit passed the pgindent check, rebasing it onto a new
> typedefs list may require further indentation changes.  New typedefs don't
> tend to change a lot of old code, so I would expect this sort of conflict to
> be minor, compared to all the other sources of conflicts.

In fact, if a typedef addition *does* affect a lot of old code,
that's a good sign that the choice of typedef name ought to be
rethought: it's evidently conflicting with existing names.

I'm not sure what that observation implies for our standard
practices here.  But it does suggest that "let pgindent do what
it wants without human oversight" probably isn't a good plan.
We've seen that to be true for other reasons as well, notably that
it can destroy the readability of carefully-laid-out comments.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

30 January 2023, 20:42:09

On Sat, Jan 28, 2023 at 05:06:03PM -0800, Noah Misch wrote:
> On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:
> > On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:
> > > As another example, the mechanisms we use to create the typedefs list
> > > in the first place are pretty squishy/leaky: they depend on which
> > > buildfarm animals are running the typedef-generation step, and on
> > > whether anything's broken lately in that code --- which happens on
> > > a fairly regular basis (eg [1]).  Maybe that could be improved,
> > > but I don't see an easy way to capture the set of system-defined
> > > typedefs that are in use on platforms other than your own.  I
> > > definitely do not want to go over to hand maintenance of that list.
> > 
> > As I now understand it, we would need to standardize on a typedef list
> > at the beginning of each major development cycle, and then only allow
> > additions,
> 
> Not to my knowledge.  There's no particular obstacle to updating the list more
> frequently or removing entries.

We would need to re-pgindent the tree each time, I think, which would
cause disruption if we did it too frequently.

> > and the addition would have to include any pgindent affects
> > of the addition.
> 
> Yes, a hook intended to enforce pgindent cleanliness should run tree-wide
> pgindent when the given commit(s) change the typedef list.  typedef list
> changes essentially become another kind of refactoring that can yield merge
> conflicts.  If your commit passed the pgindent check, rebasing it onto a new
> typedefs list may require further indentation changes.  New typedefs don't
> tend to change a lot of old code, so I would expect this sort of conflict to
> be minor, compared to all the other sources of conflicts.

Agreed.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

Embrace your flaws.  They make you human, rather than perfect,
which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

02 February 2023, 06:28:38

On Mon, Jan 30, 2023 at 03:42:09PM -0500, Bruce Momjian wrote:
> On Sat, Jan 28, 2023 at 05:06:03PM -0800, Noah Misch wrote:
> > On Tue, Jan 24, 2023 at 02:04:02PM -0500, Bruce Momjian wrote:
> > > On Tue, Jan 24, 2023 at 09:54:57AM -0500, Tom Lane wrote:
> > > > As another example, the mechanisms we use to create the typedefs list
> > > > in the first place are pretty squishy/leaky: they depend on which
> > > > buildfarm animals are running the typedef-generation step, and on
> > > > whether anything's broken lately in that code --- which happens on
> > > > a fairly regular basis (eg [1]).  Maybe that could be improved,
> > > > but I don't see an easy way to capture the set of system-defined
> > > > typedefs that are in use on platforms other than your own.  I
> > > > definitely do not want to go over to hand maintenance of that list.
> > > 
> > > As I now understand it, we would need to standardize on a typedef list
> > > at the beginning of each major development cycle, and then only allow
> > > additions,
> > 
> > Not to my knowledge.  There's no particular obstacle to updating the list more
> > frequently or removing entries.
> 
> We would need to re-pgindent the tree each time, I think, which would
> cause disruption if we did it too frequently.

More important than frequency is how much old code changes.  A new typedef
typically is an identifier not already appearing in the tree, so no old code
changes.  A removed typedef typically no longer appears in the tree, so again
no old code changes.  The tree can get those daily; they're harmless.

The push that adds or removes FooTypedef from the source code is in the best
position to react to any surprising indentation consequences of adding or
removing FooTypedef from typedefs.list.  (Reactions could include choosing a
different typedef name or renaming incidental matches in older code.)  Hence,
changing typedefs.list as frequently as it affects the code is less disruptive
than changing it once a year.  The same applies to challenges like pgindent
wrecking a non-"/*----------" comment.  Such breakage is hard to miss when
it's part of the push that crafts the comment; it's easier to miss in a bulk,
end-of-cycle pgindent.

Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push.  You'd still want to make the tree clean
sometime the same week or so.  It's cheap to provide a break-glass like that.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

02 February 2023, 06:40:29

Noah Misch <noah@leadboat.com> writes:
> Regarding the concern about a pre-receive hook blocking an emergency push, the
> hook could approve every push where a string like "pgindent: no" appears in a
> commit message within the push.  You'd still want to make the tree clean
> sometime the same week or so.  It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread.  I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements.  That's
not manpower we can afford to drive away.

Maybe this should get taken up at the this-time-for-sure developer
meeting at PGCon?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Dean Rasheed

Date:

02 February 2023, 11:34:37

On Thu, 2 Feb 2023 at 06:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Noah Misch <noah@leadboat.com> writes:
> > Regarding the concern about a pre-receive hook blocking an emergency push, the
> > hook could approve every push where a string like "pgindent: no" appears in a
> > commit message within the push.  You'd still want to make the tree clean
> > sometime the same week or so.  It's cheap to provide a break-glass like that.
>
> I think the real question here is whether we can get all (or at least
> a solid majority of) committers to accept such draconian constraints.
> I'd buy into it, and evidently so would you, but I can't help noting
> that less than a quarter of active committers have bothered to
> comment on this thread.  I suspect the other three-quarters would
> be quite annoyed if we tried to institute such requirements.
>

I didn't reply until now, but I'm solidly in the camp of committers
who care about keeping the tree properly indented, and I wouldn't have
any problem with such a check being imposed.

I regularly run pgindent locally, and if I ever commit without
indenting, it's either intentional, or because I forgot, so the
reminder would be useful.

And as someone who runs pgindent regularly, I think this will be a net
time saver, since I won't have to skip over other unrelated indent
changes all the time.

Regards,
Dean

Re: run pgindent on a regular basis / scripted manner

From

Thomas Munro

Date:

02 February 2023, 11:40:14

On Fri, Feb 3, 2023 at 12:35 AM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> And as someone who runs pgindent regularly, I think this will be a net
> time saver, since I won't have to skip over other unrelated indent
> changes all the time.

+1

Re: run pgindent on a regular basis / scripted manner

From

Amit Kapila

Date:

02 February 2023, 11:59:05

On Thu, Feb 2, 2023 at 5:05 PM Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> On Thu, 2 Feb 2023 at 06:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Noah Misch <noah@leadboat.com> writes:
> > > Regarding the concern about a pre-receive hook blocking an emergency push, the
> > > hook could approve every push where a string like "pgindent: no" appears in a
> > > commit message within the push.  You'd still want to make the tree clean
> > > sometime the same week or so.  It's cheap to provide a break-glass like that.
> >
> > I think the real question here is whether we can get all (or at least
> > a solid majority of) committers to accept such draconian constraints.
> > I'd buy into it, and evidently so would you, but I can't help noting
> > that less than a quarter of active committers have bothered to
> > comment on this thread.  I suspect the other three-quarters would
> > be quite annoyed if we tried to institute such requirements.
> >
>
> I didn't reply until now, but I'm solidly in the camp of committers
> who care about keeping the tree properly indented, and I wouldn't have
> any problem with such a check being imposed.
>
> I regularly run pgindent locally, and if I ever commit without
> indenting, it's either intentional, or because I forgot, so the
> reminder would be useful.
>
> And as someone who runs pgindent regularly, I think this will be a net
> time saver, since I won't have to skip over other unrelated indent
> changes all the time.
>

+1.

-- 
With Regards,
Amit Kapila.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

02 February 2023, 14:06:36

On 2023-02-02 Th 01:40, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
>> Regarding the concern about a pre-receive hook blocking an emergency push, the
>> hook could approve every push where a string like "pgindent: no" appears in a
>> commit message within the push.  You'd still want to make the tree clean
>> sometime the same week or so.  It's cheap to provide a break-glass like that.
> I think the real question here is whether we can get all (or at least
> a solid majority of) committers to accept such draconian constraints.
> I'd buy into it, and evidently so would you, but I can't help noting
> that less than a quarter of active committers have bothered to
> comment on this thread.  I suspect the other three-quarters would
> be quite annoyed if we tried to institute such requirements.  That's
> not manpower we can afford to drive away.


I'd be very surprised if this caused any active committer to walk away
from the project. Many will probably appreciate the nudge. But maybe I'm
overoptimistic.


>
> Maybe this should get taken up at the this-time-for-sure developer
> meeting at PGCon?
>
>             


Sure. There's probably some work that could still be done in this area
too, such as making pgperltidy work similarly to how we've now make
pgindent work.


There's also a question of timing. Possibly the best time would be when
we next fork the tree.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

02 February 2023, 15:00:58

Andrew Dunstan <andrew@dunslane.net> writes:
> Sure. There's probably some work that could still be done in this area
> too, such as making pgperltidy work similarly to how we've now make
> pgindent work.

Perhaps.  But before we commit to that, I'd like to see some tweaks to the
pgperltidy rules to make it less eager to revisit the formatting of lines
close to a change.  Its current behavior will induce a lot of "git blame"
noise if we apply these same procedures to Perl code.

(Should I mention reformat-dat-files?)

> There's also a question of timing. Possibly the best time would be when
> we next fork the tree.

Yeah.  We have generally not wanted to do a mass indent except
when there's a minimum amount of pending patches, ie after the last
CF of a cycle.  What I'd suggest is that we plan on doing a mass
indent and then switch over to the new rules, right after the March
CF closes.  That gives us a couple months to nail down and test out
the new procedures before they go live.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

02 February 2023, 20:22:55

On 2023-02-02 Th 10:00, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Sure. There's probably some work that could still be done in this area
>> too, such as making pgperltidy work similarly to how we've now make
>> pgindent work.
> Perhaps.  But before we commit to that, I'd like to see some tweaks to the
> pgperltidy rules to make it less eager to revisit the formatting of lines
> close to a change.  Its current behavior will induce a lot of "git blame"
> noise if we apply these same procedures to Perl code.


I haven't done anything about that yet, but I have reworked the script
so it's a lot more like pgindent, with --show-diff and --silent-diff
modes, and allowing a list of files to be indented on the command line.
Non-perl files are filtered out from such a list.


>
> (Should I mention reformat-dat-files?)


If you want I can add those flags there too.


>
>> There's also a question of timing. Possibly the best time would be when
>> we next fork the tree.
> Yeah.  We have generally not wanted to do a mass indent except
> when there's a minimum amount of pending patches, ie after the last
> CF of a cycle.  What I'd suggest is that we plan on doing a mass
> indent and then switch over to the new rules, right after the March
> CF closes.  That gives us a couple months to nail down and test out
> the new procedures before they go live.
>
>             


WFM. Of course then we're not waiting for the developer meeting.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgperltidy-new.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

03 February 2023, 16:44:46

On 2023-01-22 Su 17:47, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I strongly dislike it, I rarely get it right by hand - but it does have some
benefit over aligning variable names based on the length of the type names as
uncrustify/clang-format: In their approach an added local variable can cause
all the other variables to be re-indented (and their initial value possibly
wrapped). The fixed alignment doesn't have that issue.

Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
another assignment in a series of assignments, it is very likely to
reformat all the adjacent assignments because it thinks it's cool to
make all the equal signs line up.  That's just awful.  You can either
run pgperltidy on new code before committing, and accept that the feature
patch will touch a lot of lines it's not making real changes to (thereby
dirtying the "git blame" history) or not do so and thereby commit code
that's not passing tidiness checks.  Let's *not* adopt any style that
causes similar things to start happening in our C code.

Modern versions of perltidy give you much more control over this, so maybe we need to investigate the possibility of updating. See the latest docco at <https://metacpan.org/dist/Perl-Tidy/view/bin/perltidy#Completely-turning-off-vertical-alignment-with-novalign>

Probably we'd want to use something like

--valign-exclusion-list='= => ,'


cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

03 February 2023, 17:52:50

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-01-22 Su 17:47, Tom Lane wrote:
>> Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
>> another assignment in a series of assignments, it is very likely to
>> reformat all the adjacent assignments because it thinks it's cool to
>> make all the equal signs line up.  That's just awful.

> Modern versions of perltidy give you much more control over this, so 
> maybe we need to investigate the possibility of updating.

I have no objection to updating perltidy from time to time. I think the
idea is just to make sure that we have an agreed-on version for everyone
to use.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

04 February 2023, 02:38:08

On Thu, Feb 02, 2023 at 11:34:37AM +0000, Dean Rasheed wrote:
> I didn't reply until now, but I'm solidly in the camp of committers
> who care about keeping the tree properly indented, and I wouldn't have
> any problem with such a check being imposed.

So do I.  pgindent is part of my routine when it comes to all the
patches I merge on HEAD, and having to clean up unrelated diffs in
the files touched after an indentation is always annoying.

FWIW, I just use a script that does pgindent, pgperltidy, pgperlcritic
and `make reformat-dat-files` in src/include/catalog.

> I regularly run pgindent locally, and if I ever commit without
> indenting, it's either intentional, or because I forgot, so the
> reminder would be useful.
>
> And as someone who runs pgindent regularly, I think this will be a net
> time saver, since I won't have to skip over other unrelated indent
> changes all the time.

+1.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

04 February 2023, 04:18:48

On Fri, Feb 03, 2023 at 12:52:50PM -0500, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > On 2023-01-22 Su 17:47, Tom Lane wrote:
> >> Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
> >> another assignment in a series of assignments, it is very likely to
> >> reformat all the adjacent assignments because it thinks it's cool to
> >> make all the equal signs line up.  That's just awful.
> 
> > Modern versions of perltidy give you much more control over this, so 
> > maybe we need to investigate the possibility of updating.
> 
> I have no objection to updating perltidy from time to time. I think the
> idea is just to make sure that we have an agreed-on version for everyone
> to use.

Agreed.  If we're changing the indentation of assignments, that's a
considerable diff already.  It would be a good time to absorb other diffs
we'll want eventually, including diffs from a perltidy version upgrade.

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

04 February 2023, 11:34:53

Hi,

On 2023-02-03 20:18:48 -0800, Noah Misch wrote:
> On Fri, Feb 03, 2023 at 12:52:50PM -0500, Tom Lane wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> > > On 2023-01-22 Su 17:47, Tom Lane wrote:
> > >> Yeah.  That's one of my biggest gripes about pgperltidy: if you insert
> > >> another assignment in a series of assignments, it is very likely to
> > >> reformat all the adjacent assignments because it thinks it's cool to
> > >> make all the equal signs line up.  That's just awful.
> > 
> > > Modern versions of perltidy give you much more control over this, so 
> > > maybe we need to investigate the possibility of updating.
> > 
> > I have no objection to updating perltidy from time to time. I think the
> > idea is just to make sure that we have an agreed-on version for everyone
> > to use.
> 
> Agreed.  If we're changing the indentation of assignments, that's a
> considerable diff already.  It would be a good time to absorb other diffs
> we'll want eventually, including diffs from a perltidy version upgrade.

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files.  So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

04 February 2023, 14:20:26

On 2023-02-04 Sa 06:34, Andres Freund wrote:

ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files.  So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?

Sure, we don't have to tie them together.

I'm currently experimenting with settings on the buildfarm code, trying to get it both stable and looking nice. Then I'll try on the Postgres core code and post some results.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

04 February 2023, 16:07:59

Andres Freund <andres@anarazel.de> writes:
> ISTM that we're closer to being able to enforce pgindent than
> perltidy. At the same time, I think the issue of C code in HEAD not
> being indented is more pressing - IME it's much more common to have to
> touch a lot of C code than to have to touch a lot fo perl files.  So
> perhaps we should just start with being more stringent with C code, and
> once we made perltidy less noisy, add that?

Agreed, we should move more slowly with perltidy.  Aside from the
points you raise, I bet fewer committers have it installed at all.

(I haven't forgotten that I'm on the hook to import pg_bsd_indent
into our tree.  Will get to that soon.)

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Justin Pryzby

Date:

04 February 2023, 17:11:01

On Sat, Feb 04, 2023 at 11:07:59AM -0500, Tom Lane wrote:
> (I haven't forgotten that I'm on the hook to import pg_bsd_indent
> into our tree.  Will get to that soon.)

+1 for that - it's no surprise that you have trouble convincing people
to follow the current process:

1) requires using a hacked copy of BSD indent; 2) which is stored
outside the main repo; 3) is run via a perl script that itself mungles
the source code (because the only indent tool that can support the
project's style doesn't actually support what's needed); 4) and wants to
retrieve a remote copy of typedefs.list (?). 

The only thing that makes this scheme even remotely viable is that
apt.postgresql.org includes a package for pg-bsd-indent.  I've used it
only a handful of times by running:
pg_bsd_indent -bad -bap -bbb -bc -bl -cli1 -cp33 -cdb -nce -d0 -di12 -nfc1 -i4 -l79 -lp -lpl -nip -npro -sac -tpg -ts4
-U.../typedefs.list

The perl wrapper is still a step too far for me (maybe it'd be tolerable
if available as a build target).

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

-- 
Justin

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

04 February 2023, 17:37:11

Justin Pryzby <pryzby@telsasoft.com> writes:
> Would you want to make those the default options of the in-tree indent ?
> Or provide a shortcut like --postgresql ?

Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that).  I could see inventing a --postgresql shortcut switch perhaps.

But it's not clear to me why you're allergic to the perl wrapper?
It's not like that's the only perl infrastructure in our build process.
Also, whether or not we could push some of what it does into pg_bsd_indent
proper, I can't see pushing all of it (for instance, the very PG-specific
list of typedef exclusions).

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

05 February 2023, 14:29:08

On 2023-02-04 Sa 09:20, Andrew Dunstan wrote:

On 2023-02-04 Sa 06:34, Andres Freund wrote:
ISTM that we're closer to being able to enforce pgindent than
perltidy. At the same time, I think the issue of C code in HEAD not
being indented is more pressing - IME it's much more common to have to
touch a lot of C code than to have to touch a lot fo perl files.  So
perhaps we should just start with being more stringent with C code, and
once we made perltidy less noisy, add that?
Sure, we don't have to tie them together.
I'm currently experimenting with settings on the buildfarm code, trying to get it both stable and looking nice. Then I'll try on the Postgres core code and post some results.

So here's a diff made from running perltidy v20221112 with the additional setting --valign-exclusion-list=", = => || && if unless"

Essentially this abandons those bits of vertical alignment that tend to catch us out when additions are made to the code.

I think this will make the code much more maintainable and result in much less perltidy churn. It would also mean that it's far more likely that what we would naturally code can be undisturbed by perltidy.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgperltidy-no-valign.patch.gz

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

06 February 2023, 14:40:02

On Sat, Feb 4, 2023 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But it's not clear to me why you're allergic to the perl wrapper?
> It's not like that's the only perl infrastructure in our build process.
> Also, whether or not we could push some of what it does into pg_bsd_indent
> proper, I can't see pushing all of it (for instance, the very PG-specific
> list of typedef exclusions).

I don't mind that there is a script. I do mind that it's not that good
of a script. There have been some improvements for which I am
grateful, like removing the thing where the first argument was taken
as a typedefs file under some circumstances. But there are still some
things that I would like:

1. I'd like to be able to run pgindent src/include and have it indent
everything relevant under src/include. Right now that silently does
nothing.

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

3. I'd also like an easy way to indent every file touched by a recent
commit, e.g. pgindent --commit HEAD, pgindent --commit HEAD~2,
pgindent --commit 62e1e28bf7.

It'd be much less annoying to include this in my workflow with these
kinds of options. For instance:

patch -p1 < ~/Downloads/some_stuff_v94.patch
# committer adjustments as desired
git add -u
pgindent --cached
git diff # did pgindent change anything? does it look ok?
git commit -a

For a while I, like some others here, was trying to be religious about
pgindenting at least the bigger commits that I pushed. But I fear I've
grown slack. I don't mind if we tighten up the process, but the better
we make the tools, the less friction it will cause.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

06 February 2023, 15:16:05

Robert Haas <robertmhaas@gmail.com> writes:
> I don't mind that there is a script. I do mind that it's not that good
> of a script. There have been some improvements for which I am
> grateful, like removing the thing where the first argument was taken
> as a typedefs file under some circumstances. But there are still some
> things that I would like:

I have no objection to someone coding those things up ;-).
I'll just note that adding features like those to a Perl script
is going to be a ton easier than doing it inside pg_bsd_indent.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

06 February 2023, 15:21:07

On 2023-02-06 Mo 09:40, Robert Haas wrote:

On Sat, Feb 4, 2023 at 12:37 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

But it's not clear to me why you're allergic to the perl wrapper?
It's not like that's the only perl infrastructure in our build process.
Also, whether or not we could push some of what it does into pg_bsd_indent
proper, I can't see pushing all of it (for instance, the very PG-specific
list of typedef exclusions).

I don't mind that there is a script. I do mind that it's not that good
of a script. There have been some improvements for which I am
grateful, like removing the thing where the first argument was taken
as a typedefs file under some circumstances. But there are still some
things that I would like:

1. I'd like to be able to run pgindent src/include and have it indent
everything relevant under src/include. Right now that silently does
nothing.

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

3. I'd also like an easy way to indent every file touched by a recent
commit, e.g. pgindent --commit HEAD, pgindent --commit HEAD~2,
pgindent --commit 62e1e28bf7.

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

06 February 2023, 15:35:57

On Mon, Feb 6, 2023 at 10:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'll just note that adding features like those to a Perl script
> is going to be a ton easier than doing it inside pg_bsd_indent.

+1.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

06 February 2023, 15:36:18

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

Thanks!

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

06 February 2023, 17:03:47

On 2023-02-06 Mo 10:36, Robert Haas wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.

Thanks!

Here's a quick patch for 1 and 3. Would also need to adjust the docco.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

06 February 2023, 17:17:02

On 02.02.23 07:40, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
>> Regarding the concern about a pre-receive hook blocking an emergency push, the
>> hook could approve every push where a string like "pgindent: no" appears in a
>> commit message within the push.  You'd still want to make the tree clean
>> sometime the same week or so.  It's cheap to provide a break-glass like that.
> 
> I think the real question here is whether we can get all (or at least
> a solid majority of) committers to accept such draconian constraints.
> I'd buy into it, and evidently so would you, but I can't help noting
> that less than a quarter of active committers have bothered to
> comment on this thread.  I suspect the other three-quarters would
> be quite annoyed if we tried to institute such requirements.  That's
> not manpower we can afford to drive away.

I have some concerns about this.

First, as a matter of principle, it would introduce another level of 
gatekeeping power.  Right now, the committers are as a group in charge 
of what gets into the tree.  Adding commit hooks that are installed 
somewhere(?) by someone(?) and can only be seen by some(?) would upset 
that.  If we were using something like github or gitlab (not suggesting 
that, but for illustration), then you could put this kind of thing under 
.github/ or similar and then it would be under the same control as the 
source code itself.

Also, pgindent takes tens of seconds to run, so hooking that into the 
git push process would slow this down quite a bit.  And maybe we want to 
add pgperltidy and so on, where would this lead?  If somehow your local 
indenting doesn't give you the "correct" result for some reason, you 
might sit there for minutes and minutes trying to fix and push and fix 
and push.

Then, consider the typedefs issue.  If you add a typedef but don't add 
it to the typedefs list but otherwise pgindent your code perfectly, the 
push would be accepted.  If then later someone updates the typedefs 
list, perhaps from the build farm, it would then reject the indentation 
of your previously committed code, thus making it their problem.

I think a better way to address these issues would be making this into a 
test suite, so that you can run some command that checks "is everything 
indented correctly".  Then you can run this locally, on the build farm, 
in the cfbot etc. in a uniform way and apply the existing 
blaming/encouragement processes like for any other test failure.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

06 February 2023, 17:53:44

On 2023-02-06 Mo 12:03, Andrew Dunstan wrote:

On 2023-02-06 Mo 10:36, Robert Haas wrote:
On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:
Good suggestions. 1 and 3 seem fairly straightforward. I'll start on those, and look into 2.
Thanks!
Here's a quick patch for 1 and 3. Would also need to adjust the docco.

This time with patch.

cheers

andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-directory+git-enhancements.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

06 February 2023, 21:13:24

On 2023-02-06 Mo 12:17, Peter Eisentraut wrote:

On 02.02.23 07:40, Tom Lane wrote:
Noah Misch <noah@leadboat.com> writes:
Regarding the concern about a pre-receive hook blocking an emergency push, the
hook could approve every push where a string like "pgindent: no" appears in a
commit message within the push. You'd still want to make the tree clean
sometime the same week or so. It's cheap to provide a break-glass like that.

I think the real question here is whether we can get all (or at least
a solid majority of) committers to accept such draconian constraints.
I'd buy into it, and evidently so would you, but I can't help noting
that less than a quarter of active committers have bothered to
comment on this thread. I suspect the other three-quarters would
be quite annoyed if we tried to institute such requirements. That's
not manpower we can afford to drive away.

I have some concerns about this.

First, as a matter of principle, it would introduce another level of gatekeeping power. Right now, the committers are as a group in charge of what gets into the tree. Adding commit hooks that are installed somewhere(?) by someone(?) and can only be seen by some(?) would upset that. If we were using something like github or gitlab (not suggesting that, but for illustration), then you could put this kind of thing under .github/ or similar and then it would be under the same control as the source code itself.

Also, pgindent takes tens of seconds to run, so hooking that into the git push process would slow this down quite a bit. And maybe we want to add pgperltidy and so on, where would this lead? If somehow your local indenting doesn't give you the "correct" result for some reason, you might sit there for minutes and minutes trying to fix and push and fix and push.

Well, pgindent should produce canonical results or we're surely doing it wrong. Regarding the time it takes, if we are only indenting the changed files that time will be vastly reduced for most cases.

But I take your point to some extent. I think we should start by making it easier and quicker to run pgindent locally, both by hand and in local git hooks, for ordinary developers and for committers, and we should encourage committers to be stricter in their use of pgindent. If there are features we need to make this possible, speak up (c.f. Robert's email earlier today). I'm committed to making this as easy as possible for people.

Once we get over those hurdles we can possibly revisit automation.

Then, consider the typedefs issue. If you add a typedef but don't add it to the typedefs list but otherwise pgindent your code perfectly, the push would be accepted. If then later someone updates the typedefs list, perhaps from the build farm, it would then reject the indentation of your previously committed code, thus making it their problem.

It would be nice if there were a gadget that would find new typedefs and warn you about them. Unfortunately our current code to find typedefs isn't all that fast either. Nicer still would be a way of not needing the typedefs list, but I don't think anyone has come up with one yet that meets our other requirements.

I think a better way to address these issues would be making this into a test suite, so that you can run some command that checks "is everything indented correctly". Then you can run this locally, on the build farm, in the cfbot etc. in a uniform way and apply the existing blaming/encouragement processes like for any other test failure.

Well arguably the new --silent-diff and --show-diff modes are such tests :-)

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

06 February 2023, 21:21:57

Hi,

On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
> First, as a matter of principle, it would introduce another level of
> gatekeeping power.  Right now, the committers are as a group in charge of
> what gets into the tree.  Adding commit hooks that are installed
> somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
> If we were using something like github or gitlab (not suggesting that, but
> for illustration), then you could put this kind of thing under .github/ or
> similar and then it would be under the same control as the source code
> itself.

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

> Also, pgindent takes tens of seconds to run, so hooking that into the git
> push process would slow this down quite a bit.  And maybe we want to add
> pgperltidy and so on, where would this lead?

Yes, I've been annoyed by this as well. This is painful, even without a push
hook.  Not just for pgindent, headerscheck/cpluspluscheck are quite painful as
well. I came to the conclusion that we ought to integrate pgindent etc into
the buildsystem properly. Instead of running such targets serially across all
files, without logic to prevent re-processing files, the relevant targets
ought to be run once for each process, and create a stamp file.

> If somehow your local indenting doesn't give you the "correct" result for
> some reason, you might sit there for minutes and minutes trying to fix and
> push and fix and push.

I was imagining that such a pre-receive hook would spit out the target that
you'd need to run locally to verify that the issue is resolved.

> Then, consider the typedefs issue.  If you add a typedef but don't add it to
> the typedefs list but otherwise pgindent your code perfectly, the push would
> be accepted.  If then later someone updates the typedefs list, perhaps from
> the build farm, it would then reject the indentation of your previously
> committed code, thus making it their problem.

I'd like to address this one via the buildsystem as well. We can do the
trickery that the buildfarm uses to extract typedefs as part of the build, and
update typedefs.list with the additional types.  There's really no need to
force us to do this manually.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

06 February 2023, 21:36:07

Andres Freund <andres@anarazel.de> writes:
> On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
>> First, as a matter of principle, it would introduce another level of
>> gatekeeping power.  Right now, the committers are as a group in charge of
>> what gets into the tree.  Adding commit hooks that are installed
>> somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
>> If we were using something like github or gitlab (not suggesting that, but
>> for illustration), then you could put this kind of thing under .github/ or
>> similar and then it would be under the same control as the source code
>> itself.

> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.

Yeah.  I don't think we are seriously considering putting any restrictions
in place on gitmaster --- the idea is to offer better tools to committers
to let them check/fix the indentation of what they are working on.  If
somebody wants to run that as a local pre-commit hook, that's their choice.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

07 February 2023, 04:43:24

On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> Also, pgindent takes tens of seconds to run, so hooking that into the git
> push process would slow this down quite a bit.

The pre-receive hook would do a full pgindent when you change typedefs.list.
Otherwise, it would reindent only the files being changed.  The average push
need not take tens of seconds.

> If somehow your local
> indenting doesn't give you the "correct" result for some reason, you might
> sit there for minutes and minutes trying to fix and push and fix and push.

As Andres mentioned, the hook could print the command it used.  It could even
print the diff it found, for you to apply.

On Mon, Feb 06, 2023 at 04:36:07PM -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-02-06 18:17:02 +0100, Peter Eisentraut wrote:
> >> First, as a matter of principle, it would introduce another level of
> >> gatekeeping power.  Right now, the committers are as a group in charge of
> >> what gets into the tree.  Adding commit hooks that are installed
> >> somewhere(?) by someone(?) and can only be seen by some(?) would upset that.
> >> If we were using something like github or gitlab (not suggesting that, but
> >> for illustration), then you could put this kind of thing under .github/ or
> >> similar and then it would be under the same control as the source code
> >> itself.
> 
> > Well, we did talk about adding a pre-commit hook to the repository, with
> > instructions for how to enable it. And I don't see a problem with adding the
> > pre-receive we're discussing here to src/tools/something.
> 
> Yeah.  I don't think we are seriously considering putting any restrictions
> in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

07 February 2023, 11:45:45

On 2023-02-06 Mo 23:43, Noah Misch wrote:

Well, we did talk about adding a pre-commit hook to the repository, with
instructions for how to enable it. And I don't see a problem with adding the
pre-receive we're discussing here to src/tools/something.

Yeah.  I don't think we are seriously considering putting any restrictions
in place on gitmaster

I could have sworn that was exactly what we were discussing, a pre-receive
hook on gitmaster.

That's one idea that's been put forward, but it seems clear that some people are nervous about it.

Maybe a better course would be to continue improving the toolset and get more people comfortable with using it locally and then talk about integrating it upstream.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Amit Kapila

Date:

07 February 2023, 12:56:43

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>
>
> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.
>
> Yeah.  I don't think we are seriously considering putting any restrictions
> in place on gitmaster
>
> I could have sworn that was exactly what we were discussing, a pre-receive
> hook on gitmaster.
>
>
> That's one idea that's been put forward, but it seems clear that some people are nervous about it.
>
> Maybe a better course would be to continue improving the toolset and get more people comfortable with using it
locallyand then talk about integrating it upstream.
 
>

Yeah, that sounds more reasonable to me as well.

-- 
With Regards,
Amit Kapila.

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

07 February 2023, 12:59:53

On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>
>
> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.
>
> Yeah. I don't think we are seriously considering putting any restrictions
> in place on gitmaster
>
> I could have sworn that was exactly what we were discussing, a pre-receive
> hook on gitmaster.
>
>
> That's one idea that's been put forward, but it seems clear that some people are nervous about it.
>
> Maybe a better course would be to continue improving the toolset and get more people comfortable with using it locally and then talk about integrating it upstream.
>

Yeah, that sounds more reasonable to me as well.

If we wanted something "in between" we could perhaps also have a async ci job that runs after each commit and sends an emali to the committer if the commit doesn't match up, instead of rejecting it hard but still getting some relatively fast feedback.

Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

07 February 2023, 13:17:49

On 2023-02-06 Mo 09:40, Robert Haas wrote:

2. I'd like an easy way to indent the unstaged files in the current
directory (e.g. pgindent --dirty) or the files that have been queued
up for commit (e.g. pgindent --cached).

My git-fu is probably not all that it should be. I think we could possibly get at this list of files by running

  git status --porcelain --untracked-files=no --ignored=no -- .

And then your --dirty list would be lines beginning with ' M' while your --cached list would be lines beginning with 'A[ M]'

Does that seem plausible?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

07 February 2023, 13:33:33

On 2023-02-07 Tu 07:59, Magnus Hagander wrote:

On Tue, Feb 7, 2023 at 1:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>
>
> Well, we did talk about adding a pre-commit hook to the repository, with
> instructions for how to enable it. And I don't see a problem with adding the
> pre-receive we're discussing here to src/tools/something.
>
> Yeah. I don't think we are seriously considering putting any restrictions
> in place on gitmaster
>
> I could have sworn that was exactly what we were discussing, a pre-receive
> hook on gitmaster.
>
>
> That's one idea that's been put forward, but it seems clear that some people are nervous about it.
>
> Maybe a better course would be to continue improving the toolset and get more people comfortable with using it locally and then talk about integrating it upstream.
>

Yeah, that sounds more reasonable to me as well.

If we wanted something "in between" we could perhaps also have a async ci job that runs after each commit and sends an emali to the committer if the commit doesn't match up, instead of rejecting it hard but still getting some relatively fast feedback.

Sure, worth trying. We can always turn it off and no harm done if it doesn't suit. I'd probably start by having it email a couple of guinea pigs like you and me before turning it loose on committers generally. LMK if you need help with it.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

07 February 2023, 15:17:44

Amit Kapila <amit.kapila16@gmail.com> writes:
> On Tue, Feb 7, 2023 at 5:16 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> On 2023-02-06 Mo 23:43, Noah Misch wrote:
>>>> Yeah.  I don't think we are seriously considering putting any restrictions
>>>> in place on gitmaster

>>> I could have sworn that was exactly what we were discussing, a pre-receive
>>> hook on gitmaster.

>> That's one idea that's been put forward, but it seems clear that some people are nervous about it.
>> Maybe a better course would be to continue improving the toolset and get more people comfortable with using it
locallyand then talk about integrating it upstream. 

> Yeah, that sounds more reasonable to me as well.

+1.  Even if we end up with such a hook, we need to walk before we
can run.  The tooling of which we speak doesn't even exist today,
so it's unlikely to be bug-free, fast, and convenient to use
just two months from now.  Let's have some people use whatever
is proposed for awhile locally, and see what their experience is.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Justin Pryzby

Date:

07 February 2023, 15:25:42

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:
> Justin Pryzby <pryzby@telsasoft.com> writes:

> Hmmm ... inserting all of those as the default options would likely
> make it impossible to update pg_bsd_indent itself with anything like
> its current indent style (not that it's terribly consistent about
> that).  I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.

> > Would you want to make those the default options of the in-tree indent ?
> > Or provide a shortcut like --postgresql ?
> 
> But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports.  But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "ident"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool.  The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code.  That could be
supported in pg-bsd-indent with a one line change.  I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.

On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".

-- 
Justin

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

07 February 2023, 15:46:01

Justin Pryzby <pryzby@telsasoft.com> writes:
> On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:
>> But it's not clear to me why you're allergic to the perl wrapper?

> My allergy is to the totality of the process, not to the perl component.
> It's a bit weird to enforce a coding style that no upstream indent tool
> supports.  But what's weirder is that, *having forked the indent tool*,
> it still doesn't implement the desired style, and the perl wrapper tries
> to work around that.

> It would be more reasonable if the forked C program knew how to handle
> the stuff for which the perl script currently has kludges to munge the
> source code before indenting and then un-munging afterwards.

[ shrug... ]  If you want to put cycles into that, nobody is stopping
you.  For me, it sounds like make-work.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

07 February 2023, 15:46:51

On 2023-02-07 Tu 10:25, Justin Pryzby wrote:

On Sat, Feb 04, 2023 at 12:37:11PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

Hmmm ... inserting all of those as the default options would likely
make it impossible to update pg_bsd_indent itself with anything like
its current indent style (not that it's terribly consistent about
that).  I could see inventing a --postgresql shortcut switch perhaps.

Or you could add ./.indent.pro, or ./src/tools/indent.profile for it to
read.

Would you want to make those the default options of the in-tree indent ?
Or provide a shortcut like --postgresql ?

But it's not clear to me why you're allergic to the perl wrapper?

My allergy is to the totality of the process, not to the perl component.
It's a bit weird to enforce a coding style that no upstream indent tool
supports.  But what's weirder is that, *having forked the indent tool*,
it still doesn't implement the desired style, and the perl wrapper tries
to work around that.

It would be more reasonable if the forked C program knew how to handle
the stuff for which the perl script currently has kludges to munge the
source code before indenting and then un-munging afterwards.

Or if the indentation were handled by the (or a) perl script itself.

Or if the perl script handled everything that an unpatched "ident"
didn't handle, rather than some things but not others, demanding use of
a patched indent as well as a perl wrapper (not just for looping around
files and fancy high-level shortcuts like indenting staged files).

On the one hand, "indent" ought to handle all the source-munging stuff.
On the other hand, it'd be better to use an unpatched indent tool.  The
current way is the worst of both worlds.

Currently, the perl wrapper supports the "/* -"-style comments that
postgres wants to use (why?) by munging the source code.  That could be
supported in pg-bsd-indent with a one line change.  I think an even
better option would be to change postgres' C files to use "/*-" without
a space, which requires neither perl munging nor patching indent.

Historically we used to do a heck of a lot more in pgindent that is currently done in the pre_indent and post_indent functions. If you want to spend time implementing that logic in pg_bsd_indent so we can remove the remaining bits of that processing then go for it.

On a less critical note, I wonder if it's a good idea to import
pgbsdindent as a git "submodule".

Meh, git submodules can be a pain in the neck in my limited experience. I'd rather steer clear.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

07 February 2023, 16:10:56

On Tue, Feb 7, 2023 at 8:17 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> My git-fu is probably not all that it should be. I think we could possibly get at this list of files by running
>
>   git status --porcelain --untracked-files=no --ignored=no -- .
>
> And then your --dirty list would be lines beginning with ' M' while your --cached list would be lines beginning with
'A[M]'

>
> Does that seem plausible?

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

So to indent files that are dirty, you would look at:

git diff --name-status

For what's cached:

git diff --name-status --cached

For the combination of the two:

git diff --name-status HEAD

For a prior commit:

git show --name-status $COMMITID

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

07 February 2023, 16:32:30

On Tue, 7 Feb 2023 at 17:11, Robert Haas <robertmhaas@gmail.com> wrote:
> I don't know if that works or not, but it does seem plausible, at
> least. My idea would have been to use the --name-status option, which
> works for both git diff and git show. You just look and see which
> lines in the output start with M or A and then take the file names
> from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

07 February 2023, 16:57:31

On Tue, Feb 7, 2023 at 11:32 AM Jelte Fennema <postgres@jeltef.nl> wrote:
> On Tue, 7 Feb 2023 at 17:11, Robert Haas <robertmhaas@gmail.com> wrote:
> > I don't know if that works or not, but it does seem plausible, at
> > least. My idea would have been to use the --name-status option, which
> > works for both git diff and git show. You just look and see which
> > lines in the output start with M or A and then take the file names
> > from those lines.
>
> If you add `--diff-filter=ACMR`, then git diff/show will only show
> Added, Copied, Modified, and Renamed files.
>
> The pre-commit hook that Andrew added to the wiki uses that in
> combination with --name-only to get the list of files that you want to
> check on commit:
> https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

Thanks, that sounds nicer than what I suggested.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

07 February 2023, 17:21:19

> On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>
> Here's a quick patch for 1 and 3. Would also need to adjust the docco.
>
>
>
> This time with patch.

When supplying the --commit flag it still formats all files for me. I
was able to fix that by replacing:
# no non-option arguments given. so do everything in the current directory
$code_base ||= '.' unless @ARGV;

with:
# no files, dirs or commits given. so do everything in the current directory
$code_base ||= '.' unless @ARGV || @commits;

Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

08 February 2023, 12:41:59

On 2023-02-07 Tu 12:21, Jelte Fennema wrote:

On Mon, Feb 6, 2023 at 10:21 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Here's a quick patch for 1 and 3. Would also need to adjust the docco.



This time with patch.

When supplying the --commit flag it still formats all files for me. I
was able to fix that by replacing:
# no non-option arguments given. so do everything in the current directory
$code_base ||= '.' unless @ARGV;

with:
# no files, dirs or commits given. so do everything in the current directory
$code_base ||= '.' unless @ARGV || @commits;

Yeah, thanks for testing. Here's a new patch with that change and the comment adjusted.


Does the code-base flag still make sense if you can simply pass a
directory as regular args now?

Probably not. I'll look into removing it.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-directory+git-enhancements-v2.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

08 February 2023, 13:27:07

On 2023-02-08 We 07:41, Andrew Dunstan wrote:

On 2023-02-07 Tu 12:21, Jelte Fennema wrote:
Does the code-base flag still make sense if you can simply pass a
directory as regular args now?
Probably not. I'll look into removing it.

What we should probably do is remove all the build stuff along with $code_base. It dates back to the time when I developed this as an out of tree replacement for the old pgindent, and is just basically wasted space now. After I get done with the current round of enhancements I'll reorganize the script and get rid of things we don't need any more.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

08 February 2023, 17:06:22

With the new patch --commit works as expected for me now. And sounds
good to up the script a bit afterwards.

On Wed, 8 Feb 2023 at 14:27, Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2023-02-08 We 07:41, Andrew Dunstan wrote:
>
>
> On 2023-02-07 Tu 12:21, Jelte Fennema wrote:
>
>
> Does the code-base flag still make sense if you can simply pass a
> directory as regular args now?
>
>
> Probably not. I'll look into removing it.
>
>
>
>
> What we should probably do is remove all the build stuff along with $code_base. It dates back to the time when I
developedthis as an out of tree replacement for the old pgindent, and is just basically wasted space now. After I get
donewith the current round of enhancements I'll reorganize the script and get rid of things we don't need any more. 
>
>
> cheers
>
>
> andrew
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

08 February 2023, 22:09:45

On 2023-02-08 We 12:06, Jelte Fennema wrote:

With the new patch --commit works as expected for me now. And sounds
good to up the script a bit afterwards.

Thanks, I have committed this. Still looking at Robert's other request.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

RE: run pgindent on a regular basis / scripted manner

From

"Shinoda, Noriyoshi (PN Japan FSIP)"

Date:

09 February 2023, 02:29:41

Hi,
I tried the committed pgindent.
The attached small patch changes spaces in the usage message to tabs.
Options other than --commit start with a tab.

Regards,

Noriyoshi Shinoda

From: Andrew Dunstan <andrew@dunslane.net>
Sent: Thursday, February 9, 2023 7:10 AM
To: Jelte Fennema <postgres@jeltef.nl>
Cc: Robert Haas <robertmhaas@gmail.com>; Tom Lane <tgl@sss.pgh.pa.us>; Justin Pryzby <pryzby@telsasoft.com>; Andres Freund <andres@anarazel.de>; Noah Misch <noah@leadboat.com>; Peter Geoghegan <pg@bowt.ie>; Bruce Momjian <bruce@momjian.us>; Magnus Hagander <magnus@hagander.net>; Alvaro Herrera <alvherre@2ndquadrant.com>; Stephen Frost <sfrost@snowman.net>; Jesse Zhang <sbjesse@gmail.com>; pgsql-hackers@postgresql.org
Subject: Re: run pgindent on a regular basis / scripted manner

On 2023-02-08 We 12:06, Jelte Fennema wrote:

With the new patch --commit works as expected for me now. And sounds

good to up the script a bit afterwards.

Thanks, I have committed this. Still looking at Robert's other request.

cheers

andrew

--

Andrew Dunstan

EDB: https://www.enterprisedb.com

Attachment

pgindent_usage_v1.diff

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

09 February 2023, 18:34:34

On 2023-02-08 We 21:29, Shinoda, Noriyoshi (PN Japan FSIP) wrote:

@font-face {font-family:"ＭＳ明朝"; panose-1:2 2 6 9 4 2 5 8 3 4;}@font-face {font-family:"ＭＳゴシック"; panose-1:2 11 6 9 7 2 5 8 2 4;}@font-face {font-family:"Cambria Math"; panose-1:2 4 5 3 5 4 6 3 2 4;}@font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4;}@font-face {font-family:"ＭＳＰゴシック"; panose-1:2 11 6 0 7 2 5 8 2 4;}@font-face {font-family:"\@ＭＳＰゴシック"; panose-1:2 11 6 0 7 2 5 8 2 4;}@font-face {font-family:"\@ＭＳ明朝"; panose-1:2 2 6 9 4 2 5 8 3 4;}@font-face {font-family:"\@ＭＳゴシック"; panose-1:2 11 6 9 7 2 5 8 2 4;}p.MsoNormal, li.MsoNormal, div.MsoNormal {margin:0mm; font-size:12.0pt; font-family:"ＭＳＰゴシック";}a:link, span.MsoHyperlink {mso-style-priority:99; color:blue; text-decoration:underline;}pre {mso-style-priority:99; mso-style-link:"HTML 書式付き $文字$"; margin:0mm; font-size:12.0pt; font-family:"ＭＳゴシック";}span.HTML {mso-style-name:"HTML 書式付き $文字$"; mso-style-priority:99; mso-style-link:"HTML 書式付き"; font-family:"Courier New";}.MsoChpDefault {mso-style-type:export-only; font-size:10.0pt;}div.WordSection1 {page:WordSection1;}
Hi,
I tried the committed pgindent.
The attached small patch changes spaces in the usage message to tabs.
Options other than --commit start with a tab.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

RE: run pgindent on a regular basis / scripted manner

From

"shiy.fnst@fujitsu.com"

Date:

10 February 2023, 02:37:40

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> Thanks, I have committed this. Still looking at Robert's other request.
>

Hi,

I tried the new option --commit and found that it seems to try to indent files
which are deleted in the specified commit and reports an error.

cannot open file "src/backend/access/brin/brin.c": No such file or directory

It looks we should filter such files.

Regards,
Shi Yu

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

10 February 2023, 09:25:44

Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.

On Fri, 10 Feb 2023 at 03:37, shiy.fnst@fujitsu.com
<shiy.fnst@fujitsu.com> wrote:
>
> On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> > Thanks, I have committed this. Still looking at Robert's other request.
> >
>
> Hi,
>
> I tried the new option --commit and found that it seems to try to indent files
> which are deleted in the specified commit and reports an error.
>
> cannot open file "src/backend/access/brin/brin.c": No such file or directory
>
> It looks we should filter such files.
>
> Regards,
> Shi Yu

Attachment

0001-Fix-pgindent-commit-flag-and-add-dirty-and-staged-fl.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

10 February 2023, 14:26:50

On 2023-02-07 Tu 11:32, Jelte Fennema wrote:

On Tue, 7 Feb 2023 at 17:11, Robert Haas <robertmhaas@gmail.com> wrote:

I don't know if that works or not, but it does seem plausible, at
least. My idea would have been to use the --name-status option, which
works for both git diff and git show. You just look and see which
lines in the output start with M or A and then take the file names
from those lines.

If you add `--diff-filter=ACMR`, then git diff/show will only show
Added, Copied, Modified, and Renamed files.

The pre-commit hook that Andrew added to the wiki uses that in
combination with --name-only to get the list of files that you want to
check on commit:
https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

OK, here's a patch based on Robert's and Jelte's ideas.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-dirty-cached.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

10 February 2023, 15:21:49

On 2023-02-10 Fr 04:25, Jelte Fennema wrote:

Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.

[please don't top-post]

I don't think just adding a diff filter is really a sufficient fix. The file might have been deleted since the commit(s) in question. Here's a more general fix for missing files.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-missing-file-fix.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

12 February 2023, 14:16:25

On 2023-02-10 Fr 10:21, Andrew Dunstan wrote:

On 2023-02-10 Fr 04:25, Jelte Fennema wrote:
Ah yes, I had seen that when I read the initial --commit patch but
then forgot about it when the flag didn't work at all when I tried it.

Attached is a patch that fixes the issue. And also implements the
--dirty and --staged flags in pgindent that Robert Haas requested.
I don't think just adding a diff filter is really a sufficient fix. The file might have been deleted since the commit(s) in question. Here's a more general fix for missing files.

OK, I've pushed this along with a check to make sure we only process each file once.

I'm not sure how much more I really want to do here. Given the way pgindent now processes command line arguments, maybe the best thing is for people to use that. Use of git aliases can help. Something like these for example

[alias]

    dirty = diff --name-only --diff-filter=ACMU -- .
   staged = diff --name-only --cached --diff-filter=ACMU -- .
   dstaged = diff --name-only --diff-filter=ACMU HEAD -- .

and then you could do

pgindent `git dirty`

The only danger would be if there were no dirty files. Maybe we need a switch to inhibit using the current directory if there are no command line files.

Thoughts?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 February 2023, 16:24:14

Andrew Dunstan <andrew@dunslane.net> writes:
> ... then you could do
>      pgindent `git dirty`
> The only danger would be if there were no dirty files. Maybe we need a 
> switch to inhibit using the current directory if there are no command 
> line files.

It seems like "indent the whole tree" is about to become a minority
use-case.  Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

12 February 2023, 20:41:56

On 2023-02-12 Su 11:24, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

... then you could do     pgindent `git dirty`
The only danger would be if there were no dirty files. Maybe we need a 
switch to inhibit using the current directory if there are no command 
line files.

It seems like "indent the whole tree" is about to become a minority
use-case.  Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

I don't think we need --all-files. The attached gets rid of the build and code-base cruft, which is now in any case obsolete given we've put pg_bsd_indent in our code base. So the way to spell this instead of "pgindent --all-files" would be "pgindent ."

I added a warning if there are no files at all specified.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pgindent-remove-build-and-code-base.patch

Re: run pgindent on a regular basis / scripted manner

From

Justin Pryzby

Date:

12 February 2023, 20:59:36

On Sun, Feb 12, 2023 at 11:24:14AM -0500, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > ... then you could do
> >      pgindent `git dirty`
> > The only danger would be if there were no dirty files. Maybe we need a 
> > switch to inhibit using the current directory if there are no command 
> > line files.
> 
> It seems like "indent the whole tree" is about to become a minority
> use-case.  Maybe instead of continuing to privilege that case, we
> should say that it's invoked by some new switch like --all-files,
> and without that only the stuff identified by command-line arguments
> gets processed.

It seems like if pgindent knows about git, it ought to process only
tracked files.  Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

At least --commit doesn't seem to work when run outside of the root
source dir.

-- 
Justin

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 February 2023, 21:13:35

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-02-12 Su 11:24, Tom Lane wrote:
>> It seems like "indent the whole tree" is about to become a minority
>> use-case.  Maybe instead of continuing to privilege that case, we
>> should say that it's invoked by some new switch like --all-files,
>> and without that only the stuff identified by command-line arguments
>> gets processed.

> I don't think we need --all-files. The attached gets rid of the build 
> and code-base cruft, which is now in any case obsolete given we've put 
> pg_bsd_indent in our code base. So the way to spell this instead of 
> "pgindent --all-files" would be "pgindent ."

Ah, of course.

> I added a warning if there are no files at all specified.

LGTM.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

13 February 2023, 12:57:25

On 2023-02-12 Su 15:59, Justin Pryzby wrote:

It seems like if pgindent knows about git, it ought to process only
tracked files.  Then, it wouldn't need to manually exclude generated
files, and it wouldn't process vpath builds and who-knows-what else it
finds in CWD.

for vpath builds use an exclude file that excludes the vpath you use.

I don't really want restrict this to tracked files because it would mean you can't pgindent files before you `git add` them. And we would still need to do manual exclusion for some files that are tracked, e.g. the snowball files.


At least --commit doesn't seem to work when run outside of the root
source dir.

Yeah, I'll fix that, thanks for mentioning.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

13 February 2023, 13:27:37

On 2023-02-12 Su 16:13, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-02-12 Su 11:24, Tom Lane wrote:

It seems like "indent the whole tree" is about to become a minority
use-case.  Maybe instead of continuing to privilege that case, we
should say that it's invoked by some new switch like --all-files,
and without that only the stuff identified by command-line arguments
gets processed.

I don't think we need --all-files. The attached gets rid of the build 
and code-base cruft, which is now in any case obsolete given we've put 
pg_bsd_indent in our code base. So the way to spell this instead of 
"pgindent --all-files" would be "pgindent ."

Ah, of course.

I added a warning if there are no files at all specified.

LGTM.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

13 February 2023, 14:02:39

On Sun, 12 Feb 2023 at 15:16, Andrew Dunstan <andrew@dunslane.net> wrote:
> I'm not sure how much more I really want to do here. Given the way pgindent now processes command line arguments,
maybethe best thing is for people to use that. Use of git aliases can help. Something like these for example

>
>
> [alias]
>
>     dirty = diff --name-only --diff-filter=ACMU -- .
>     staged = diff --name-only --cached --diff-filter=ACMU -- .
>     dstaged = diff --name-only --diff-filter=ACMU HEAD -- .
>
>
> and then you could do
>
>     pgindent `git dirty`
>
>
> The only danger would be if there were no dirty files. Maybe we need a switch to inhibit using the current directory
ifthere are no command line files.

>
>
> Thoughts?

I think indenting staged or dirty files is probably the most common
operation that people want to do with pgindent. So I think that having
dedicated flags makes sense. I agree that it's not strictly necessary
and git aliases help a lot. But the git aliases require you to set
them up. To me making the most common operation as easy as possible to
do, seems worth the few extra lines to pgindent.

Sidenote: You mentioned untracked files in another email. I think that
the --dirty flag should probably also include untracked files. A
command to do so is: git ls-files --others --exclude-standard

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

13 February 2023, 16:46:59

On 2023-02-13 Mo 09:02, Jelte Fennema wrote:

On Sun, 12 Feb 2023 at 15:16, Andrew Dunstan <andrew@dunslane.net> wrote:

I'm not sure how much more I really want to do here. Given the way pgindent now processes command line arguments, maybe the best thing is for people to use that. Use of git aliases can help. Something like these for example


[alias]
    dirty = diff --name-only --diff-filter=ACMU -- .    staged = diff --name-only --cached --diff-filter=ACMU -- .    dstaged = diff --name-only --diff-filter=ACMU HEAD -- .


and then you could do
    pgindent `git dirty`


The only danger would be if there were no dirty files. Maybe we need a switch to inhibit using the current directory if there are no command line files.


Thoughts?

I think indenting staged or dirty files is probably the most common
operation that people want to do with pgindent. So I think that having
dedicated flags makes sense. I agree that it's not strictly necessary
and git aliases help a lot. But the git aliases require you to set
them up. To me making the most common operation as easy as possible to
do, seems worth the few extra lines to pgindent.

OK, but I'd like to hear from more people about what they want. Experience tells me that making assumptions about how people work is not a good idea. I doubt anyone's work pattern is like mine. I don't want to implement an option that three people are going to use.


Sidenote: You mentioned untracked files in another email. I think that
the --dirty flag should probably also include untracked files. A
command to do so is: git ls-files --others --exclude-standard

Thanks for the info.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

13 February 2023, 18:29:19

On Mon, 13 Feb 2023 at 17:47, Andrew Dunstan <andrew@dunslane.net> wrote:
> OK, but I'd like to hear from more people about what they want. Experience tells me that making assumptions about how
peoplework is not a good idea. I doubt anyone's work pattern is like mine. I don't want to implement an option that
threepeople are going to use. 


In the general case I agree with you. But in this specific case I
don't. To me the whole point of this email thread is to nudge people
towards indenting the changes that they are committing. Thus indenting
those changes (either before or after adding) is the workflow that we
want to make as easy as possible. Because even if it's not people
their current workflow, by adding the flag it hopefully becomes their
workflow, because it's so easy to use. So my point is we want to
remove as few hurdles as possible for people to indent their changes
(and setting up git aliases or pre-commit hooks are all hurdles).

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

14 February 2023, 16:46:24

On 2023-02-13 Mo 13:29, Jelte Fennema wrote:

On Mon, 13 Feb 2023 at 17:47, Andrew Dunstan <andrew@dunslane.net> wrote:

OK, but I'd like to hear from more people about what they want. Experience tells me that making assumptions about how people work is not a good idea. I doubt anyone's work pattern is like mine. I don't want to implement an option that three people are going to use.


In the general case I agree with you. But in this specific case I
don't. To me the whole point of this email thread is to nudge people
towards indenting the changes that they are committing. Thus indenting
those changes (either before or after adding) is the workflow that we
want to make as easy as possible. Because even if it's not people
their current workflow, by adding the flag it hopefully becomes their
workflow, because it's so easy to use. So my point is we want to
remove as few hurdles as possible for people to indent their changes
(and setting up git aliases or pre-commit hooks are all hurdles).

(ITYM "remove as many hurdles as possible"). It remains to be seen how much easier any of this will make life for committers, at least. But I concede this might make life a bit simpler for developers generally.

Anyway, let's talk about the details of what is proposed.

So far, we have had the following categories suggested: dirty, staged, dirty+staged, untracked. Are there any others?

Another issue is whether or not to restrict these to files under the current directory. I think we probably should, or at least provide a --relative option.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

15 February 2023, 11:28:07

> (ITYM "remove as many hurdles as possible").

yes, I messed up rewriting that sentence from "having as few hurdles
as possible" to "removing as many hurdles as possible"

> So far, we have had the following categories suggested: dirty, staged, dirty+staged, untracked. Are there any
others?

The two workflows that make most sense to me personally are:
1. staged (indent anything that you're staging for a commit)
2. dirty+staged+untracked (indent anything you've been working on that
is not committed yet)

The obvious way of having --dirty, --staged, and --untracked flags
would require 3 flags for this second (to me seemingly) common
operation. That seems quite unfortunate. So I would propose the
following two flags for those purposes:
1. --staged/--cached (--cached is in line with git, but I personally
think --staged is clearer, git has --staged-only but that seems long
for no reason)
2. --uncommitted

And maybe for completeness we could have the following flags, so you
could target any combination of staged/untracked/dirty files:
3. --untracked (untracked files only)
4. --dirty (tracked files with changes that are not staged)

But I don't know in what workflow people would actually use them.

> Another issue is whether or not to restrict these to files under the current directory. I think we probably should,
orat least provide a --relative option.

Good point, I think it makes sense to restrict it to the current
directory by default. You can always cd to the root of the repo if you
want to format everything.

Re: run pgindent on a regular basis / scripted manner

From

Justin Pryzby

Date:

15 February 2023, 18:45:52

On Mon, Feb 13, 2023 at 07:57:25AM -0500, Andrew Dunstan wrote:
> 
> On 2023-02-12 Su 15:59, Justin Pryzby wrote:
> > It seems like if pgindent knows about git, it ought to process only
> > tracked files.  Then, it wouldn't need to manually exclude generated
> > files, and it wouldn't process vpath builds and who-knows-what else it
> > finds in CWD.
> 
> I don't really want restrict this to tracked files because it would mean you
> can't pgindent files before you `git add` them.

I think you'd allow indenting files which were either tracked *or*
specified on the command line.

Also, it makes a more sense to "add" the file before indenting it, to
allow checking the output and remove unrelated changes.  So that doesn't
seem to me like a restriction of any significance.

But I would never want to indent an untracked file unless I specified
it.

-- 
Justin

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

15 February 2023, 21:00:41

> Also, it makes a more sense to "add" the file before indenting it, to
> allow checking the output and remove unrelated changes.  So that doesn't
> seem to me like a restriction of any significance.

For my workflow it would be the same, but afaik there's two ways that
people commonly use git (mine is 1):
1. Adding changes/files to the staging area using and then committing
those changes:
    git add (-p)/emacs magit/some other editor integration
2. Just add everything that's changed and commit all of it:
    git add -A + git commit/git commit -a

For workflow 1, a --staged/--cached flag would be enough IMHO. But
that's not at all helpful for workflow 2. That's why I proposed
--uncommitted too, to make indenting easier for workflow 2.

> But I would never want to indent an untracked file unless I specified
> it.

Would the --uncommitted flag I proposed be enough of an explicit way
of specifying that you want to indent untracked files?

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

16 February 2023, 04:34:04

On Wed, Feb 15, 2023 at 12:45:52PM -0600, Justin Pryzby wrote:
> On Mon, Feb 13, 2023 at 07:57:25AM -0500, Andrew Dunstan wrote:
> > On 2023-02-12 Su 15:59, Justin Pryzby wrote:
> > > It seems like if pgindent knows about git, it ought to process only
> > > tracked files.  Then, it wouldn't need to manually exclude generated
> > > files, and it wouldn't process vpath builds and who-knows-what else it
> > > finds in CWD.
> > 
> > I don't really want restrict this to tracked files because it would mean you
> > can't pgindent files before you `git add` them.
> 
> I think you'd allow indenting files which were either tracked *or*
> specified on the command line.
> 
> Also, it makes a more sense to "add" the file before indenting it, to
> allow checking the output and remove unrelated changes.  So that doesn't
> seem to me like a restriction of any significance.
> 
> But I would never want to indent an untracked file unless I specified
> it.

Agreed.  I use pgindent three ways:

1. Indent everything that changed between master and the current branch.  Most
   common, since I develop nontrivial patches on branches.
2. Indent all staged files.  For trivial changes.
3. Indent all tracked files.  For typedefs.list changes.

That said, pre-2023 pgindent changed untracked files if called without a file
list.  I've lived with that and could continue to do so.

RE: run pgindent on a regular basis / scripted manner

From

"shiy.fnst@fujitsu.com"

Date:

16 February 2023, 08:26:07

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> Thanks, I have committed this. Still looking at Robert's other request.
>

Hi,

Commit #068a243b7 supported directories to be non-option arguments of pgindent.
But the help text doesn't mention that. Should we update it? Attach a small
patch which did that.

Regards,
Shi Yu

Attachment

v1-0001-Update-help-text-of-pgindent.patch

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

16 February 2023, 16:44:01

On 2023-02-16 Th 03:26, shiy.fnst@fujitsu.com wrote:

On Thu, Feb 9, 2023 6:10 AM Andrew Dunstan <andrew@dunslane.net> wrote:

Thanks, I have committed this. Still looking at Robert's other request.

Hi,

Commit #068a243b7 supported directories to be non-option arguments of pgindent.
But the help text doesn't mention that. Should we update it? Attach a small
patch which did that.

Thanks, pushed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

21 April 2023, 07:58:17

Now that the PG16 feature freeze happened I think it's time to bump
this thread again. As far as I remember everyone that responded (even
previously silent people) were themselves proponents of being more
strict around pgindent.

I think there's two things needed to actually start doing this:
1. We need to reindent the tree to create an indented baseline
2. We need some automation to complain about unindented code being committed

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b. If that doesn't achieve 2 for some reason, or turns
out to have problems we can always change to a afterwards.

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

22 April 2023, 08:50:22

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:
> For 2 the upstream thread listed two approaches:
> a. Install a pre-receive git hook on the git server that rejects
> pushes to master that are not indented
> b. Add a test suite that checks if the code is correctly indented, so
> the build farm would complain about it. (Suggested by Peter E)
>
> I think both a and b would work to achieve 2. But as Peter E said, b
> indeed sounds like less of a divergence of the status quo. So my vote
> would be for b.

FWIW, I think that there is value for both of them.  Anyway, isn't 'a'
exactly the same as 'b' in design?  Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched.  I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

22 April 2023, 11:42:36

On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them.  Anyway, isn't 'a'
exactly the same as 'b' in design?  Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched.  I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

22 April 2023, 12:10:18

On Sat, Apr 22, 2023 at 07:42:36AM -0400, Andrew Dunstan wrote:
> Perhaps we should start with a buildfarm module, which would run pg_indent
> --show-diff.

Nice, I didn't know this one and it has been mentioned a bit on this
thread.  Indeed, it is possible to just rely on that.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

22 April 2023, 12:47:12

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2023-04-22 Sa 04:50, Michael Paquier wrote:
>
> On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:
>
> For 2 the upstream thread listed two approaches:
> a. Install a pre-receive git hook on the git server that rejects
> pushes to master that are not indented
> b. Add a test suite that checks if the code is correctly indented, so
> the build farm would complain about it. (Suggested by Peter E)
>
> I think both a and b would work to achieve 2. But as Peter E said, b
> indeed sounds like less of a divergence of the status quo. So my vote
> would be for b.
>
> FWIW, I think that there is value for both of them.  Anyway, isn't 'a'
> exactly the same as 'b' in design?  Both require a build of
> pg_bsd_indent, meaning that 'a' would also need to run an equivalent
> of the regression test suite, but it would be actually costly
> especially if pg_bsd_indent itself is patched.  I think that getting
> more noisy on this matter with 'b' would be enough, but as an extra
> PG_TEST_EXTRA for committers to set.
>
> Such a test suite would need a dependency to the 'git' command itself,
> which is not something that could be safely run in a release tarball,
> in any case.
>
>
> Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run
onone animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do. 

Just to be clear, you guys are aware we  already have a git repo
that's supposed to track "head + pg_indent" at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
right?

I see it is currently not working and this has not been noticed by
anyone, so I guess it kind of indicates nobody is using it today. The
reason appears to be that it uses pg_bsd_indent that's in our apt
repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
service that would actually be useful, this could certainly be ficked
pretty easy.

But bottom line is that if pgindent is as predictable as it should be,
it might be easier to use that one central place that already does it
rather than have to build a buildfarm module?

//Magnus

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

22 April 2023, 13:23:59

On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:
>
> On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> > Also, pgindent takes tens of seconds to run, so hooking that into the git
> > push process would slow this down quite a bit.
>
> The pre-receive hook would do a full pgindent when you change typedefs.list.
> Otherwise, it would reindent only the files being changed.  The average push
> need not take tens of seconds.

It would probably ont be tens of seconds, but it would be slow. It
would need to do a clean git checkout into an isolated environment and
spawn in there, and just that takes time. And it would have to also
know to rebuild pg_bsd_indent on demand, which would require a full
./configure run (or meson equivalent). etc.

So while it might not be tens of seconds, it most definitely won't be fast.

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

22 April 2023, 14:12:06

On 2023-04-22 Sa 08:47, Magnus Hagander wrote:

On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:


On 2023-04-22 Sa 04:50, Michael Paquier wrote:

On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:

For 2 the upstream thread listed two approaches:
a. Install a pre-receive git hook on the git server that rejects
pushes to master that are not indented
b. Add a test suite that checks if the code is correctly indented, so
the build farm would complain about it. (Suggested by Peter E)

I think both a and b would work to achieve 2. But as Peter E said, b
indeed sounds like less of a divergence of the status quo. So my vote
would be for b.

FWIW, I think that there is value for both of them.  Anyway, isn't 'a'
exactly the same as 'b' in design?  Both require a build of
pg_bsd_indent, meaning that 'a' would also need to run an equivalent
of the regression test suite, but it would be actually costly
especially if pg_bsd_indent itself is patched.  I think that getting
more noisy on this matter with 'b' would be enough, but as an extra
PG_TEST_EXTRA for committers to set.

Such a test suite would need a dependency to the 'git' command itself,
which is not something that could be safely run in a release tarball,
in any case.


Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.


Just to be clear, you guys are aware we  already have a git repo
that's supposed to track "head + pg_indent" at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
right?

I see it is currently not working and this has not been noticed by
anyone, so I guess it kind of indicates nobody is using it today. The
reason appears to be that it uses pg_bsd_indent that's in our apt
repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
service that would actually be useful, this could certainly be ficked
pretty easy.

But bottom line is that if pgindent is as predictable as it should be,
it might be easier to use that one central place that already does it
rather than have to build a buildfarm module?

Now that pg_bsd_indent is in the core code why not just use that?

Happy if you can make something work without further effort on my part :-)

cheers

andew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 April 2023, 14:24:13

Magnus Hagander <magnus@hagander.net> writes:
> On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>> For 2 the upstream thread listed two approaches:
>> a. Install a pre-receive git hook on the git server that rejects
>> pushes to master that are not indented
>> b. Add a test suite that checks if the code is correctly indented, so
>> the build farm would complain about it. (Suggested by Peter E)
>>
>> I think both a and b would work to achieve 2. But as Peter E said, b
>> indeed sounds like less of a divergence of the status quo. So my vote
>> would be for b.

I am absolutely against a pre-receive hook on gitmaster.  A buildfarm
check seems far more appropriate.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 April 2023, 14:39:48

Jelte Fennema <postgres@jeltef.nl> writes:
> I think there's two things needed to actually start doing this:
> 1. We need to reindent the tree to create an indented baseline

As far as (1) goes, I've been holding off on that because there
are some large patches that still seem in danger of getting
reverted, notably 2489d76c4 and follow-ups.  A pgindent run
would change any such reversions from being mechanical into
possibly a fair bit of work.  We still have a couple of weeks
before it's necessary to make such decisions, so I don't want
to do the pgindent run before that.

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options.  But I don't believe
we have a final proposal on that, much less committed code.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

22 April 2023, 15:10:12

On 2023-04-22 Sa 10:39, Tom Lane wrote:

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options.  But I don't believe
we have a final proposal on that, much less committed code.

Well, I posted a fairly concrete suggestion with an example patch upthread at

<https://www.postgresql.org/message-id/47011581-ddec-1a87-6828-6edfabe6b7b6%40dunslane.net>

I still think that's worth doing.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 April 2023, 15:37:08

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-04-22 Sa 10:39, Tom Lane wrote:
>> Another obstacle in the way of (1) is that there was some discussion
>> of changing perltidy version and/or options.  But I don't believe
>> we have a final proposal on that, much less committed code.

> Well, I posted a fairly concrete suggestion with an example patch
> upthread at
> <https://www.postgresql.org/message-id/47011581-ddec-1a87-6828-6edfabe6b7b6%40dunslane.net>
> I still think that's worth doing.

OK, so plan is (a) update perltidyrc to add --valign-exclusion-list,
(b) adjust pgindent/README to recommend perltidy version 20221112.

Questions:

* I see that there's now a 20230309 release, should we consider that
instead?

* emacs.samples provides pgsql-perl-style that claims to match
perltidy's rules.  Does that need any adjustment?  I don't see
anything in it that looks relevant, but I'm not terribly familiar
with emacs' Perl formatting options.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

22 April 2023, 19:52:24

On 2023-04-22 Sa 11:37, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-22 Sa 10:39, Tom Lane wrote:

Another obstacle in the way of (1) is that there was some discussion
of changing perltidy version and/or options.  But I don't believe
we have a final proposal on that, much less committed code.

Well, I posted a fairly concrete suggestion with an example patch 
upthread at
<https://www.postgresql.org/message-id/47011581-ddec-1a87-6828-6edfabe6b7b6%40dunslane.net>
I still think that's worth doing.

OK, so plan is (a) update perltidyrc to add --valign-exclusion-list,
(b) adjust pgindent/README to recommend perltidy version 20221112.

Questions:

* I see that there's now a 20230309 release, should we consider that
instead?

A test I just ran gave identical results to those from 20221112


* emacs.samples provides pgsql-perl-style that claims to match
perltidy's rules.  Does that need any adjustment?  I don't see
anything in it that looks relevant, but I'm not terribly familiar
with emacs' Perl formatting options.

At least w.r.t. the vertical alignment issue, AFAICT the emacs style does not attempt to align anything vertically except the first non-space thing on the line. So if anything, by abandoning a lot of vertical alignment it would actually be closer to what the sample emacs style does.

The great advantage of not doing this alignment is that there is far less danger of perltidy trying to realign lines that have not in fact changed, because some nearby line has changed. So we'd have a good deal less pointless churn.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 April 2023, 19:58:13

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-04-22 Sa 11:37, Tom Lane wrote:
>> * I see that there's now a 20230309 release, should we consider that
>> instead?

> A test I just ran gave identical results to those from 20221112

Cool, let's use perltidy 20230309 then.

> The great advantage of not doing this alignment is that there is far 
> less danger of perltidy trying to realign lines that have not in fact 
> changed, because some nearby line has changed. So we'd have a good deal 
> less pointless churn.

Yes, exactly.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

22 April 2023, 19:59:06

(Given that another commentator is "absolutely against" a hook, this message
is mostly for readers considering this for other projects.)

On Sat, Apr 22, 2023 at 03:23:59PM +0200, Magnus Hagander wrote:
> On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:
> > On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> > > Also, pgindent takes tens of seconds to run, so hooking that into the git
> > > push process would slow this down quite a bit.
> >
> > The pre-receive hook would do a full pgindent when you change typedefs.list.
> > Otherwise, it would reindent only the files being changed.  The average push
> > need not take tens of seconds.
> 
> It would probably ont be tens of seconds, but it would be slow. It
> would need to do a clean git checkout into an isolated environment and
> spawn in there, and just that takes time.

That would be slow, but I wouldn't do it that way.  I'd make "pg_bsd_ident
--pre-receive --show-diff" that, instead of reading from the filesystem, gets
the bytes to check from the equivalent of this Perl-like pseudocode:

while (<>) {
    my($old_hash, $new_hash, $ref) = split;
    foreach my $filename (split /\n/, `git diff --name-only $old_hash..$new_hash`) {
        $file_content = `git show $new_hash $filename`;
    }
}

I just ran pgindent on the file name lists of the last 1000 commits, and
runtime was less than 0.5s for each of 998/1000 commits.  There's more a real
implementation might handle:

- pg_bsd_indent changes
- typedefs.list changes
- skip if the break-glass "pgindent: no" appears in a commit message
- commits changing so many files that a clean "git checkout" would be faster

> And it would have to also
> know to rebuild pg_bsd_indent on demand, which would require a full
> ./configure run (or meson equivalent). etc.
> 
> So while it might not be tens of seconds, it most definitely won't be fast.

A project more concerned about elapsed time than detecting all defects might
even choose to take no synchronous action for pg_bsd_indent and typedefs.list
changes.  When a commit changes either of those, the probability that the
committer already ran pgindent rises substantially.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

22 April 2023, 20:15:23

Noah Misch <noah@leadboat.com> writes:
> - skip if the break-glass "pgindent: no" appears in a commit message

There are two things that bother me about putting this functionality
into a server hook, beyond the possible speed issue:

* The risk of failure.  I don't have a terribly difficult time imagining
situations where things get sufficiently wedged that the server accepts
*no* commits, not even ones fixing the problem.  An override such as
you suggest here could assuage that fear, perhaps.

* The lack of user-friendliness.  AFAIK, if a pre-receive hook fails
you learn little except that it failed.  This could be extremely
frustrating to debug, especially in a situation where your local
pgindent is giving you different results than the server gets.

The idea of a buildfarm animal failing if --show-diff isn't empty
is attractive to me mainly because it is far nicer from the
debuggability standpoint.

Maybe, after we get some amount of experience with trying to keep
things always indent-clean, we will decide that it's reliable enough
to enforce in a server hook.  I think going for that right away is
sheer folly though.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Noah Misch

Date:

22 April 2023, 20:33:25

On Sat, Apr 22, 2023 at 04:15:23PM -0400, Tom Lane wrote:
> Noah Misch <noah@leadboat.com> writes:
> > - skip if the break-glass "pgindent: no" appears in a commit message
> 
> There are two things that bother me about putting this functionality
> into a server hook, beyond the possible speed issue:
> 
> * The risk of failure.  I don't have a terribly difficult time imagining
> situations where things get sufficiently wedged that the server accepts
> *no* commits, not even ones fixing the problem.  An override such as
> you suggest here could assuage that fear, perhaps.

I agree that deserves some worry.

> * The lack of user-friendliness.  AFAIK, if a pre-receive hook fails
> you learn little except that it failed.

That is incorrect.  The client gets whatever the hook prints.  I'd probably
make it print the first 10000 lines of the diff.

I'm okay with a buildfarm animal.  It's going to result in a more-cluttered
git history as people push things, break that animal, and push followup fixes.
While that's sad, I expect the level of clutter will go down pretty quickly
and will soon be no worse than we already get from typo-fix pushes.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 April 2023, 15:04:33

On 2023-04-22 Sa 15:58, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-04-22 Sa 11:37, Tom Lane wrote:

* I see that there's now a 20230309 release, should we consider that
instead?

A test I just ran gave identical results to those from 20221112

Cool, let's use perltidy 20230309 then.

OK, so when would we do this? The change to 20230309 + valign changes is fairly large:

188 files changed, 3657 insertions(+), 3395 deletions(-)

Maybe right before we fork the tree?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

23 April 2023, 15:08:42

On 2023-04-22 Sa 16:15, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

- skip if the break-glass "pgindent: no" appears in a commit message

There are two things that bother me about putting this functionality
into a server hook, beyond the possible speed issue:

* The risk of failure.  I don't have a terribly difficult time imagining
situations where things get sufficiently wedged that the server accepts
*no* commits, not even ones fixing the problem.  An override such as
you suggest here could assuage that fear, perhaps.

* The lack of user-friendliness.  AFAIK, if a pre-receive hook fails
you learn little except that it failed.  This could be extremely
frustrating to debug, especially in a situation where your local
pgindent is giving you different results than the server gets.

The idea of a buildfarm animal failing if --show-diff isn't empty
is attractive to me mainly because it is far nicer from the
debuggability standpoint.

Maybe, after we get some amount of experience with trying to keep
things always indent-clean, we will decide that it's reliable enough
to enforce in a server hook.  I think going for that right away is
sheer folly though.

What I'll do for now is set up a buildfarm module that will log the differences but won't error out if there are any. At least that way we'll know better what we're dealing with.

I don't really like Noah's idea of having a pre-receive hook possibly spew thousands of diff lines to the terminal.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

23 April 2023, 15:16:10

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-04-22 Sa 15:58, Tom Lane wrote:
>> Cool, let's use perltidy 20230309 then.

> OK, so when would we do this? The change to 20230309 + valign changes is 
> fairly large:

I think we could go ahead and commit the perltidyrc and README changes
now.  But the ensuing reformatting should happen as part of the mass
pgindent run, probably next month sometime.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

23 April 2023, 15:29:15

On Sun, 23 Apr 2023 at 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I think we could go ahead and commit the perltidyrc and README changes
> now.  But the ensuing reformatting should happen as part of the mass
> pgindent run, probably next month sometime.

I think it's better to make the changes close together, not with a
month in between. Otherwise no-one will be able to run perltidy on
their patches, because the config and the files are even more out of
sync than they are now. So I'd propose to commit the perltidyrc
changes right before the pgindent run.

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

23 April 2023, 21:52:24

On Sat, Apr 22, 2023 at 4:12 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
>
> On 2023-04-22 Sa 08:47, Magnus Hagander wrote:
>
> On Sat, Apr 22, 2023 at 1:42 PM Andrew Dunstan <andrew@dunslane.net> wrote:
>
> On 2023-04-22 Sa 04:50, Michael Paquier wrote:
>
> On Fri, Apr 21, 2023 at 09:58:17AM +0200, Jelte Fennema wrote:
>
> For 2 the upstream thread listed two approaches:
> a. Install a pre-receive git hook on the git server that rejects
> pushes to master that are not indented
> b. Add a test suite that checks if the code is correctly indented, so
> the build farm would complain about it. (Suggested by Peter E)
>
> I think both a and b would work to achieve 2. But as Peter E said, b
> indeed sounds like less of a divergence of the status quo. So my vote
> would be for b.
>
> FWIW, I think that there is value for both of them.  Anyway, isn't 'a'
> exactly the same as 'b' in design?  Both require a build of
> pg_bsd_indent, meaning that 'a' would also need to run an equivalent
> of the regression test suite, but it would be actually costly
> especially if pg_bsd_indent itself is patched.  I think that getting
> more noisy on this matter with 'b' would be enough, but as an extra
> PG_TEST_EXTRA for committers to set.
>
> Such a test suite would need a dependency to the 'git' command itself,
> which is not something that could be safely run in a release tarball,
> in any case.
>
>
> Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run
onone animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do. 
>
> Just to be clear, you guys are aware we  already have a git repo
> that's supposed to track "head + pg_indent" at
> https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent
> right?
>
> I see it is currently not working and this has not been noticed by
> anyone, so I guess it kind of indicates nobody is using it today. The
> reason appears to be that it uses pg_bsd_indent that's in our apt
> repos and that's 2.1.1 and not 2.1.2 at this point. But if this is a
> service that would actually be useful, this could certainly be ficked
> pretty easy.
>
> But bottom line is that if pgindent is as predictable as it should be,
> it might be easier to use that one central place that already does it
> rather than have to build a buildfarm module?
>
>
> Now that pg_bsd_indent is in the core code why not just use that?

yeah, it just required building. And the lazy approach was to use the DEB :)

For a quick fix I've built the current HEAD and have it just using
that one -- right now it'll fail again when a change is made to it,
but I'll get that cleaned up.

It's back up and running, and results are at
https://git.postgresql.org/gitweb/?p=postgresql-pgindent.git;a=shortlog;h=refs/heads/master-pgindent

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

23 April 2023, 22:01:24

On Sat, Apr 22, 2023 at 9:59 PM Noah Misch <noah@leadboat.com> wrote:
>
> (Given that another commentator is "absolutely against" a hook, this message
> is mostly for readers considering this for other projects.)
>
> On Sat, Apr 22, 2023 at 03:23:59PM +0200, Magnus Hagander wrote:
> > On Tue, Feb 7, 2023 at 5:43 AM Noah Misch <noah@leadboat.com> wrote:
> > > On Mon, Feb 06, 2023 at 06:17:02PM +0100, Peter Eisentraut wrote:
> > > > Also, pgindent takes tens of seconds to run, so hooking that into the git
> > > > push process would slow this down quite a bit.
> > >
> > > The pre-receive hook would do a full pgindent when you change typedefs.list.
> > > Otherwise, it would reindent only the files being changed.  The average push
> > > need not take tens of seconds.
> >
> > It would probably ont be tens of seconds, but it would be slow. It
> > would need to do a clean git checkout into an isolated environment and
> > spawn in there, and just that takes time.
>
> That would be slow, but I wouldn't do it that way.  I'd make "pg_bsd_ident
> --pre-receive --show-diff" that, instead of reading from the filesystem, gets
> the bytes to check from the equivalent of this Perl-like pseudocode:
>
> while (<>) {
>     my($old_hash, $new_hash, $ref) = split;
>     foreach my $filename (split /\n/, `git diff --name-only $old_hash..$new_hash`) {
>         $file_content = `git show $new_hash $filename`;
>     }
> }
>
> I just ran pgindent on the file name lists of the last 1000 commits, and
> runtime was less than 0.5s for each of 998/1000 commits.  There's more a real
> implementation might handle:
>
> - pg_bsd_indent changes
> - typedefs.list changes
> - skip if the break-glass "pgindent: no" appears in a commit message
> - commits changing so many files that a clean "git checkout" would be faster

Wouldn't there also be the case of a header file change that could
potentially invalidate a whole lot of C files?

There's also the whole potential problem of isolations. We need to run
the whole thing in an isolated environment (because any way in at this
stage could lead to an exploit if a committer key is compromised at
any point). And at least in the second case, it might not have access
to view that data yet because it's not in... Could probably be worked
around, but not trivially so.

(But as mentioned above, I think the conclusion is we don't want this
enforced in a receive hook anyway)


> > And it would have to also
> > know to rebuild pg_bsd_indent on demand, which would require a full
> > ./configure run (or meson equivalent). etc.
> >
> > So while it might not be tens of seconds, it most definitely won't be fast.
>
> A project more concerned about elapsed time than detecting all defects might
> even choose to take no synchronous action for pg_bsd_indent and typedefs.list
> changes.  When a commit changes either of those, the probability that the
> committer already ran pgindent rises substantially.

True, but it's far from 100% -- and if you got something in that
didn't work, then the *next* committer would have to clean it up....

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

24 April 2023, 14:09:22

On 23.04.23 17:29, Jelte Fennema wrote:
> On Sun, 23 Apr 2023 at 17:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think we could go ahead and commit the perltidyrc and README changes
>> now.  But the ensuing reformatting should happen as part of the mass
>> pgindent run, probably next month sometime.
> 
> I think it's better to make the changes close together, not with a
> month in between. Otherwise no-one will be able to run perltidy on
> their patches, because the config and the files are even more out of
> sync than they are now. So I'd propose to commit the perltidyrc
> changes right before the pgindent run.

Does anyone find perltidy useful?  To me, it functions more like a 
JavaScript compiler in that once you process the source code, it is no 
longer useful for manual editing.  If we are going to have the buildfarm 
check indentation and that is going to be extended to Perl code, I have 
some concerns about that.

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

24 April 2023, 14:14:31

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> Does anyone find perltidy useful?  To me, it functions more like a 
> JavaScript compiler in that once you process the source code, it is no 
> longer useful for manual editing.  If we are going to have the buildfarm 
> check indentation and that is going to be extended to Perl code, I have 
> some concerns about that.

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines.  But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future.  Did you have other specific concerns?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

26 April 2023, 07:38:40

On 24.04.23 16:14, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
>> Does anyone find perltidy useful?  To me, it functions more like a
>> JavaScript compiler in that once you process the source code, it is no
>> longer useful for manual editing.  If we are going to have the buildfarm
>> check indentation and that is going to be extended to Perl code, I have
>> some concerns about that.
> 
> I certainly don't like its current behavior where adding/changing one
> line can have side-effects on nearby lines.  But we have a proposal
> to clean that up, and I'm cautiously optimistic that it'll be better
> in future.  Did you have other specific concerns?

I think the worst is how it handles multi-line data structures like

         $newnode->command_ok(
             [
                 'psql', '-X',
                 '-v',   'ON_ERROR_STOP=1',
                 '-c',   $upcmds,
                 '-d',   $oldnode->connstr($updb),
             ],
             "ran version adaptation commands for database $updb");

or

         $node->command_fails_like(
             [
                 'pg_basebackup',   '-D',
                 "$tempdir/backup", '--compress',
                 $cft->[0]
             ],
             qr/$cfail/,
             'client ' . $cft->[2]);

Perhaps that is included in the upcoming changes you are referring to?

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

26 April 2023, 13:27:05

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> On 24.04.23 16:14, Tom Lane wrote:
>> I certainly don't like its current behavior where adding/changing one
>> line can have side-effects on nearby lines.  But we have a proposal
>> to clean that up, and I'm cautiously optimistic that it'll be better
>> in future.  Did you have other specific concerns?

> I think the worst is how it handles multi-line data structures like

>          $newnode->command_ok(
>              [
>                  'psql', '-X',
>                  '-v',   'ON_ERROR_STOP=1',
>                  '-c',   $upcmds,
>                  '-d',   $oldnode->connstr($updb),
>              ],
>              "ran version adaptation commands for database $updb");

Yeah, I agree, there is no case where that doesn't suck.  I don't
mind it imposing specific placements of brackets and so on ---
that's very analogous to what pgindent will do.  But it likes to
re-flow comma-separated lists, and generally manages to make a
complete logical hash of them when it does, as in your other
example:

>          $node->command_fails_like(
>              [
>                  'pg_basebackup',   '-D',
>                  "$tempdir/backup", '--compress',
>                  $cft->[0]
>              ],
>              qr/$cfail/,
>              'client ' . $cft->[2]);

Can we fix it to preserve the programmer's choices of line breaks
in comma-separated lists?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

26 April 2023, 19:44:47

On 2023-04-26 We 09:27, Tom Lane wrote:

Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:

On 24.04.23 16:14, Tom Lane wrote:

I certainly don't like its current behavior where adding/changing one
line can have side-effects on nearby lines.  But we have a proposal
to clean that up, and I'm cautiously optimistic that it'll be better
in future.  Did you have other specific concerns?

I think the worst is how it handles multi-line data structures like

         $newnode->command_ok(             [                 'psql', '-X',                 '-v',   'ON_ERROR_STOP=1',                 '-c',   $upcmds,                 '-d',   $oldnode->connstr($updb),             ],             "ran version adaptation commands for database $updb");

Yeah, I agree, there is no case where that doesn't suck.  I don't
mind it imposing specific placements of brackets and so on ---
that's very analogous to what pgindent will do.  But it likes to
re-flow comma-separated lists, and generally manages to make a
complete logical hash of them when it does, as in your other
example:

         $node->command_fails_like(             [                 'pg_basebackup',   '-D',                 "$tempdir/backup", '--compress',                 $cft->[0]             ],             qr/$cfail/,             'client ' . $cft->[2]);

Can we fix it to preserve the programmer's choices of line breaks
in comma-separated lists?

I doubt there's something like that. You can freeze arbitrary blocks of code like this (from the manual)

#<<<  format skipping: do not let perltidy change my nice formatting
        my @list = (1,
                    1, 1,
                    1, 2, 1,
                    1, 3, 3, 1,
                    1, 4, 6, 4, 1,);
#>>>

But that gets old and ugly pretty quickly.

There is a --freeze-newlines option, but it's global. I don't think we want that.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

26 April 2023, 20:05:07

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-04-26 We 09:27, Tom Lane wrote:
>> Yeah, I agree, there is no case where that doesn't suck.  I don't
>> mind it imposing specific placements of brackets and so on ---
>> that's very analogous to what pgindent will do.  But it likes to
>> re-flow comma-separated lists, and generally manages to make a
>> complete logical hash of them when it does, as in your other
>> example:

> I doubt there's something like that.

I had a read-through of the latest version's man page, and found
this promising-looking entry:

-boc, --break-at-old-comma-breakpoints

    The -boc flag is another way to prevent comma-separated lists from
    being reformatted. Using -boc on the above example, plus additional
    flags to retain the original style, yields

    # perltidy -boc -lp -pt=2 -vt=1 -vtc=1
    my @list = (1,
                1, 1,
                1, 2, 1,
                1, 3, 3, 1,
                1, 4, 6, 4, 1,);

    A disadvantage of this flag compared to the methods discussed above is
    that all tables in the file must already be nicely formatted.

I've not tested this, but it looks like it would do what we need,
modulo needing to fix all the existing damage by hand ...

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Alvaro Herrera

Date:

28 April 2023, 09:25:45

On 2023-Feb-05, Andrew Dunstan wrote:

> So here's a diff made from running perltidy v20221112 with the additional
> setting --valign-exclusion-list=", = => || && if unless"

I ran this experimentally with perltidy 20230309, and compared that with
the --novalign behavior (not to propose the latter -- just to be aware
of what else is vertical alignment doing.)

Based on the differences between both, I think we'll definitely want to
include =~ and |= in this list, and I think we should discuss whether to
also include "or" (for "do_stuff or die()" type of constructs) and "qw"
(mainly used in 'use Foo qw(one two)' import lists).  All these have
effects (albeit smaller than the list you gave) on our existing code.

If you change from an exclusion list to --novalign then you lose
alignment of trailing # comments, which personally I find a loss, even
though they're still a multi-line effect.  Another change would be that
it ditches alignment of "{" but that only changes msvc/Install.pm, so I
think we shouldn't worry; and then there's this one:

-use PostgreSQL::Test::Utils          ();
+use PostgreSQL::Test::Utils ();
 use PostgreSQL::Test::BackgroundPsql ();

which I think we could just change to qw() if we cared enough (but I bet
we don't).

All in all, I think sticking to
--valign-exclusion-list=", = => =~ |= || && if or qw unless"
is a good deal.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Ellos andaban todos desnudos como su madre los parió, y también las mujeres,
aunque no vi más que una, harto moza, y todos los que yo vi eran todos
mancebos, que ninguno vi de edad de más de XXX años" (Cristóbal Colón)

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

28 April 2023, 18:08:18

On Wed, Apr 26, 2023 at 03:44:47PM -0400, Andrew Dunstan wrote:
> On 2023-04-26 We 09:27, Tom Lane wrote:
> I doubt there's something like that. You can freeze arbitrary blocks of code
> like this (from the manual)
> 
> #<<<  format skipping: do not let perltidy change my nice formatting
>         my @list = (1,
>                     1, 1,
>                     1, 2, 1,
>                     1, 3, 3, 1,
>                     1, 4, 6, 4, 1,);
> #>>>   
> 
> 
> But that gets old and ugly pretty quickly.

Can those comments be added by a preprocessor before calling perltidy,
and then removed on completion?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Embrace your flaws.  They make you human, rather than perfect,
  which you will never be.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

30 April 2023, 12:57:39

On 2023-04-28 Fr 05:25, Alvaro Herrera wrote:

On 2023-Feb-05, Andrew Dunstan wrote:

So here's a diff made from running perltidy v20221112 with the additional
setting --valign-exclusion-list=", = => || && if unless"

I ran this experimentally with perltidy 20230309, and compared that with
the --novalign behavior (not to propose the latter -- just to be aware
of what else is vertical alignment doing.)

Thanks for looking.


Based on the differences between both, I think we'll definitely want to
include =~ and |= in this list, and I think we should discuss whether to
also include "or" (for "do_stuff or die()" type of constructs) and "qw"
(mainly used in 'use Foo qw(one two)' import lists).  All these have
effects (albeit smaller than the list you gave) on our existing code.

I'm good with all of these I think



If you change from an exclusion list to --novalign then you lose
alignment of trailing # comments, which personally I find a loss, even
though they're still a multi-line effect.  Another change would be that
it ditches alignment of "{" but that only changes msvc/Install.pm, so I
think we shouldn't worry; and then there's this one:

-use PostgreSQL::Test::Utils          ();
+use PostgreSQL::Test::Utils (); use PostgreSQL::Test::BackgroundPsql ();

which I think we could just change to qw() if we cared enough (but I bet
we don't).

Yeah, me too.



All in all, I think sticking to
--valign-exclusion-list=", = => =~ |= || && if or qw unless"
is a good deal.

wfm

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

30 April 2023, 13:02:04

On 2023-04-28 Fr 14:08, Bruce Momjian wrote:

On Wed, Apr 26, 2023 at 03:44:47PM -0400, Andrew Dunstan wrote:

On 2023-04-26 We 09:27, Tom Lane wrote:
I doubt there's something like that. You can freeze arbitrary blocks of code
like this (from the manual)

#<<<  format skipping: do not let perltidy change my nice formatting
        my @list = (1,
                    1, 1,
                    1, 2, 1,
                    1, 3, 3, 1,
                    1, 4, 6, 4, 1,);
#>>>   


But that gets old and ugly pretty quickly.

Can those comments be added by a preprocessor before calling perltidy,
and then removed on completion?

I imagine so, but we'd need a way of determining algorithmically which lines to protect. That might not be at all simple. And then we'd have the maintenance burden of the preprocessor.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

30 April 2023, 14:32:16

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-04-28 Fr 14:08, Bruce Momjian wrote:
>> Can those comments be added by a preprocessor before calling perltidy,
>> and then removed on completion?

> I imagine so, but we'd need a way of determining algorithmically which 
> lines to protect. That might not be at all simple. And then we'd have 
> the maintenance burden of the preprocessor.

Yeah, it's hard to see how you'd do that without writing a full Perl
parser.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 May 2023, 21:10:48

I wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> I doubt there's something like that.

> I had a read-through of the latest version's man page, and found
> this promising-looking entry:
>     -boc, --break-at-old-comma-breakpoints

Sadly, this seems completely not ready for prime time.  I experimented
with it under perltidy 20230309, and found that it caused hundreds
of kilobytes of gratuitous changes that don't seem to have a direct
connection to the claimed purpose.  Most of these seemed to be from
forcing a line break after a function call's open paren, like

@@ -50,10 +50,12 @@ detects_heap_corruption(
 #
 fresh_test_table('test');
 $node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
-detects_no_corruption("verify_heapam('test')",
+detects_no_corruption(
+    "verify_heapam('test')",
     "all-frozen not corrupted table");
 corrupt_first_page('test');
-detects_heap_corruption("verify_heapam('test')",
+detects_heap_corruption(
+    "verify_heapam('test')",
     "all-frozen corrupted table");
 detects_no_corruption(
     "verify_heapam('test', skip := 'all-frozen')",

although in some places it just wanted to insert a space, like this:

@@ -77,9 +81,9 @@ print "standby 2: $result\n";
 is($result, qq(33|0|t), 'check streamed sequence content on standby 2');
 
 # Check that only READ-only queries can run on standbys
-is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
     3, 'read-only queries on standby 1');
-is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
     3, 'read-only queries on standby 2');
 
 # Tests for connection parameter target_session_attrs


So I don't think we want that.  Maybe in some future version it'll
be more under control.

Barring objections, I'll use the attached on Friday.

            regards, tom lane

commit 7874d0f178f2bcdc889ce410d3e126e6750d96b4
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed May 17 16:43:38 2023 -0400

    Make agreed updates in perltidy options.

    Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql

diff --git a/src/tools/pgindent/README b/src/tools/pgindent/README
index 43c736b0a1..08874d12eb 100644
--- a/src/tools/pgindent/README
+++ b/src/tools/pgindent/README
@@ -14,16 +14,16 @@ PREREQUISITES:
    sibling directory src/tools/pg_bsd_indent; see the directions
    in that directory's README file.

-2) Install perltidy.  Please be sure it is version 20170521 (older and newer
+2) Install perltidy.  Please be sure it is version 20230309 (older and newer
    versions make different formatting choices, and we want consistency).
    You can get the correct version from
    https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/
    To install, follow the usual install process for a Perl module
    ("man perlmodinstall" explains it).  Or, if you have cpan installed,
    this should work:
-   cpan SHANCOCK/Perl-Tidy-20170521.tar.gz
+   cpan SHANCOCK/Perl-Tidy-20230309.tar.gz
    Or if you have cpanm installed, you can just use:
-   cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20170521.tar.gz
+   cpanm https://cpan.metacpan.org/authors/id/S/SH/SHANCOCK/Perl-Tidy-20230309.tar.gz

 DOING THE INDENT RUN:

diff --git a/src/tools/pgindent/perltidyrc b/src/tools/pgindent/perltidyrc
index 9f09f0a64e..589d6e1f06 100644
--- a/src/tools/pgindent/perltidyrc
+++ b/src/tools/pgindent/perltidyrc
@@ -14,3 +14,4 @@
 --paren-vertical-tightness=2
 --paren-vertical-tightness-closing=2
 --noblanks-before-comments
+--valign-exclusion-list=", = => =~ |= || && if or qw unless"

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

18 May 2023, 12:59:28

On 2023-05-17 We 17:10, Tom Lane wrote:

I wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I doubt there's something like that.

I had a read-through of the latest version's man page, and found
this promising-looking entry:	-boc, --break-at-old-comma-breakpoints

Sadly, this seems completely not ready for prime time.  I experimented
with it under perltidy 20230309, and found that it caused hundreds
of kilobytes of gratuitous changes that don't seem to have a direct
connection to the claimed purpose.  Most of these seemed to be from
forcing a line break after a function call's open paren, like

@@ -50,10 +50,12 @@ detects_heap_corruption( # fresh_test_table('test'); $node->safe_psql('postgres', q(VACUUM (FREEZE, DISABLE_PAGE_SKIPPING) test));
-detects_no_corruption("verify_heapam('test')",
+detects_no_corruption(
+	"verify_heapam('test')", 	"all-frozen not corrupted table"); corrupt_first_page('test');
-detects_heap_corruption("verify_heapam('test')",
+detects_heap_corruption(
+	"verify_heapam('test')", 	"all-frozen corrupted table"); detects_no_corruption( 	"verify_heapam('test', skip := 'all-frozen')",

although in some places it just wanted to insert a space, like this:

@@ -77,9 +81,9 @@ print "standby 2: $result\n"; is($result, qq(33|0|t), 'check streamed sequence content on standby 2');  # Check that only READ-only queries can run on standbys
-is($node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_1->psql('postgres', 'INSERT INTO tab_int VALUES (1)'), 	3, 'read-only queries on standby 1');
-is($node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'),
+is( $node_standby_2->psql('postgres', 'INSERT INTO tab_int VALUES (1)'), 	3, 'read-only queries on standby 2');  # Tests for connection parameter target_session_attrs


So I don't think we want that.  Maybe in some future version it'll
be more under control.

Barring objections, I'll use the attached on Friday.

LGTM

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

15 June 2023, 15:26:58

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan <andrew@dunslane.net> wrote:
> Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run
onone animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.
 

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

15 June 2023, 16:12:27

On 2023-06-15 Th 11:26, Jelte Fennema wrote:

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan <andrew@dunslane.net> wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

The short answer is that some high priority demands from $dayjob got in the way. However, I hope to have it done soon.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

17 June 2023, 14:08:32

On 2023-06-15 Th 12:12, Andrew Dunstan wrote:

On 2023-06-15 Th 11:26, Jelte Fennema wrote:

On Sat, 22 Apr 2023 at 13:42, Andrew Dunstan <andrew@dunslane.net> wrote:

Perhaps we should start with a buildfarm module, which would run pg_indent --show-diff. That would only need to run on one animal, so a failure wouldn't send the whole buildfarm red. This would be pretty easy to do.

Just to be clear on where we are. Is there anything blocking us from
doing this, except for the PG16 branch cut? (that I guess is planned
somewhere in July?)

Just doing this for pgindent and not for perltidy would already be a
huge improvement over the current situation IMHO.

The short answer is that some high priority demands from $dayjob got in the way. However, I hope to have it done soon.

See <https://github.com/PGBuildFarm/client-code/commit/f9c1c15048b412d34ccda8020d989b3a7b566c05>

I have set up a new buildfarm animal called koel which will run the module.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

19 June 2023, 21:07:03

Andrew Dunstan <andrew@dunslane.net> writes:
> I have set up a new buildfarm animal called koel which will run the module.

Is koel tracking the right repo?  It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=koel&dt=2023-06-19%2019%3A49%3A03

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

20 June 2023, 02:09:08

On Sat, Jun 17, 2023 at 10:08:32AM -0400, Andrew Dunstan wrote:
> See <https://github.com/PGBuildFarm/client-code/commit/f9c1c15048b412d34ccda8020d989b3a7b566c05>
> I have set up a new buildfarm animal called koel which will run the module.

That's really cool!  Thanks for taking the time to do that!
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

20 June 2023, 12:04:39

On 2023-06-19 Mo 17:07, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

I have set up a new buildfarm animal called koel which will run the module.

Is koel tracking the right repo?  It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=koel&dt=2023-06-19%2019%3A49%3A03

Yeah, I changed it so that instead of just checking new commits it would check the whole tree. The problem with the incremental approach is that the next run it might turn green again but the issue would not have been fixed.

I think this is a one-off issue. Once we clean up the tree the problem would disappear and the commits it shows would be correct. I imaging that's going to happen any day now?

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

20 June 2023, 13:08:36

Andrew Dunstan <andrew@dunslane.net> writes:
> On 2023-06-19 Mo 17:07, Tom Lane wrote:
>> Is koel tracking the right repo?  It just spit up with a bunch of
>> diffs that seem to have little to do with the commit it's claiming
>> caused them:

> Yeah, I changed it so that instead of just checking new commits it would 
> check the whole tree. The problem with the incremental approach is that 
> the next run it might turn green again but the issue would not have been 
> fixed.

Ah.

> I think this is a one-off issue. Once we clean up the tree the problem 
> would disappear and the commits it shows would be correct. I imaging 
> that's going to happen any day now?

I can go fix the problems now that we know there are some (already).
However, if what you're saying is that koel only checks recently-changed
files, that's going to be pretty misleading in future too.  If people
don't react to such reports right away, they'll disappear, no?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

20 June 2023, 13:21:14

On 2023-06-20 Tu 09:08, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 2023-06-19 Mo 17:07, Tom Lane wrote:

Is koel tracking the right repo?  It just spit up with a bunch of
diffs that seem to have little to do with the commit it's claiming
caused them:

Yeah, I changed it so that instead of just checking new commits it would 
check the whole tree. The problem with the incremental approach is that 
the next run it might turn green again but the issue would not have been 
fixed.

Ah.

I think this is a one-off issue. Once we clean up the tree the problem 
would disappear and the commits it shows would be correct. I imaging 
that's going to happen any day now?

I can go fix the problems now that we know there are some (already).
However, if what you're saying is that koel only checks recently-changed
files, that's going to be pretty misleading in future too.  If people
don't react to such reports right away, they'll disappear, no?

That's what would have happened if I hadn't changed the way it worked (and that's why I changed it). Now it doesn't just check recent commits, it checks the whole tree, and will stay red until the tree is fixed.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

11 August 2023, 20:59:40

On Sat, Jun 17, 2023 at 7:08 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> I have set up a new buildfarm animal called koel which will run the module.

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

I don't think that it makes sense to invent yet another rule for
.git-blame-ignore-revs, though. Will we need another buildfarm member
to enforce that rule, too?

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

11 August 2023, 21:25:12

Hi,

On 2023-08-11 13:59:40 -0700, Peter Geoghegan wrote:
> On Sat, Jun 17, 2023 at 7:08 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> > I have set up a new buildfarm animal called koel which will run the module.
> 
> I'm starting to have doubts about this policy. There have now been
> quite a few follow-up "fixes" to indentation issues that koel
> complained about. None of these fixups have been included in
> .git-blame-ignore-revs. If things continue like this then "git blame"
> is bound to become much less usable over time.

I'm not sure I buy that that's going to be a huge problem - most of the time
such fixups are pretty small compared to larger reindents.


> I don't think that it makes sense to invent yet another rule for
> .git-blame-ignore-revs, though. Will we need another buildfarm member
> to enforce that rule, too?

We could a test that fails when there's some mis-indented code. That seems
like it might catch things earlier?

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

11 August 2023, 21:48:09

On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:
> > I don't think that it makes sense to invent yet another rule for
> > .git-blame-ignore-revs, though. Will we need another buildfarm member
> > to enforce that rule, too?
>
> We could a test that fails when there's some mis-indented code. That seems
> like it might catch things earlier?

It definitely would. That would go a long way towards addressing my
concerns. But I suspect that that would run into problems that stem
from the fact that the buildfarm is testing something that isn't all
that simple. Don't typedefs need to be downloaded from some other
blessed buildfarm animal?

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

11 August 2023, 22:30:02

Peter Geoghegan <pg@bowt.ie> writes:
> On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:
>> We could a test that fails when there's some mis-indented code. That seems
>> like it might catch things earlier?

+1 for including this in CI tests

> It definitely would. That would go a long way towards addressing my
> concerns. But I suspect that that would run into problems that stem
> from the fact that the buildfarm is testing something that isn't all
> that simple. Don't typedefs need to be downloaded from some other
> blessed buildfarm animal?

No.  I presume koel is using src/tools/pgindent/typedefs.list,
which has always been the "canonical" list but up to now we've
been lazy about maintaining it.  Part of the new regime is that
typedefs.list should now be updated on-the-fly by patches that
add new typedefs.

We should still compare against the buildfarm's list periodically;
but I imagine that the primary result of that will be to remove
no-longer-used typedefs from typedefs.list.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

11 August 2023, 22:46:40

On Fri, Aug 11, 2023 at 3:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No.  I presume koel is using src/tools/pgindent/typedefs.list,
> which has always been the "canonical" list but up to now we've
> been lazy about maintaining it.  Part of the new regime is that
> typedefs.list should now be updated on-the-fly by patches that
> add new typedefs.

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

> We should still compare against the buildfarm's list periodically;
> but I imagine that the primary result of that will be to remove
> no-longer-used typedefs from typedefs.list.

I believe that I came up with my current workflow due to the
difficulty of maintaining the typedef file itself. Random
platform/binutils implementation details created a lot of noise,
presumably because my setup wasn't exactly the same as Bruce's setup,
in whatever way. For example, the order of certain lines would change,
in a way that had nothing whatsoever to do with structs that my patch
added.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

11 August 2023, 23:02:06

Peter Geoghegan <pg@bowt.ie> writes:
> My workflow up until now has avoiding making updates to typedefs.list
> in patches. I only update typedefs locally, for long enough to indent
> my code. The final patch doesn't retain any typedefs.list changes.

Yeah, I've done the same and will have to stop.

> I guess that I can't do that anymore. Hopefully maintaining the
> typedefs.list file isn't as inconvenient as it once seemed to me to
> be.

I don't think it'll be a problem.  If your rule is "add new typedef
names added by your patch to typedefs.list, keeping them in
alphabetical order" then it doesn't seem very complicated, and
hopefully conflicts between concurrently-developed patches won't
be common.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

11 August 2023, 23:17:03

Peter Geoghegan <pg@bowt.ie> writes:
> I'm starting to have doubts about this policy. There have now been
> quite a few follow-up "fixes" to indentation issues that koel
> complained about. None of these fixups have been included in
> .git-blame-ignore-revs. If things continue like this then "git blame"
> is bound to become much less usable over time.

FWIW, I'm much more optimistic than that.  I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that.  If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

11 August 2023, 23:18:04

On Fri, 11 Aug 2023 at 23:00, Peter Geoghegan <pg@bowt.ie> wrote:
> I'm starting to have doubts about this policy. There have now been
> quite a few follow-up "fixes" to indentation issues that koel
> complained about.

I think one thing that would help a lot in reducing the is for
committers to set up the local git commit hook that's on the wiki:
https://wiki.postgresql.org/wiki/Working_with_Git

That one fails the commit if there's wrongly indented files in the
commit. And if you still want to opt out for whatever reason you can
use git commit --no-verify

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

11 August 2023, 23:20:24

Hi,

On 2023-08-11 18:30:02 -0400, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Fri, Aug 11, 2023 at 2:25 PM Andres Freund <andres@anarazel.de> wrote:
> >> We could a test that fails when there's some mis-indented code. That seems
> >> like it might catch things earlier?
> 
> +1 for including this in CI tests

I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
course would include CI automatically.


> > It definitely would. That would go a long way towards addressing my
> > concerns. But I suspect that that would run into problems that stem
> > from the fact that the buildfarm is testing something that isn't all
> > that simple. Don't typedefs need to be downloaded from some other
> > blessed buildfarm animal?
> 
> No.  I presume koel is using src/tools/pgindent/typedefs.list,
> which has always been the "canonical" list but up to now we've
> been lazy about maintaining it.  Part of the new regime is that
> typedefs.list should now be updated on-the-fly by patches that
> add new typedefs.

Yea. Otherwise nobody else can indent reliably, without repeating the work of
adding typedefs.list entries of all the patches since the last time it was
updated in the repository.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 August 2023, 00:11:34

Andres Freund <andres@anarazel.de> writes:
> On 2023-08-11 18:30:02 -0400, Tom Lane wrote:
>> +1 for including this in CI tests

> I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
> course would include CI automatically.

Hmm.  I'm allergic to anything that significantly increases the cost
of check-world, and this seems like it'd do that.

Maybe we could automate it, but not as part of check-world per se?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

12 August 2023, 00:27:38

On Fri, Aug 11, 2023 at 08:11:34PM -0400, Tom Lane wrote:
> Hmm.  I'm allergic to anything that significantly increases the cost
> of check-world, and this seems like it'd do that.
>
> Maybe we could automate it, but not as part of check-world per se?

It does not have to be part of check-world by default, as we could
make it optional with PG_TEST_EXTRA.  I bet that most committers set
this option for most of the test suites anyway, so the extra cost is
OK from here.  I don't find a single indent run to be that costly,
especially with parallelism:
$ time ./src/tools/pgindent/pgindent .
real 0m5.039s
user 0m3.403s
sys 0m1.540s
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

12 August 2023, 00:46:42

Hi,

On 2023-08-11 20:11:34 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-08-11 18:30:02 -0400, Tom Lane wrote:
> >> +1 for including this in CI tests
>
> > I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
> > course would include CI automatically.
>
> Hmm.  I'm allergic to anything that significantly increases the cost
> of check-world, and this seems like it'd do that.

Hm, compared to the cost of check-world it's not that large, but still,
annoying to make it larger.

We can make it lot cheaper, but perhaps not in a general enough fashion that
it's suitable for a test.

pgindent already can query git (for --commit). We could teach pgindent to
ask git what remote branch is being tracked, and constructed a list of files
of the difference between the remote branch and the local branch?

That option could do something like:
git diff --name-only $(git rev-parse --abbrev-ref --symbolic-full-name @{upstream})

That's pretty quick, even for a relatively large delta.

> Maybe we could automate it, but not as part of check-world per se?

We should definitely do that.  Another related thing that'd be useful to
script is updating typedefs.list with the additional typedefs found
locally. Right now the script for that still lives in the

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

12 August 2023, 15:57:25

On 2023-08-11 Fr 19:17, Tom Lane wrote:

Peter Geoghegan <pg@bowt.ie> writes:

I'm starting to have doubts about this policy. There have now been
quite a few follow-up "fixes" to indentation issues that koel
complained about. None of these fixups have been included in
.git-blame-ignore-revs. If things continue like this then "git blame"
is bound to become much less usable over time.

FWIW, I'm much more optimistic than that.  I think what we're seeing
is just the predictable result of not all committers having yet
incorporated "pgindent it before committing" into their workflow.
The need for followup fixes should diminish as people start doing
that.  If you want to hurry things along, peer pressure on committers
who clearly aren't bothering is the solution.

Yeah, part of the point of creating koel was to give committers a bit of a nudge in that direction.

With a git pre-commit hook it's pretty painless.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

12 August 2023, 21:03:37

On 2023-08-11 Fr 19:02, Tom Lane wrote:

Peter Geoghegan <pg@bowt.ie> writes:

My workflow up until now has avoiding making updates to typedefs.list
in patches. I only update typedefs locally, for long enough to indent
my code. The final patch doesn't retain any typedefs.list changes.

Yeah, I've done the same and will have to stop.

I guess that I can't do that anymore. Hopefully maintaining the
typedefs.list file isn't as inconvenient as it once seemed to me to
be.

I don't think it'll be a problem.  If your rule is "add new typedef
names added by your patch to typedefs.list, keeping them in
alphabetical order" then it doesn't seem very complicated, and
hopefully conflicts between concurrently-developed patches won't
be common.

My recollection is that missing typedefs cause indentation that kinda sticks out like a sore thumb.

The reason we moved to a buildfarm based typedefs list was that some typedefs are platform dependent, so any list really needs to be the union of the found typedefs on various platforms, and the buildfarm was a convenient vehicle for doing that. But that doesn't mean you shouldn't manually add a typedef you have added in your code.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

12 August 2023, 21:14:04

Hi,

On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:
> On 2023-08-11 Fr 19:02, Tom Lane wrote:
> > Peter Geoghegan<pg@bowt.ie>  writes:
> > > My workflow up until now has avoiding making updates to typedefs.list
> > > in patches. I only update typedefs locally, for long enough to indent
> > > my code. The final patch doesn't retain any typedefs.list changes.
> > Yeah, I've done the same and will have to stop.
> > 
> > > I guess that I can't do that anymore. Hopefully maintaining the
> > > typedefs.list file isn't as inconvenient as it once seemed to me to
> > > be.
> > I don't think it'll be a problem.  If your rule is "add new typedef
> > names added by your patch to typedefs.list, keeping them in
> > alphabetical order" then it doesn't seem very complicated, and
> > hopefully conflicts between concurrently-developed patches won't
> > be common.
>
> My recollection is that missing typedefs cause indentation that kinda sticks
> out like a sore thumb.
> 
> The reason we moved to a buildfarm based typedefs list was that some
> typedefs are platform dependent, so any list really needs to be the union of
> the found typedefs on various platforms, and the buildfarm was a convenient
> vehicle for doing that. But that doesn't mean you shouldn't manually add a
> typedef you have added in your code.

It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that *adds* (but doesn't remove) local typedefs would make this less
painful.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

12 August 2023, 22:46:57

Andres Freund <andres@anarazel.de> writes:
> On 2023-08-12 17:03:37 -0400, Andrew Dunstan wrote:
>> My recollection is that missing typedefs cause indentation that kinda sticks
>> out like a sore thumb.

Yeah, it's usually pretty obvious: "typedef *var" gets changed to
"typedef * var", and similar oddities.

> It's a somewhat annoying task though, find all the typedefs, add them to the
> right place in the file (we have an out of order entry right now). I think a
> script that *adds* (but doesn't remove) local typedefs would make this less
> painful.

My practice has always been "add typedefs until pgindent doesn't do
anything I don't want".  If you have a new typedef that doesn't happen
to be used in a way that pgindent mangles, it's not that critical
to get it into the file right away.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

13 August 2023, 00:13:27

On Sat, Aug 12, 2023 at 3:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > It's a somewhat annoying task though, find all the typedefs, add them to the
> > right place in the file (we have an out of order entry right now). I think a
> > script that *adds* (but doesn't remove) local typedefs would make this less
> > painful.
>
> My practice has always been "add typedefs until pgindent doesn't do
> anything I don't want".  If you have a new typedef that doesn't happen
> to be used in a way that pgindent mangles, it's not that critical
> to get it into the file right away.

We seem to be seriously contemplating making every patch author do
this every time they need to get the tests to pass (after adding or
renaming a struct). Is that really an improvement over the old status
quo?

In principle I'm in favor of strictly enforcing indentation rules like
this. But it seems likely that our current tooling just isn't up to
the task.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

13 August 2023, 00:20:17

Peter Geoghegan <pg@bowt.ie> writes:
> We seem to be seriously contemplating making every patch author do
> this every time they need to get the tests to pass (after adding or
> renaming a struct). Is that really an improvement over the old status
> quo?

Hm.  I was envisioning that we should expect committers to deal
with this, not original patch submitters.  So that's an argument
against including it in the CI tests.  But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

13 August 2023, 00:53:29

On Sat, Aug 12, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hm.  I was envisioning that we should expect committers to deal
> with this, not original patch submitters.  So that's an argument
> against including it in the CI tests.  But I'm in favor of anything
> we can do to make it more painless for committers to fix up patch
> indentation.

Making it a special responsibility for committers comes with the same
set of problems that we see with catversion bumps. People are much
more likely to forget to do something that must happen last.

Maybe I'm wrong -- maybe the new policy is practicable. It might even
turn out to be worth the bother. Time will tell.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

13 August 2023, 14:33:21

On 2023-08-12 Sa 20:53, Peter Geoghegan wrote:

On Sat, Aug 12, 2023 at 5:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm.  I was envisioning that we should expect committers to deal
with this, not original patch submitters.  So that's an argument
against including it in the CI tests.  But I'm in favor of anything
we can do to make it more painless for committers to fix up patch
indentation.

I agree with this.

Making it a special responsibility for committers comes with the same
set of problems that we see with catversion bumps. People are much
more likely to forget to do something that must happen last.

After I'd been caught by this once or twice I implemented a git hook test for that too - in fact it was the first hook I did. It's not perfect but it's saved me a couple of times:

check_catalog_version () {

    # only do this on master
    test "$branch" = "master" || return 0

    case "$files" in
        *src/include/catalog/catversion.h*)
            return 0;
            ;;
        *src/include/catalog/*)
            ;;
        *)
            return 0;
            ;;
    esac

    # changes include catalog but not catversion.h, so warn about it
    {
        echo 'Commit on master alters catalog but catversion not bumped'
        echo 'It can be forced with git commit --no-verify'
    } >&2

    exit 1
}

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

13 August 2023, 23:33:40

On Sun, Aug 13, 2023 at 10:33:21AM -0400, Andrew Dunstan wrote:
> After I'd been caught by this once or twice I implemented a git hook test
> for that too - in fact it was the first hook I did. It's not perfect but
> it's saved me a couple of times:
>
> check_catalog_version () {

I find that pretty cool.  Thanks for sharing.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

14 August 2023, 14:04:04

On 12.08.23 23:14, Andres Freund wrote:
> It's a somewhat annoying task though, find all the typedefs, add them to the
> right place in the file (we have an out of order entry right now). I think a
> script that*adds*  (but doesn't remove) local typedefs would make this less
> painful.

I was puzzled once that there does not appear to be such a script 
available.  Whatever the buildfarm does (before it merges it all 
together) should be available locally.  Then the workflow could be

type type type
compile
update typedefs
pgindent
commit

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

14 August 2023, 14:08:30

On 12.08.23 02:11, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2023-08-11 18:30:02 -0400, Tom Lane wrote:
>>> +1 for including this in CI tests
> 
>> I didn't even mean CI - I meant 'make check-world' / 'meson test'. Which of
>> course would include CI automatically.
> 
> Hmm.  I'm allergic to anything that significantly increases the cost
> of check-world, and this seems like it'd do that.
> 
> Maybe we could automate it, but not as part of check-world per se?

Also, during development, the code in progress is not always perfectly 
formatted, but I do want to keep running the test suites.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

14 August 2023, 19:58:34

On 2023-08-14 Mo 10:04, Peter Eisentraut wrote:

On 12.08.23 23:14, Andres Freund wrote:
It's a somewhat annoying task though, find all the typedefs, add them to the
right place in the file (we have an out of order entry right now). I think a
script that*adds* (but doesn't remove) local typedefs would make this less
painful.

I was puzzled once that there does not appear to be such a script available. Whatever the buildfarm does (before it merges it all together) should be available locally. Then the workflow could be

type type type
compile
update typedefs
pgindent
commit

It's a bit more complicated :-)

You can see what the buildfarm does at <https://github.com/PGBuildFarm/client-code/blob/ec4cf43613a74cb88f228efcde09931cf9fd57e7/run_build.pl#L2562> It's been somewhat fragile over the years, which most people other than Tom and I have probably not noticed.

On most platforms it needs postgres to have been installed before looking for the typedefs.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Nathan Bossart

Date:

15 August 2023, 20:31:09

On Fri, Aug 11, 2023 at 01:59:40PM -0700, Peter Geoghegan wrote:
> I'm starting to have doubts about this policy. There have now been
> quite a few follow-up "fixes" to indentation issues that koel
> complained about. None of these fixups have been included in
> .git-blame-ignore-revs. If things continue like this then "git blame"
> is bound to become much less usable over time.

Should we add those?  Patch attached.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

v1-0001-Add-a-few-recent-commits-to-.git-blame-ignore-rev.patch

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

16 August 2023, 20:15:55

On Tue, Aug 15, 2023 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Should we add those?  Patch attached.

I think that that makes sense. I just don't want to normalize updating
.git-blame-ignore-revs very frequently. (Actually, it's more like I
don't want to normalize any scheme that makes updating the ignore list
very frequently start to seem reasonable.)

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Nathan Bossart

Date:

17 August 2023, 14:40:41

On Wed, Aug 16, 2023 at 01:15:55PM -0700, Peter Geoghegan wrote:
> On Tue, Aug 15, 2023 at 1:31 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> Should we add those?  Patch attached.
> 
> I think that that makes sense.

Committed.

> I just don't want to normalize updating
> .git-blame-ignore-revs very frequently. (Actually, it's more like I
> don't want to normalize any scheme that makes updating the ignore list
> very frequently start to seem reasonable.)

Agreed.  I've found myself habitually running pgindent since becoming a
committer, but I'm sure I'll forget it one of these days.  From a quick
skim of this thread, it sounds like a pre-commit hook [0] might be the best
option at the moment.

[0] https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

16 October 2023, 00:52:17

On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Maybe I'm wrong -- maybe the new policy is practicable. It might even
> turn out to be worth the bother. Time will tell.

(Two months pass.)

There were two independent fixup commits to address complaints from
koel just today (from two different committers). Plus there was a
third issue (involving a third committer) this past Wednesday.

This policy isn't working.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 00:45:00

Peter Geoghegan <pg@bowt.ie> writes:
> There were two independent fixup commits to address complaints from
> koel just today (from two different committers). Plus there was a
> third issue (involving a third committer) this past Wednesday.

> This policy isn't working.

Two thoughts about that:

1. We should not commit indent fixups on behalf of somebody else's
misindented commit.  Peer pressure on committers who aren't being
careful about this is the only way to improve matters; so complain
to the person at fault until they fix it.

2. We could raise awareness of this issue by adding indent verification
to CI testing.  I hesitate to suggest that, though, for a couple of
reasons:
   2a. It seems fairly expensive, though I might be misjudging.
   2b. It's often pretty handy to submit patches that aren't fully
   indent-clean; I have such a patch in flight right now at [1].

2b could be ameliorated by making the indent check be a separate
test process that doesn't obscure the results of other testing.

            regards, tom lane

[1] https://www.postgresql.org/message-id/2617358.1697501956%40sss.pgh.pa.us

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 01:22:51

On Mon, Oct 16, 2023 at 5:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Two thoughts about that:
>
> 1. We should not commit indent fixups on behalf of somebody else's
> misindented commit.  Peer pressure on committers who aren't being
> careful about this is the only way to improve matters; so complain
> to the person at fault until they fix it.

Thomas Munro's recent commit 01529c704008 was added to
.git-blame-ignore-revs by Michael Paquier, despite the fact that
Munro's commit technically isn't just a pure indentation fix (it also
fixed some typos). It's hard to judge Michael too harshly for this,
since in general it's harder to commit things when koel is already
complaining about existing misindetation -- I get why he'd prefer to
take care of that first.

> 2. We could raise awareness of this issue by adding indent verification
> to CI testing.  I hesitate to suggest that, though, for a couple of
> reasons:
>    2a. It seems fairly expensive, though I might be misjudging.
>    2b. It's often pretty handy to submit patches that aren't fully
>    indent-clean; I have such a patch in flight right now at [1].

It's also often handy to make a minor change to a comment or something
at the last minute, without necessarily having the comment indented
perfectly.

> 2b could be ameliorated by making the indent check be a separate
> test process that doesn't obscure the results of other testing.

I was hoping that "go back to the old status quo" would also appear as
an option.

My main objection to the new policy is that it's not quite clear what
process I should go through in order to be 100% confident that koel
won't start whining (short of waiting around for koel to whine). I
know how to run pgindent, of course, and haven't had any problems so
far...but it still seems quite haphazard. If we're going to make this
a hard rule, enforced on every commit, it should be dead easy to
comply with the rule.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 01:32:12

Peter Geoghegan <pg@bowt.ie> writes:
> My main objection to the new policy is that it's not quite clear what
> process I should go through in order to be 100% confident that koel
> won't start whining (short of waiting around for koel to whine). I
> know how to run pgindent, of course, and haven't had any problems so
> far...but it still seems quite haphazard. If we're going to make this
> a hard rule, enforced on every commit, it should be dead easy to
> comply with the rule.

But it's *not* a hard rule --- we explicitly rejected mechanisms
that would make it so (such as a precommit hook).  I view "koel
is unhappy" as something that you ought to clean up, but if you
don't get to it for a day or three there's not much harm done.

In theory koel might complain even if you'd locally gotten
clean results from pgindent (as a consequence of skew in the
typedef lists being used, for example).  We've not seen cases
of that so far though.  Right now I think we just need to raise
committers' awareness of this enough that they routinely run
pgindent on the files they're touching.  In the problem cases
so far, they very clearly didn't.  I don't see much point in
worrying about second-order problems until that first-order
problem is tamped down.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 01:57:46

On Mon, Oct 16, 2023 at 6:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But it's *not* a hard rule --- we explicitly rejected mechanisms
> that would make it so (such as a precommit hook).  I view "koel
> is unhappy" as something that you ought to clean up, but if you
> don't get to it for a day or three there's not much harm done.

It's hard to square that with what you said about needing greater peer
pressure on committers.

> Right now I think we just need to raise
> committers' awareness of this enough that they routinely run
> pgindent on the files they're touching.  In the problem cases
> so far, they very clearly didn't.  I don't see much point in
> worrying about second-order problems until that first-order
> problem is tamped down.

Realistically, if you're the committer that broke koel, you are at
least the subject of mild disapproval -- you have likely
inconvenienced others. I always try to avoid that -- it pretty much
rounds up to "hard rule" in my thinking. Babysitting koel really does
seem like it could cut into my dinner plans or what have you.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

17 October 2023, 02:57:01

On Mon, Oct 16, 2023 at 08:45:00PM -0400, Tom Lane wrote:
> 2. We could raise awareness of this issue by adding indent verification
> to CI testing.  I hesitate to suggest that, though, for a couple of
> reasons:
>    2a. It seems fairly expensive, though I might be misjudging.
>    2b. It's often pretty handy to submit patches that aren't fully
>    indent-clean; I have such a patch in flight right now at [1].
>
> 2b could be ameliorated by making the indent check be a separate
> test process that doesn't obscure the results of other testing.

I see an extra reason with not doing that: this increases the
difficulty when it comes to send and maintain patches to the lists and
newcomers would need to learn more tooling.  I don't think that we
should make that more complicated for code-formatting reasons.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 10:34:35

On Tue, 17 Oct 2023 at 03:23, Peter Geoghegan <pg@bowt.ie> wrote:
> My main objection to the new policy is that it's not quite clear what
> process I should go through in order to be 100% confident that koel
> won't start whining (short of waiting around for koel to whine). I
> know how to run pgindent, of course, and haven't had any problems so
> far...but it still seems quite haphazard. If we're going to make this
> a hard rule, enforced on every commit, it should be dead easy to
> comply with the rule.

I think *it is* dead easy to comply. If you run the following commands
before committing/after rebasing, then koel should always be happy:

src/tools/pgindent/pgindent src # works always but a bit slow
src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
# much faster, but only works if you DID NOT change typedefs.list

If you have specific cases where it does not work. Then I think we
should talk about those/fix them. But looking at the last few commits
in .git-blame-ignore-revs I only see examples of people simply not
running pgindent before they commit.

I guess it's easy to forget, but that's why the wiki contains a
pre-commit hook[1] that you can use to remind yourself/run pgindent
automatically. The only annoying thing is that it does not trigger
when rebasing, but you can work around that by using rebase its -x
flag[2].

[1]: https://wiki.postgresql.org/wiki/Working_with_Git#Using_git_hooks
[2]: https://adamj.eu/tech/2022/11/07/pre-commit-run-hooks-rebase/

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 10:49:13

On Tue, 17 Oct 2023 at 04:57, Michael Paquier <michael@paquier.xyz> wrote:
> I see an extra reason with not doing that: this increases the
> difficulty when it comes to send and maintain patches to the lists and
> newcomers would need to learn more tooling.  I don't think that we
> should make that more complicated for code-formatting reasons.

Honestly, I don't think it's a huge hurdle for newcomers. Most open
source projects have a CI job that runs automatic code formatting, so
it's a pretty common thing for contributors to deal with. And as long
as we keep it a separate CI job from the normal tests, committers can
even choose to commit the patch if the formatting job fails, after
running pgindent themselves.

And personally as a contributor it's a much nicer experience to see
quickly in CI that I messed up the code style, then to hear it a
week/month later in an email when someone took the time to review and
mentions the styling is way off all over the place.

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 12:45:44

On Sun, Oct 15, 2023 at 8:52 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Maybe I'm wrong -- maybe the new policy is practicable. It might even
> > turn out to be worth the bother. Time will tell.
>
> (Two months pass.)
>
> There were two independent fixup commits to address complaints from
> koel just today (from two different committers). Plus there was a
> third issue (involving a third committer) this past Wednesday.
>
> This policy isn't working.

+1. I think this is more annoying than the status quo ante.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 13:52:36

On Tue, Oct 17, 2023 at 6:34 AM Jelte Fennema <postgres@jeltef.nl> wrote:
> I think *it is* dead easy to comply. If you run the following commands
> before committing/after rebasing, then koel should always be happy:
>
> src/tools/pgindent/pgindent src # works always but a bit slow
> src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
> # much faster, but only works if you DID NOT change typedefs.list

In isolation, that's true, but the list of mistakes that you can make
while committing which will inconvenience everyone working on the
project is very long. Another one that comes up frequently is
forgetting to bump CATALOG_VERSION_NO, but you also need a good commit
message, and good comments, and a good Discussion link in the commit
message, and the right list of authors and reviewers, and to update
the docs (with spaces, not tabs) and the Makefiles (with tabs, not
spaces) and the meson stuff and, as if that weren't enough already,
you actually need the code to work! And that includes not only working
regularly but also with CLOBBER_CACHE_ALWAYS and debug_parallel_query
and so on. It's very easy to miss something somewhere. I put a LOT of
work into polishing my commits before I push them, and it's still not
that uncommon that I screw something up.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 14:03:54

On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > This policy isn't working.
>
> +1. I think this is more annoying than the status quo ante.

Although ... I do think it's spared me some rebasing pain, and that
does have some real value. I wonder if we could think of other
alternatives. For example, maybe we could have a bot. If you push a
commit that's not indented properly, the bot reindents the tree,
updates git-blame-ignore-revs, and sends you an email admonishing you
for your error. Or we could have a server-side hook that will refuse
the misindented commit, with some kind of override for emergency
situations. What I really dislike about the current situation is that
it's doubling down on the idea that committers have to be perfect and
get everything right every time. Turns out, that's hard to do. If not,
why do people keep screwing things up? Somebody could theorize - and
this seems to be Tom and Jelte's theory, though perhaps I'm
misinterpreting their comments - that the people who have made
mistakes here are just lazy, and what they need to do is up their
game.

But I don't buy that. First, I think that most of our committers are
pretty intelligent and hard-working people who are trying to do the
right thing. We can't all be Tom Lane, no matter how hard we may try.
Second, even if it were true that the offending committers are "just
lazy," all of our contributors and many senior non-committer
contributors are people who have put thousands, if not tens of
thousands, of hours into the project. Making them feel bad serves us
poorly. At the end of the day, it doesn't matter whether it's too much
of a pain for the perfect committers we'd like to have. It matters
whether it's too much of a pain for the human committers that we do
have.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 14:23:22

Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> +1. I think this is more annoying than the status quo ante.

> Although ... I do think it's spared me some rebasing pain, and that
> does have some real value. I wonder if we could think of other
> alternatives.

An alternative I was thinking about after reading your earlier email
was going back to the status quo ante, but doing the manual tree-wide
reindents significantly more often than once a year.  Adding one at
the conclusion of each commitfest would be a natural thing to do,
for instance.  It's hard to say what frequency would lead to the
least rebasing pain, but we know once-a-year isn't ideal.

> For example, maybe we could have a bot. If you push a
> commit that's not indented properly, the bot reindents the tree,
> updates git-blame-ignore-revs, and sends you an email admonishing you
> for your error.

I'm absolutely not in favor of completely-automated reindents.
pgindent is a pretty stupid tool and it will sometimes do stupid
things, which you have to correct for by tweaking the input
formatting.  The combination of the tool and human supervision
generally produces pretty good results, but the tool alone
not so much.

> Or we could have a server-side hook that will refuse
> the misindented commit, with some kind of override for emergency
> situations.

Even though I'm in the camp that would like the tree correctly
indented at all times, I remain very much against a commit hook.
I think that'd be significantly more annoying than the current
situation, which you're already unhappy about the annoying-ness of.

The bottom line here, I think, is that there's a subset of committers
that would like perfectly mechanically-indented code at all times,
and there's another subset that just doesn't care that much.
We don't (and shouldn't IMO) have a mechanism to let one set force
their views on the other set.  The current approach is clearly
insufficient for that, and I don't think trying to institute stronger
enforcement is going to make anybody happy.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 14:55:36

On Tue, Oct 17, 2023 at 7:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Tue, Oct 17, 2023 at 8:45 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >> +1. I think this is more annoying than the status quo ante.
>
> > Although ... I do think it's spared me some rebasing pain, and that
> > does have some real value. I wonder if we could think of other
> > alternatives.
>
> An alternative I was thinking about after reading your earlier email
> was going back to the status quo ante, but doing the manual tree-wide
> reindents significantly more often than once a year.  Adding one at
> the conclusion of each commitfest would be a natural thing to do,
> for instance.  It's hard to say what frequency would lead to the
> least rebasing pain, but we know once-a-year isn't ideal.

That seems like the best alternative we have. The old status quo did
occasionally allow code with indentation that *clearly* wasn't up to
project standards to slip in. It could stay that way for quite a few
months at a time. That wasn't great either.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 15:00:35

On Tue, 17 Oct 2023 at 16:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Or we could have a server-side hook that will refuse
> > the misindented commit, with some kind of override for emergency
> > situations.
>
> Even though I'm in the camp that would like the tree correctly
> indented at all times, I remain very much against a commit hook.
> I think that'd be significantly more annoying than the current
> situation, which you're already unhappy about the annoying-ness of.

Why do you think that would be significantly more annoying than the
current situation? Instead of getting delayed feedback you get instant
feedback when you push.

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 15:01:44

On Tue, Oct 17, 2023 at 10:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> An alternative I was thinking about after reading your earlier email
> was going back to the status quo ante, but doing the manual tree-wide
> reindents significantly more often than once a year.  Adding one at
> the conclusion of each commitfest would be a natural thing to do,
> for instance.  It's hard to say what frequency would lead to the
> least rebasing pain, but we know once-a-year isn't ideal.

Yes. I suspect once a commitfest still wouldn't be often enough. Maybe
once a month or something would be. But I'm not sure. You might rebase
once over the misindented commit and then have to rebase again over
the indent that fixed it. There's not really anything that quite
substitutes for doing it right on every commit.

> The bottom line here, I think, is that there's a subset of committers
> that would like perfectly mechanically-indented code at all times,
> and there's another subset that just doesn't care that much.
> We don't (and shouldn't IMO) have a mechanism to let one set force
> their views on the other set.  The current approach is clearly
> insufficient for that, and I don't think trying to institute stronger
> enforcement is going to make anybody happy.

I mean, I think we DO have such a mechanism. Everyone agrees that the
buildfarm has to stay green, and we have a buildfarm member that
checks pgindent, so that means everyone has to pgindent. We could
decide to kill that buildfarm member, in which case we go back to
people not having to pgindent, but right now they do.

And if it's going to remain the policy, it's better to enforce that
policy earlier rather than later. I mean, what is the point of having
a system where we let people do the wrong thing and then publicly
embarrass them afterwards? How is that better than preventing them
from doing the wrong thing in the first place? Even if they don't
subjectively feel embarrassed, nobody likes having to log back on in
the evening or the weekend and clean up after something they thought
they were done with.

In fact, that particular experience is one of the worst things about
being a committer. It actively discourages me, at least, from trying
to get other people's patches committed. This particular problem is
minor, but the overall experience of trying to get things committed is
that you have to check 300 things for every patch and if you get every
one of them right then nothing happens and if you get one of them
wrong then you get a bunch of irritated emails criticizing your
laziness, sloppiness, or whatever, and you have to drop everything to
go fix it immediately. What a deal! I'm sure this isn't the only
reason why we have such a huge backlog of patches needing committer
attention, but it sure doesn't help. And there is absolutely zero need
for this to be yet another thing that you can find out you did wrong
in the 1-24 hour period AFTER you type 'git push'.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 15:15:51

On Tue, Oct 17, 2023 at 8:01 AM Robert Haas <robertmhaas@gmail.com> wrote:
> In fact, that particular experience is one of the worst things about
> being a committer. It actively discourages me, at least, from trying
> to get other people's patches committed. This particular problem is
> minor, but the overall experience of trying to get things committed is
> that you have to check 300 things for every patch and if you get every
> one of them right then nothing happens and if you get one of them
> wrong then you get a bunch of irritated emails criticizing your
> laziness, sloppiness, or whatever, and you have to drop everything to
> go fix it immediately. What a deal!

Yep. Enforcing perfect indentation on koel necessitates rechecking
indentation after each and every last-minute fixup affecting C code --
the interactions makes it quite a bit harder to get everything right
on the first push. For example, if I spot an issue with a comment
during final pre-commit review, and fixup that commit, I have to run
pgindent again. On a long enough timeline, I'm going to forget to do
that.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 15:23:10

On Tue, 17 Oct 2023 at 16:04, Robert Haas <robertmhaas@gmail.com> wrote:
> What I really dislike about the current situation is that
> it's doubling down on the idea that committers have to be perfect and
> get everything right every time. Turns out, that's hard to do. If not,
> why do people keep screwing things up? Somebody could theorize - and
> this seems to be Tom and Jelte's theory, though perhaps I'm
> misinterpreting their comments - that the people who have made
> mistakes here are just lazy, and what they need to do is up their
> game.

To clarify, I did not intend to imply people that commit unindented
code are lazy. It's expected that humans forget to run pgindent before
committing from time to time (I do too). That's why I proposed a
server side git hook to reject badly indented commits very early in
this thread. But some others said that buildfarm animals were the way
to go for Postgres development flow. And since I'm not a committer I
left it at that. I was already happy enough that there was consensus
on indenting continuously, so that the semi-regular rebases for the
few open CF entries that I have are a lot less annoying.

But based on the current feedback I think we should seriously consider
a server-side "update" git hook again. People are obviously not
perfect machines. And for whatever reason not everyone installs the
pre-commit hook from the wiki. So the koel keeps complaining. A
server-side hook would solve all of this IMHO.

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 15:23:52

On Tue, Oct 17, 2023 at 11:16 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Yep. Enforcing perfect indentation on koel necessitates rechecking
> indentation after each and every last-minute fixup affecting C code --
> the interactions makes it quite a bit harder to get everything right
> on the first push. For example, if I spot an issue with a comment
> during final pre-commit review, and fixup that commit, I have to run
> pgindent again. On a long enough timeline, I'm going to forget to do
> that.

I also just discovered that my pre-commit hook doesn't work if I pull
commits into master by cherry-picking. I had thought that I could have
my hook just check my commits to master and not all of my local dev
branches where I really don't want to mess with this when I'm just
banging out a rough draft of something. But now I see that I'm going
to need to work harder on this if I actually want it to catch all the
ways I might screw this up.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 15:42:27

On Tue, Oct 17, 2023 at 11:23 AM Jelte Fennema <postgres@jeltef.nl> wrote:
> To clarify, I did not intend to imply people that commit unindented
> code are lazy. It's expected that humans forget to run pgindent before
> committing from time to time (I do too). That's why I proposed a
> server side git hook to reject badly indented commits very early in
> this thread. But some others said that buildfarm animals were the way
> to go for Postgres development flow. And since I'm not a committer I
> left it at that. I was already happy enough that there was consensus
> on indenting continuously, so that the semi-regular rebases for the
> few open CF entries that I have are a lot less annoying.

Thanks for clarifying. I didn't really think you were trying to be
accusatory, but I didn't really understand what else to think either,
so this is helpful context.

> But based on the current feedback I think we should seriously consider
> a server-side "update" git hook again. People are obviously not
> perfect machines. And for whatever reason not everyone installs the
> pre-commit hook from the wiki. So the koel keeps complaining. A
> server-side hook would solve all of this IMHO.

One potential problem with a server-side hook is that if you back-port
a commit to older branches and then push the commits all together
(which is my workflow) then you might get failure to push on some
branches but not others. I don't know if there's any way to avoid
that, but it seems not great. You could think of enforcing the policy
only on master to try to avoid this, but that still leaves a risk that
you manage to push to all the back-branches and not to master.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 15:47:01

On Tue, Oct 17, 2023 at 8:24 AM Robert Haas <robertmhaas@gmail.com> wrote:
> I also just discovered that my pre-commit hook doesn't work if I pull
> commits into master by cherry-picking. I had thought that I could have
> my hook just check my commits to master and not all of my local dev
> branches where I really don't want to mess with this when I'm just
> banging out a rough draft of something. But now I see that I'm going
> to need to work harder on this if I actually want it to catch all the
> ways I might screw this up.

Once you figure all that out, you're still obligated to hand-polish
typedefs.list to be consistent with whatever Bruce's machine's copy of
objdump does (or is it Tom's?). You need to sort the entries so they
kinda look like they originated from the same source as existing
entries, since my Debian machine seems to produce somewhat different
results to RHEL, for whatever reason. It's hard to imagine a worse use
of committer time.

I think that something like this new policy could work if the
underlying tooling was very easy to use and gave perfectly consistent
results on everybody's development machine. Obviously, that just isn't
the case.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 16:03:44

On Tue, 17 Oct 2023 at 17:47, Peter Geoghegan <pg@bowt.ie> wrote:
> Once you figure all that out, you're still obligated to hand-polish
> typedefs.list to be consistent with whatever Bruce's machine's copy of
> objdump does (or is it Tom's?). You need to sort the entries so they
> kinda look like they originated from the same source as existing
> entries, since my Debian machine seems to produce somewhat different
> results to RHEL, for whatever reason. It's hard to imagine a worse use
> of committer time.
>
> I think that something like this new policy could work if the
> underlying tooling was very easy to use and gave perfectly consistent
> results on everybody's development machine. Obviously, that just isn't
> the case.

To make koel pass you don't need to worry about hand-polishing
typedefs.list. koel uses the typedefs.list that's committed into the
repo, just like when you run pgindent yourself. If you forget to
update the typedefs.list with new types, then worst case the pgindent
output will look weird. But it will look weird both on your own
machine and on koel. So afaik the current tooling should give
perfectly consistent results on everybody's development machine. If
you have an example of where it doesn't then we should fix that
problem.

On Tue, 17 Oct 2023 at 17:47, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Tue, Oct 17, 2023 at 8:24 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > I also just discovered that my pre-commit hook doesn't work if I pull
> > commits into master by cherry-picking. I had thought that I could have
> > my hook just check my commits to master and not all of my local dev
> > branches where I really don't want to mess with this when I'm just
> > banging out a rough draft of something. But now I see that I'm going
> > to need to work harder on this if I actually want it to catch all the
> > ways I might screw this up.
>
> Once you figure all that out, you're still obligated to hand-polish
> typedefs.list to be consistent with whatever Bruce's machine's copy of
> objdump does (or is it Tom's?). You need to sort the entries so they
> kinda look like they originated from the same source as existing
> entries, since my Debian machine seems to produce somewhat different
> results to RHEL, for whatever reason. It's hard to imagine a worse use
> of committer time.
>
> I think that something like this new policy could work if the
> underlying tooling was very easy to use and gave perfectly consistent
> results on everybody's development machine. Obviously, that just isn't
> the case.
>
> --
> Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Peter Geoghegan

Date:

17 October 2023, 16:07:05

On Tue, Oct 17, 2023 at 9:03 AM Jelte Fennema <postgres@jeltef.nl> wrote:
> To make koel pass you don't need to worry about hand-polishing
> typedefs.list. koel uses the typedefs.list that's committed into the
> repo, just like when you run pgindent yourself. If you forget to
> update the typedefs.list with new types, then worst case the pgindent
> output will look weird. But it will look weird both on your own
> machine and on koel.

That's beside the point. The point is that I'm obligated to keep
typedef.list up to date in general, a task that is made significantly
harder by random objdump implementation details. And I probably need
to do this not just once per commit, but several times, since in
practice I need to defensively run and rerun pgindent as the patch is
tweaked.

--
Peter Geoghegan

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 16:16:55

Peter Geoghegan <pg@bowt.ie> writes:
> That's beside the point. The point is that I'm obligated to keep
> typedef.list up to date in general, a task that is made significantly
> harder by random objdump implementation details. And I probably need
> to do this not just once per commit, but several times, since in
> practice I need to defensively run and rerun pgindent as the patch is
> tweaked.

Hmm, I've not found it that hard to manage the typedefs list.
If I run pgindent and it adds weird spacing around uses of a new
typedef name, I go "oh, I better add that to the list" and do so.
End of problem.  There's not a requirement that you remove disused
typedef names, nor that you alphabetize perfectly.  I'm content
to update those sorts of details from the buildfarm's list once a
year or so.

This does assume that you inspect pgindent's changes rather than
just accepting them blindly --- but as I commented upthread, the
tool really requires that anyway.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 16:18:51

Robert Haas <robertmhaas@gmail.com> writes:
> One potential problem with a server-side hook is that if you back-port
> a commit to older branches and then push the commits all together
> (which is my workflow) then you might get failure to push on some
> branches but not others. I don't know if there's any way to avoid
> that, but it seems not great. You could think of enforcing the policy
> only on master to try to avoid this, but that still leaves a risk that
> you manage to push to all the back-branches and not to master.

Is that actually possible?  I had the idea that "git push" is an
atomic operation, ie 100% or nothing.  Is it only atomic per-branch?

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 16:19:38

On Tue, Oct 17, 2023 at 12:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Hmm, I've not found it that hard to manage the typedefs list.
> If I run pgindent and it adds weird spacing around uses of a new
> typedef name, I go "oh, I better add that to the list" and do so.
> End of problem.  There's not a requirement that you remove disused
> typedef names, nor that you alphabetize perfectly.  I'm content
> to update those sorts of details from the buildfarm's list once a
> year or so.

+1 to all of that. At least for me, managing typedefs.list isn't the
problem. The problem is remembering to actually do it, and keep it
updated as the patch set is adjusted and rebased.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

17 October 2023, 16:21:00

On Tue, Oct 17, 2023 at 12:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > One potential problem with a server-side hook is that if you back-port
> > a commit to older branches and then push the commits all together
> > (which is my workflow) then you might get failure to push on some
> > branches but not others. I don't know if there's any way to avoid
> > that, but it seems not great. You could think of enforcing the policy
> > only on master to try to avoid this, but that still leaves a risk that
> > you manage to push to all the back-branches and not to master.
>
> Is that actually possible?  I had the idea that "git push" is an
> atomic operation, ie 100% or nothing.  Is it only atomic per-branch?

I believe so. For instance:

[rhaas pgsql]$ git push rhaas
Enumerating objects: 2980, done.
Counting objects: 100% (2980/2980), done.
Delta compression using up to 16 threads
Compressing objects: 100% (940/940), done.
Writing objects: 100% (2382/2382), 454.52 KiB | 7.70 MiB/s, done.
Total 2382 (delta 2024), reused 1652 (delta 1429), pack-reused 0
remote: Resolving deltas: 100% (2024/2024), completed with 579 local objects.
To ssh://git.postgresql.org/users/rhaas/postgres.git
   e434e21e11..2406c4e34c  master -> master
 ! [rejected]              walsummarizer2 -> walsummarizer2 (non-fast-forward)
error: failed to push some refs to
'ssh://git.postgresql.org/users/rhaas/postgres.git'
hint: Updates were rejected because a pushed branch tip is behind its remote
hint: counterpart. Check out this branch and integrate the remote changes
hint: (e.g. 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Maciek Sakrejda

Date:

17 October 2023, 16:53:03

On Tue, Oct 17, 2023, 09:22 Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Oct 17, 2023 at 12:18 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> Is that actually possible? I had the idea that "git push" is an
> atomic operation, ie 100% or nothing. Is it only atomic per-branch?

I believe so.

Git push does have an --atomic flag to treat the entire push as a single operation.

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

17 October 2023, 20:43:04

On Tue, 17 Oct 2023 at 18:53, Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
> Git push does have an --atomic flag to treat the entire push as a single operation.

I decided to play around a bit with server hooks. Attached is a git
"update" hook that rejects pushes to the master branch when the new
HEAD of master does not pass pgindent. It tries to do the minimal
amount of work necessary. Together with the --atomic flag of git push
I think this would work quite well.

Note: It does require that pg_bsd_indent is in PATH. While not perfect
seems like it would be acceptable in practice to me. Its version is
not updated very frequently. So manually updating it on the git server
when we do does not seem like a huge issue to me.

The easiest way to try it out is by cloning the postgres repo in two
different local directories, let's call them A and B. And then
configure directory B to be the origin remote of A. By placing the
update script in B/.git/hooks/ it will execute whenever you push
master from A to B.

Attachment

update

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

17 October 2023, 21:01:38

On Tue, Oct 17, 2023 at 10:43 PM Jelte Fennema <postgres@jeltef.nl> wrote:
>
> On Tue, 17 Oct 2023 at 18:53, Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
> > Git push does have an --atomic flag to treat the entire push as a single operation.
>
> I decided to play around a bit with server hooks. Attached is a git
> "update" hook that rejects pushes to the master branch when the new
> HEAD of master does not pass pgindent. It tries to do the minimal
> amount of work necessary. Together with the --atomic flag of git push
> I think this would work quite well.
>
> Note: It does require that pg_bsd_indent is in PATH. While not perfect
> seems like it would be acceptable in practice to me. Its version is
> not updated very frequently. So manually updating it on the git server
> when we do does not seem like a huge issue to me.

If it doesn't know how to rebuild it, aren't we going to be stuck in a
catch-22 if we need to change it in certain ways? Since an old version
of pg_bsd_indent would reject the patch that might include updating
it. (And when it does, one should expect the push to take quite a long
time, but given the infrequency I agree that part is probably not an
issue)

And unless we're only enforcing it on master, we'd also need to make
provisions for different versions of it on different branches, I
think?

Other than that, I agree it's fairly simple. It does nede a lot more
sandboxing than what's in there now, but that's not too hard of a
problem to solve, *if* this is what we want.

(And of course needs to be integrated with the existing script since
AFAIK you can't chain git hooks unless you do it manually - but that's
mostliy mechanical)

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

17 October 2023, 21:07:46

Magnus Hagander <magnus@hagander.net> writes:
> If it doesn't know how to rebuild it, aren't we going to be stuck in a
> catch-22 if we need to change it in certain ways? Since an old version
> of pg_bsd_indent would reject the patch that might include updating
> it. (And when it does, one should expect the push to take quite a long
> time, but given the infrequency I agree that part is probably not an
> issue)

Everyone who has proposed this has included a caveat that there must
be a way to override the check.  Given that minimal expectation, it
shouldn't be too hard to deal with pg_bsd_indent-updating commits.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

David Rowley

Date:

18 October 2023, 04:40:40

On Wed, 18 Oct 2023 at 01:47, Robert Haas <robertmhaas@gmail.com> wrote:
> On Sat, Aug 12, 2023 at 5:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > This policy isn't working.
>
> +1. I think this is more annoying than the status quo ante.

Maybe there are two camps of committers here; ones who care about
committing correctly indented code and ones who do not.

I don't mean that in a bad way, but if a committer just does not care
about correctly pgindented code then he/she likely didn't suffer
enough pain from how things used to be... having to unindent all the
unrelated indent fixes that were committed since the last pgindent run
I personally found slow/annoying/error-prone.

What I do now seems significantly easier. Assuming it's just 1 commit, just:

perl src/tools/pgindent/pgindent --commit HEAD
git diff # manual check to see if everything looks sane.
git commit -a --fixup HEAD
git rebase -i HEAD~2

If we were to go back to how it was before, then why should I go to
the trouble of unindenting all the unrelated indents from code changed
by other committers since the last pgindent run when those committers
are not bothering to and making my job harder each time they commit
incorrectly indented code.

How many of the committers who have broken koel are repeat offenders?
What is their opinion on this?
Did they just forget once or do they hate the process and want to go back?

I'm personally very grateful for all the work that was done to improve
pgindent and set out the new process. I'd really rather not go back to
how things were.

I agree that it's not nice to add yet another way of breaking the
buildfarm and even more so when the committer did make check-world
before committing. We have --enable-tap-tests, we could have
--enable-indent-checks and have pgindent check the code is correctly
indented during make check-world. Then just not have
--enable-indent-checks in CI.

I think many of us have scripts we use instead of typing out all the
configure options we want.  It's likely pretty easy to add
--enable-indent-checks to those.

David

Re: run pgindent on a regular basis / scripted manner

From

Peter Eisentraut

Date:

18 October 2023, 07:20:55

On 18.10.23 06:40, David Rowley wrote:
> I agree that it's not nice to add yet another way of breaking the 
> buildfarm and even more so when the committer did make check-world 
> before committing. We have --enable-tap-tests, we could have 
> --enable-indent-checks and have pgindent check the code is correctly 
> indented during make check-world. Then just not have 
> --enable-indent-checks in CI.

This approach seems like a good improvement, even independent of 
everything else we might do about this.  Making it easier to use and 
less likely to be forgotten.  Also, this way, non-committer contributors 
can opt-in, if they want to earn bonus points.

Re: run pgindent on a regular basis / scripted manner

From

Magnus Hagander

Date:

18 October 2023, 07:56:38

On Tue, Oct 17, 2023 at 11:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Magnus Hagander <magnus@hagander.net> writes:
> > If it doesn't know how to rebuild it, aren't we going to be stuck in a
> > catch-22 if we need to change it in certain ways? Since an old version
> > of pg_bsd_indent would reject the patch that might include updating
> > it. (And when it does, one should expect the push to take quite a long
> > time, but given the infrequency I agree that part is probably not an
> > issue)
>
> Everyone who has proposed this has included a caveat that there must
> be a way to override the check.  Given that minimal expectation, it
> shouldn't be too hard to deal with pg_bsd_indent-updating commits.

I haven't managed to fully keep up with the thread, so I missed that.
And i can't directly find it looking back either - but as long as
three's an actual idea for how to do that, the problem goes away :)

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

18 October 2023, 08:21:44

On Tue, 17 Oct 2023 at 23:01, Magnus Hagander <magnus@hagander.net> wrote:
> And unless we're only enforcing it on master, we'd also need to make
> provisions for different versions of it on different branches, I
> think?

Only enforcing on master sounds fine to me, that's what koel is doing
too afaik. In practice this seems to be enough to solve my main issue
of having to manually remove unrelated indents when rebasing my
patches. Enforcing on different branches seems like it would add a lot
of complexity. So I'm not sure that's worth doing at this point, since
currently some committers are proposing to stop enforcing continuous
indentation because of problems with the current flow. I think we
should only enforce it on more branches once we have the flow
mastered.

> It does need a lot more
> sandboxing than what's in there now, but that's not too hard of a
> problem to solve, *if* this is what we want.

Yeah, I didn't bother with that. Since that seems very tightly coupled
with the environment that the git server is running, and I have no
clue what that environment is.

Re: run pgindent on a regular basis / scripted manner

From

Jelte Fennema

Date:

18 October 2023, 09:07:00

On Wed, 18 Oct 2023 at 06:40, David Rowley <dgrowleyml@gmail.com> wrote:
> How many of the committers who have broken koel are repeat offenders?

I just checked the commits and there don't seem to be real repeat
offenders. The maximum number of times someone broke koel since its
inception is two. That was the case for only two people. The other 8
people only caused one breakage.

> What is their opinion on this?
> Did they just forget once or do they hate the process and want to go back?

The commiters that broke koel since its inception are:
- Alexander Korotkov
- Amit Kapila
- Amit Langote
- Andres Freund
- Etsuro Fujita
- Jeff Davis
- Michael Paquier
- Peter Eisentraut
- Tatsuo Ishii
- Tomas Vondra

I included all of them in the To field of this message, in the hope
that they share their viewpoint. Because otherwise it stays guessing
what they think.

But based on the contents of the fixup commits a commonality seems to
be that the fixup only fixes a few lines, quite often touching only
comments. So it seems like the main reason for breaking koel is
forgetting to re-run pgindent after some final cleanup/wording
changes/typo fixes. And that seems like an expected flaw of being
human instead of a robot, which can only be worked around with better
automation.

> I agree that it's not nice to add yet another way of breaking the
> buildfarm and even more so when the committer did make check-world
> before committing. We have --enable-tap-tests, we could have
> --enable-indent-checks and have pgindent check the code is correctly
> indented during make check-world. Then just not have
> --enable-indent-checks in CI.

I think --enable-indent-checks sounds like a good improvement to the
status quo. But I'm not confident that it will help remove the cases
where only a comment needs to be re-indented. Do commiters really
always run check-world again when only changing a typo in a comment? I
know I probably wouldn't (or at least not always).

Re: run pgindent on a regular basis / scripted manner

From

David Rowley

Date:

18 October 2023, 09:34:45

On Wed, 18 Oct 2023 at 22:07, Jelte Fennema <postgres@jeltef.nl> wrote:
> But based on the contents of the fixup commits a commonality seems to
> be that the fixup only fixes a few lines, quite often touching only
> comments. So it seems like the main reason for breaking koel is
> forgetting to re-run pgindent after some final cleanup/wording
> changes/typo fixes. And that seems like an expected flaw of being
> human instead of a robot, which can only be worked around with better
> automation.

I wonder if you might just be assuming these were caused by
last-minute comment adjustments. I may have missed something on the
thread, but it could be that, or it could be due to the fact that
pgindent just simply does more adjustments to comments than it does
with code lines.

On Wed, 18 Oct 2023 at 06:40, David Rowley <dgrowleyml@gmail.com> wrote:
> > I agree that it's not nice to add yet another way of breaking the
> > buildfarm and even more so when the committer did make check-world
> > before committing. We have --enable-tap-tests, we could have
> > --enable-indent-checks and have pgindent check the code is correctly
> > indented during make check-world. Then just not have
> > --enable-indent-checks in CI.
>
> I think --enable-indent-checks sounds like a good improvement to the
> status quo. But I'm not confident that it will help remove the cases
> where only a comment needs to be re-indented. Do commiters really
> always run check-world again when only changing a typo in a comment? I
> know I probably wouldn't (or at least not always).

I can't speak for others, but I always make edits in a dev branch and
do "make check-world" before doing "git format-patch" before I "git
am" that patch into a clean repo.  Before I push, I'll always run
"make check-world" again as sometimes master might have moved on a few
commits from where the dev branch was taken (perhaps I need to update
the expected output of some newly added EXPLAIN tests if say doing a
planner adjustment).  I personally never adjust any code or comments
after the git am. I only sometimes adjust the commit message.

So, in theory at least, if --enable-indent-checks existed and I used
it, I shouldn't break koel...  let's see if I just jinxed myself.

It would be good to learn how many of the committers out of the ones
you listed that --enable-indent-checks would have saved from breaking
koel.

David

Re: run pgindent on a regular basis / scripted manner

From

Vik Fearing

Date:

18 October 2023, 13:04:13

On 10/17/23 16:23, Tom Lane wrote:
> An alternative I was thinking about after reading your earlier email was 
> going back to the status quo ante, but doing the manual tree-wide 
> reindents significantly more often than once a year. Adding one at the 
> conclusion of each commitfest would be a natural thing to do, for 
> instance. It's hard to say what frequency would lead to the least 
> rebasing pain, but we know once-a-year isn't ideal.

This is basically how the SQL Committee functions.  The Change Proposals 
(patches) submitted every meeting (commitfest) are always against the 
documents as they exist after the application of papers (commits) from 
the previous meeting.

One major difference is that Change Proposals are against the text, and 
patches are against the code.  It is not dissimilar to people saying 
what our documentation should say, and then someone implementing that 
change.

So I am in favor of a pgindent run *at least* at the end of each 
commitfest, giving a full month for patch authors to rebase before the 
next fest.
-- 
Vik Fearing

Re: run pgindent on a regular basis / scripted manner

From

Robert Haas

Date:

18 October 2023, 14:07:46

On Wed, Oct 18, 2023 at 3:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 18.10.23 06:40, David Rowley wrote:
> > I agree that it's not nice to add yet another way of breaking the
> > buildfarm and even more so when the committer did make check-world
> > before committing. We have --enable-tap-tests, we could have
> > --enable-indent-checks and have pgindent check the code is correctly
> > indented during make check-world. Then just not have
> > --enable-indent-checks in CI.
>
> This approach seems like a good improvement, even independent of
> everything else we might do about this.  Making it easier to use and
> less likely to be forgotten.  Also, this way, non-committer contributors
> can opt-in, if they want to earn bonus points.

Yeah. I'm not going to push anything that doesn't pass make
check-world, so this is appealing in that something that I'm already
doing would (or could be configured to) catch this problem also.

--
Robert Haas
EDB: http://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

18 October 2023, 16:45:15

On Tue, Oct 17, 2023 at 11:01:44AM -0400, Robert Haas wrote:
> In fact, that particular experience is one of the worst things about
> being a committer. It actively discourages me, at least, from trying
> to get other people's patches committed. This particular problem is
> minor, but the overall experience of trying to get things committed is
> that you have to check 300 things for every patch and if you get every
> one of them right then nothing happens and if you get one of them
> wrong then you get a bunch of irritated emails criticizing your
> laziness, sloppiness, or whatever, and you have to drop everything to
> go fix it immediately. What a deal! I'm sure this isn't the only
> reason why we have such a huge backlog of patches needing committer
> attention, but it sure doesn't help. And there is absolutely zero need
> for this to be yet another thing that you can find out you did wrong
> in the 1-24 hour period AFTER you type 'git push'.

This comment resonated with me.  I do all my git operations with shell
scripts so I can check for all the mistakes I have made in the past and
generate errors. Even with all of that, committing is an
anxiety-producing activity because any small mistake is quickly revealed
to the world.  There aren't many things I do in a day where mistakes are
so impactful.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

18 October 2023, 19:15:51

Hi,

On 2023-10-16 20:45:00 -0400, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> 2. We could raise awareness of this issue by adding indent verification
> to CI testing.  I hesitate to suggest that, though, for a couple of
> reasons:
>    2a. It seems fairly expensive, though I might be misjudging.

Compared to other things it's not that expensive. On my workstation, which is
slower on a per-core basis than CI, a whole tree pgindent --silent-diff takes
6.8s.  For That's doing things serially, it shouldn't be that hard to parallelize
the per-file processing.

For comparison, the current compiler warnings task takes 6-15min, depending on
the state of the ccache "database". Even when ccache is primed, running
cpluspluscheck or headerscheck is ~30s each. Adding a few more seconds for an
indentation check wouldn't be a problem.

>    2b. It's often pretty handy to submit patches that aren't fully
>    indent-clean; I have such a patch in flight right now at [1].
>
> 2b could be ameliorated by making the indent check be a separate
> test process that doesn't obscure the results of other testing.

The compiler warnings task already executes a number of tests even if prior
tests have failed (to be able to find compiler warnings in different compilers
at once). Adding pgindent cleanliness to that would be fairly simple.

I still think that one of the more important things we ought to do is to make
it trivial to check if code is correctly indented and reindent it for the
user.  I've posted a preliminary patch to add a 'indent-tree' target a few
months back, at
https://postgr.es/m/20230527184201.2zdorrijg2inqt6v%40alap3.anarazel.de

I've updated that patch, now it has
- indent-tree, reindents the entire tree
- indent-head, which pgindent --commit=HEAD
- indent-check, fails if the tree isn't correctly indented
- indent-diff, like indent-check, but also shows the diff

If we tought pgindent to emit the list of files it processes to a dependency
file, we could make it cheap to call indent-check repeatedly, by teaching
meson/ninja to not reinvoke it if the input files haven't changed.  Personally
that'd make it more bearable to script indentation checks to happen
frequently.

I'll look into writing a command to update typedefs.list with all the local
changes.

Greetings,

Andres Freund

Attachment

v2-0001-Add-meson-targets-to-run-pgindent.patch

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

19 October 2023, 00:56:35

Hi,

On 2023-10-18 12:15:51 -0700, Andres Freund wrote:
> I still think that one of the more important things we ought to do is to make
> it trivial to check if code is correctly indented and reindent it for the
> user.  I've posted a preliminary patch to add a 'indent-tree' target a few
> months back, at
> https://postgr.es/m/20230527184201.2zdorrijg2inqt6v%40alap3.anarazel.de
> 
> I've updated that patch, now it has
> - indent-tree, reindents the entire tree
> - indent-head, which pgindent --commit=HEAD
> - indent-check, fails if the tree isn't correctly indented
> - indent-diff, like indent-check, but also shows the diff
> 
> If we tought pgindent to emit the list of files it processes to a dependency
> file, we could make it cheap to call indent-check repeatedly, by teaching
> meson/ninja to not reinvoke it if the input files haven't changed.  Personally
> that'd make it more bearable to script indentation checks to happen
> frequently.
> 
> 
> I'll look into writing a command to update typedefs.list with all the local
> changes.

It turns out that updating the in-tree typedefs.list would be very noisy. On
my local linux system I get
 1 file changed, 422 insertions(+), 1 deletion(-)

On a mac mini I get
 1 file changed, 351 insertions(+), 1 deletion(-)

We could possibly address that by updating the in-tree typedefs.list a bit
more aggressively. Sure looks like the source systems are on the older side.

But in the attached patch I've implemented this slightly differently. If the
tooling to do so is available, the indent-* targets explained above,
use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
which is the combination of a src/tools/pgindent/typedefs.list.local generated
for the local binaries/libraries and the source tree
src/tools/pgindent/typedefs.list.

src/tools/pgindent/typedefs.list.local is generated in fragments, with one
fragment for each build target. That way the whole file doesn't have to be
regenerated all the time, which can save a good bit of time (althoug obviously
less when hacking on the backend).

This makes it quite quick to locally indent, without needing to manually
fiddle around with manually modifying typedefs.list or using a separate
typedefs.list.

In a third commit I added a 'nitpick' configure time option, defaulting to
off, which runs an indentation check. The failure mode of that currently isn't
very helpful though, as it just uses --silent-diff.

All of this currently is meson only, largely because I don't feel like
spending the time messing with the configure build, particularly before there
is any agreement on this being the thing to do.

Greetings,

Andres Freund

Attachment

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

19 October 2023, 01:29:37

Andres Freund <andres@anarazel.de> writes:
> It turns out that updating the in-tree typedefs.list would be very noisy. On
> my local linux system I get
>  1 file changed, 422 insertions(+), 1 deletion(-)
> On a mac mini I get
>  1 file changed, 351 insertions(+), 1 deletion(-)

That seems like it needs a considerably closer look.  What exactly
is getting added/deleted?

> We could possibly address that by updating the in-tree typedefs.list a bit
> more aggressively. Sure looks like the source systems are on the older side.

Really?  Per [1] we've currently got contributions from calliphoridae
which is Debian sid, crake which is Fedora 38, indri/sifaka which are
macOS Sonoma.  Were you really expecting something newer, and if so what?

> But in the attached patch I've implemented this slightly differently. If the
> tooling to do so is available, the indent-* targets explained above,
> use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
> which is the combination of a src/tools/pgindent/typedefs.list.local generated
> for the local binaries/libraries and the source tree
> src/tools/pgindent/typedefs.list.

Hmm ... that allows indenting your C files, but how do you get from that
to a non-noisy patch to commit to typedefs.list?

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?show_list

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

19 October 2023, 02:18:13

Hi,

On 2023-10-18 21:29:37 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > It turns out that updating the in-tree typedefs.list would be very noisy. On
> > my local linux system I get
> >  1 file changed, 422 insertions(+), 1 deletion(-)
> > On a mac mini I get
> >  1 file changed, 351 insertions(+), 1 deletion(-)
>
> That seems like it needs a considerably closer look.  What exactly
> is getting added/deleted?

Types from bison, openssl, libxml, libxslt, icu and libc, at least. If I
enable LLVM, there are even more.

(I think I figured out what's happening further down)

> > We could possibly address that by updating the in-tree typedefs.list a bit
> > more aggressively. Sure looks like the source systems are on the older side.
>
> Really?  Per [1] we've currently got contributions from calliphoridae
> which is Debian sid, crake which is Fedora 38, indri/sifaka which are
> macOS Sonoma.  Were you really expecting something newer, and if so what?

It's quite odd, I see plenty more types than those. I can't really explain why
they're not being picked up on those animals.

E.g. for me bison generated files contain typedefs like

typedef int_least8_t yytype_int8;
typedef signed char yytype_int8;
typedef yytype_int8 yy_state_t;

yet they don't show up in the buildfarm typedefs output.   It's not a thing of
the binary, I checked that the symbols are present.

I have a hard time parsing the buildfarm code for generating the typedefs
file, tbh.

<stare>

Ah, I see. If I interpret that correctly, the code filters out symbols it
doesn't find in in some .[chly] file in the *source* directory. This code is,
uh, barely readable and massively underdocumented.

I guess I need to reimplement that :/.  Don't immediately see how this could
be implemented for in-tree autoconf builds...

> > But in the attached patch I've implemented this slightly differently. If the
> > tooling to do so is available, the indent-* targets explained above,
> > use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
> > which is the combination of a src/tools/pgindent/typedefs.list.local generated
> > for the local binaries/libraries and the source tree
> > src/tools/pgindent/typedefs.list.
>
> Hmm ... that allows indenting your C files, but how do you get from that
> to a non-noisy patch to commit to typedefs.list?

It doesn't provide that on its own. Being able to painlessly indent the files
seems pretty worthwhile already. But clearly it'd much better if we can
automatically update typedefs.list.

Greetings,

Andres Freund

Re: run pgindent on a regular basis / scripted manner

From

Andres Freund

Date:

19 October 2023, 04:49:07

Hi,

On 2023-10-18 19:18:13 -0700, Andres Freund wrote:
> On 2023-10-18 21:29:37 -0400, Tom Lane wrote:
> Ah, I see. If I interpret that correctly, the code filters out symbols it
> doesn't find in in some .[chly] file in the *source* directory. This code is,
> uh, barely readable and massively underdocumented.
>
> I guess I need to reimplement that :/.  Don't immediately see how this could
> be implemented for in-tree autoconf builds...
>
> > > But in the attached patch I've implemented this slightly differently. If the
> > > tooling to do so is available, the indent-* targets explained above,
> > > use/depend on src/tools/pgindent/typedefs.list.merged (in the build dir),
> > > which is the combination of a src/tools/pgindent/typedefs.list.local generated
> > > for the local binaries/libraries and the source tree
> > > src/tools/pgindent/typedefs.list.
> >
> > Hmm ... that allows indenting your C files, but how do you get from that
> > to a non-noisy patch to commit to typedefs.list?
>
> It doesn't provide that on its own. Being able to painlessly indent the files
> seems pretty worthwhile already. But clearly it'd much better if we can
> automatically update typedefs.list.

With code for that added, things seem to work quite nicely.  I added similar
logic to the buildfarm code that builds a list of all tokens in the source
code.

With that, the in-tree typedefs.list can be updated with new tokens found
locally *and* typdefs that aren't used anymore can be removed from the in-tree
typedefs.list (detected by no matching tokens found in the source code).

The only case this approach can't handle is newly referenced typedefs in code
that isn't built locally - which I think isn't particularly common and seems
somewhat fundamental. In those cases typedefs.list still can be updated
manually (and the sorting will still be "fixed" if necessary).

The code is still in a somewhat rough shape and I'll not finish polishing it
today. I've attached the code anyway, don't be too rough :).

The changes from running "ninja update-typedefs indent-tree" on debian and
macos are attached as 0004 - the set of changes looks quite correct to me.

The buildfarm code filtered out a few typedefs manually:
  push(@badsyms, 'date', 'interval', 'timestamp', 'ANY');
but I don't really see why? Possibly that was needed with an older
pg_bsd_indent to prevent odd stuff?

Right now building a new unified typedefs.list and copying it to the source
tree are still separate targets, but that probably makes less sense now? Or
perhaps it should be copied to the source tree when reindenting?

I've only handled linux and macos in the typedefs gathering code. But the
remaining OSs should be "just a bit of work" [TM].

Greetings,

Andres Freund

Attachment

Re: run pgindent on a regular basis / scripted manner

From

Jeff Davis

Date:

24 October 2023, 00:50:52

On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:
> It would be good to learn how many of the committers out of the ones
> you listed that --enable-indent-checks would have saved from breaking
> koel.

I'd find that a useful option.

Regards,
    Jeff Davis

Re: run pgindent on a regular basis / scripted manner

From

Amit Langote

Date:

24 October 2023, 01:16:55

On Tue, Oct 24, 2023 at 9:51 Jeff Davis <pgsql@j-davis.com> wrote:

On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:
> It would be good to learn how many of the committers out of the ones
> you listed that --enable-indent-checks would have saved from breaking
> koel.

I'd find that a useful option.

+1. While I’ve made it part of routine to keep my local work pgindented since breaking Joel once, an option like this would still be useful.

Re: run pgindent on a regular basis / scripted manner

From

Michael Paquier

Date:

24 October 2023, 04:31:32

On Tue, Oct 24, 2023 at 10:16:55AM +0900, Amit Langote wrote:
> +1.  While I’ve made it part of routine to keep my local work pgindented
> since breaking Joel once, an option like this would still be useful.

I'd be OK with an option like that.  It is one of these things to type
once in a script, then forget about it.
--
Michael

Attachment

signature.asc

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

24 October 2023, 13:40:01

On 2023-10-17 Tu 09:52, Robert Haas wrote:
> On Tue, Oct 17, 2023 at 6:34 AM Jelte Fennema <postgres@jeltef.nl> wrote:
>> I think *it is* dead easy to comply. If you run the following commands
>> before committing/after rebasing, then koel should always be happy:
>>
>> src/tools/pgindent/pgindent src # works always but a bit slow
>> src/tools/pgindent/pgindent $(git diff --name-only --diff-filter=ACMR)
>> # much faster, but only works if you DID NOT change typedefs.list
> In isolation, that's true, but the list of mistakes that you can make
> while committing which will inconvenience everyone working on the
> project is very long. Another one that comes up frequently is
> forgetting to bump CATALOG_VERSION_NO, but you also need a good commit
> message, and good comments, and a good Discussion link in the commit
> message, and the right list of authors and reviewers, and to update
> the docs (with spaces, not tabs) and the Makefiles (with tabs, not
> spaces) and the meson stuff and, as if that weren't enough already,
> you actually need the code to work! And that includes not only working
> regularly but also with CLOBBER_CACHE_ALWAYS and debug_parallel_query
> and so on. It's very easy to miss something somewhere. I put a LOT of
> work into polishing my commits before I push them, and it's still not
> that uncommon that I screw something up.


Yes, there's a lot to look out for, and you're a damn sight better at it 
than I am. But we should try to automate the things that can be 
automated, even if that leaves many tasks that can't be. I have three 
things in my pre-commit hook: a check for catalog updates, a check for 
new typedefs, and an indent check. And every one of them has saved me 
from doing things I should not be doing. They aren't perfect but they 
are useful.

Slightly off topic, but apropos your message, maybe we should recommend 
a standard git commit template.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Bruce Momjian

Date:

24 October 2023, 13:46:34

On Tue, Oct 24, 2023 at 09:40:01AM -0400, Andrew Dunstan wrote:
> Slightly off topic, but apropos your message, maybe we should recommend a
> standard git commit template.

I use this and then automatically remove any sections that are empty.

---------------------------------------------------------------------------


|--- gitweb subject length limit ----------------|-email limit-|


Reported-by:

Diagnosed-by:

Bug:

Discussion:

Author:

Reviewed-by:

Tested-by:

Backpatch-through:

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

24 October 2023, 13:53:00

On 2023-10-18 We 05:07, Jelte Fennema wrote:
> I think --enable-indent-checks sounds like a good improvement to the
> status quo. But I'm not confident that it will help remove the cases
> where only a comment needs to be re-indented. Do commiters really
> always run check-world again when only changing a typo in a comment? I
> know I probably wouldn't (or at least not always).

Yeah. In fact I'm betting that a lot of the offending commits we've seen 
come into this category. You build, you check, then you do some final 
polish. That's where a pre-commit hook can save you.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Amit Kapila

Date:

25 October 2023, 09:42:33

On Tue, Oct 24, 2023 at 6:21 AM Jeff Davis <pgsql@j-davis.com> wrote:
>
> On Wed, 2023-10-18 at 22:34 +1300, David Rowley wrote:
> > It would be good to learn how many of the committers out of the ones
> > you listed that --enable-indent-checks would have saved from breaking
> > koel.
>
> I'd find that a useful option.
>

+1.

--
With Regards,
Amit Kapila.

Re: run pgindent on a regular basis / scripted manner

From

Étienne BERSAC

Date:

27 October 2023, 07:14:38

Hello,


> Yes, there's a lot to look out for, and you're a damn sight better at
> it
> than I am. But we should try to automate the things that can be
> automated, even if that leaves many tasks that can't be. I have three
> things in my pre-commit hook: a check for catalog updates, a check
> for
> new typedefs, and an indent check.

Could you share your configuration ? Could we provide more helper and
integration to help produce consistent code ?

For logfmt extension, I configured clang-formatd so that Emacs format
the buffer on save. Any editor running clangd will use this. This is
ease my mind about formatting. I need to investigate how to use
pgindent instead or at lease ensure clang-format produce same output as
pgindent.

https://gitlab.com/dalibo/logfmt/-/blob/0d808b368e649b23ac06ce2657354b67be398b21/.clang-format

Automate nitpicking in CI is good, but checking locally before sending
the patch will save way more round-trip.

Regards,
Étienne

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

27 October 2023, 12:14:52

On 2023-10-27 Fr 03:14, Étienne BERSAC wrote:
> Hello,
>
>
>> Yes, there's a lot to look out for, and you're a damn sight better at
>> it
>> than I am. But we should try to automate the things that can be
>> automated, even if that leaves many tasks that can't be. I have three
>> things in my pre-commit hook: a check for catalog updates, a check
>> for
>> new typedefs, and an indent check.
> Could you share your configuration ? Could we provide more helper and
> integration to help produce consistent code ?


Sure. pre-commit hook file attached. I'm sure this could be improved on.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Attachment

pre-commit

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

28 October 2023, 15:47:44

On 2023-08-12 Sa 11:57, Andrew Dunstan wrote:
>
>
> On 2023-08-11 Fr 19:17, Tom Lane wrote:
>> Peter Geoghegan<pg@bowt.ie>  writes:
>>> I'm starting to have doubts about this policy. There have now been
>>> quite a few follow-up "fixes" to indentation issues that koel
>>> complained about. None of these fixups have been included in
>>> .git-blame-ignore-revs. If things continue like this then "git blame"
>>> is bound to become much less usable over time.
>> FWIW, I'm much more optimistic than that.  I think what we're seeing
>> is just the predictable result of not all committers having yet
>> incorporated "pgindent it before committing" into their workflow.
>> The need for followup fixes should diminish as people start doing
>> that.  If you want to hurry things along, peer pressure on committers
>> who clearly aren't bothering is the solution.
>
>
> Yeah, part of the point of creating koel was to give committers a bit 
> of a nudge in that direction.
>
> With a git pre-commit hook it's pretty painless.
>
>

Based on recent experience, where a lot koel's recent complaints seem to 
be about comments, I'd like to suggest a modest adjustment.

First, we should provide a mode of pgindent that doesn't reflow 
comments. pg_bsd_indent has a flag for this (-nfcb), so this should be 
relatively simple.  Second, koel could use that mode, so that it 
wouldn't complain about comments it thinks need to be reflowed. Of 
course, we'd fix these up with our regular pgindent runs.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com

Re: run pgindent on a regular basis / scripted manner

From

Tom Lane

Date:

28 October 2023, 16:09:15

Andrew Dunstan <andrew@dunslane.net> writes:
> Based on recent experience, where a lot koel's recent complaints seem to 
> be about comments, I'd like to suggest a modest adjustment.

> First, we should provide a mode of pgindent that doesn't reflow 
> comments. pg_bsd_indent has a flag for this (-nfcb), so this should be 
> relatively simple.  Second, koel could use that mode, so that it 
> wouldn't complain about comments it thinks need to be reflowed. Of 
> course, we'd fix these up with our regular pgindent runs.

Seems like a bit of a kluge.  Maybe it's the right thing to do, but
I don't think we have enough data points yet to be confident that
it'd meaningfully reduce the number of breakages.

On a more abstract level: the point of trying to maintain indent
cleanliness is so that if you modify a file and then want to run
pgindent on your own changes, you don't get incidental changes
elsewhere in the file.  This solution would break that, so I'm
not sure it isn't throwing the baby out with the bathwater.

            regards, tom lane

Re: run pgindent on a regular basis / scripted manner

From

Andrew Dunstan

Date:

29 October 2023, 14:22:27

On 2023-10-28 Sa 12:09, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Based on recent experience, where a lot koel's recent complaints seem to
>> be about comments, I'd like to suggest a modest adjustment.
>> First, we should provide a mode of pgindent that doesn't reflow
>> comments. pg_bsd_indent has a flag for this (-nfcb), so this should be
>> relatively simple.  Second, koel could use that mode, so that it
>> wouldn't complain about comments it thinks need to be reflowed. Of
>> course, we'd fix these up with our regular pgindent runs.
> Seems like a bit of a kluge.  Maybe it's the right thing to do, but
> I don't think we have enough data points yet to be confident that
> it'd meaningfully reduce the number of breakages.
>
> On a more abstract level: the point of trying to maintain indent
> cleanliness is so that if you modify a file and then want to run
> pgindent on your own changes, you don't get incidental changes
> elsewhere in the file.  This solution would break that, so I'm
> not sure it isn't throwing the baby out with the bathwater.


Yeah, could be.


cheers


andrew.

--
Andrew Dunstan
EDB: https://www.enterprisedb.com