Re: RFC: adding pytest as a supported test framework - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: RFC: adding pytest as a supported test framework
Date
Msg-id CAOYmi+ks2X+35EOi6GS2J304T5AW53EoAz+s5vkwrGAaPixS3w@mail.gmail.com
In response to Re: RFC: adding pytest as a supported test framework  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: RFC: adding pytest as a supported test framework
List pgsql-hackers

On Thu, Jun 13, 2024 at 1:27 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jun 13, 2024 at 4:06 PM Jacob Champion
> <jacob.champion@enterprisedb.com> wrote:
> > There was a four-step plan sketch at the end of that email, titled "A
> > Plan". That was not intended to be "the final detailed plan", because
> > I was soliciting feedback on the exact pieces people wanted to try to
> > implement first, and I am not the God Emperor of Pytest. But it was
> > definitely A Plan.
>
> Well, OK, now I feel a bit dumb. I guess I missed that or forgot about it.

No worries. It's a really long thread. :D

But also: do you have opinions on what to fill in as steps 2
(something we have no ability to test today) and 3 (something we do
test today, but hate)?

My vote for step 2 is "client and server separation", perhaps by
testing libpq fallback against a server that claims support for
different build-time options. I don't want to have a vote in step 3,
because part of that step is proving that this framework can provide
value for a part of the project I don't really know much about.
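
To make the "client and server separation" idea concrete, here is a minimal sketch of a fake server that only knows how to refuse SSL, as a build without SSL support would. It is an illustration and not anything proposed in the thread: the helper names (`start_fake_server`, `probe_ssl`) are hypothetical, and the tiny client stands in for libpq's first message. The only protocol facts it relies on are that an SSLRequest is an 8-byte message carrying the code 80877103 and that the server answers with a single `S` or `N` byte.

```python
import socket
import struct
import threading

# SSLRequest code from the frontend/backend protocol: (1234 << 16) | 5679
SSL_REQUEST_CODE = 80877103


def recv_exact(conn, n):
    """Read exactly n bytes, or raise if the peer closes first."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf


def serve_one_ssl_refusal(listener):
    """Accept one connection, read an SSLRequest, and answer 'N' (no SSL)."""
    conn, _ = listener.accept()
    with conn:
        length, code = struct.unpack("!ii", recv_exact(conn, 8))
        if length == 8 and code == SSL_REQUEST_CODE:
            conn.sendall(b"N")  # pretend this server was built without SSL


def start_fake_server():
    """Listen on an ephemeral localhost port; hypothetical test helper."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]

    def run():
        with listener:
            serve_one_ssl_refusal(listener)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return port, thread


def probe_ssl(port):
    """Stand-in for the client's opening move: send SSLRequest, read reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as s:
        s.sendall(struct.pack("!ii", 8, SSL_REQUEST_CODE))
        return s.recv(1)
```

In a real test the probe would be replaced by an actual libpq connection attempt pointed at the fake port, and the assertion would be about how the client falls back.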

> > I think this is way too much expectation for a v1 patch. If you were
> > committing this by yourself, would you agree to develop the entirety
> > of PostgreSQL::Test in a single commit, without the benefit of the
> > buildfarm checking you as you went, and other people trying to write
> > tests with it?
>
> Eh... I'm confused. PostgreSQL::Test::Cluster is more than half of the
> code in that directory, and without it you wouldn't be able to write
> most of the TAP tests that we have today.

Well, in my defense, you said "PostgreSQL::Test::whatever", which I
assumed meant all of it, including Kerberos.pm and SSL::Server and
AdjustUpgrade and... That seemed like way too much to me (and still
does!), but if that's not what you were arguing then never mind.

Yes, Cluster.pm seems like a pretty natural thing to ask for. I
imagine it's one of the first things we're going to need. And yet...

> You would really want to
> call this project done without having an equivalent?

...I have this really weird sneaking suspicion that, if a replacement
of the end-to-end Perl acceptance tests can be made an explicit
anti-goal in the short term, we might not necessarily need an
"equivalent" for v1. I realize that seems bizarre, because of course
we need a way to start the server if we want to test the server. But
frankly, starting a server is Pretty Easy (tm), and Cluster.pm has to
do a lot more than that because IMO it's designed for a variety of
acceptance-oriented tasks. 3000+ lines!
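
As a rough sketch of how small "just start a server" can be, assuming `initdb` and `pg_ctl` are on the PATH and that trust authentication is acceptable for a throwaway cluster: the function names below are hypothetical and this is not a proposed replacement for Cluster.pm. The `run` parameter exists only so the command sequence can be checked without a Postgres installation.

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path


def initdb_argv(datadir):
    # -A trust: no password setup; --no-sync: skip fsync for throwaway data
    return ["initdb", "-D", str(datadir), "-A", "trust", "--no-sync"]


def start_argv(datadir, port):
    # -w waits for startup to finish; -o passes options to the postgres binary
    logfile = Path(datadir) / "server.log"
    return ["pg_ctl", "-D", str(datadir), "-l", str(logfile),
            "-w", "-o", f"-p {port}", "start"]


def stop_argv(datadir):
    # -m fast: disconnect clients and shut down without waiting for them
    return ["pg_ctl", "-D", str(datadir), "-m", "fast", "-w", "stop"]


@contextmanager
def running_server(datadir, port, run=subprocess.run):
    """Create a cluster, start it, and always attempt to stop it afterwards."""
    run(initdb_argv(datadir), check=True)
    run(start_argv(datadir, port), check=True)
    try:
        yield port
    finally:
        run(stop_argv(datadir), check=True)
```

Wrapped in a pytest fixture that hands it a temporary directory, something of this size would cover the "start a server" part; everything beyond that is what the remaining lines of Cluster.pm are for.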

If there's widespread interest (as opposed to being just my own
personal fever dream) in testing Postgres components as individual
pieces rather than setting up the world, then I wonder if the
functionality from Cluster.pm couldn't be pared down a lot. Maybe you
don't need a centralized ->psql() or a ->command_ok() helper, because
you're usually not trying to test psql and other utilities during your
server-only tests.

Maybe you can just stand up a standby without a primary and drive it
via mock replication. Do you need quite as many "poll and wait for
some asynchronous result" type things when you're not waiting for a
result to cascade through a multinode system? Does something like (for
example) ->pg_recvlogical_upto() really have to be implemented in our
"core" fixtures or can it be done more easily by whoever needs that in
the future? Maybe You Ain't Gonna Need It.
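
For a sense of scale, a generic "poll and wait" helper of the kind mentioned above fits in a dozen lines, which is part of why it could plausibly be written by whichever test first needs it. This is a sketch with hypothetical names, similar in spirit to the polling helpers in Cluster.pm but not a port of any of them; the `clock` and `sleep` parameters are there only to make it easy to test.

```python
import time


def poll_until(check, timeout=10.0, interval=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Call check() until it returns a truthy value or the timeout expires."""
    deadline = clock() + timeout
    while True:
        result = check()
        if result:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        sleep(interval)
```

A test waiting on a single component might call `poll_until(lambda: some_condition())` and nothing more; the multinode variants only become necessary once a result has to cascade through several servers.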

If (he said, atop his naive idealistic soapbox) we can find a way to
put off writing utilities until we write the tests that need them,
without procrastinating, and without putting all of the negative
externalities of that approach on the committers with low-quality
copy-paste proliferation, and I'd like a pony while I'm at it, then I
think the result might end up being pretty coherent and maintainable.
Then not having "at least as much in-tree support for writing tests as
we have today" for the very first commit would be a feature and not a
bug.

Now, maybe if the collective ability to do that existed, we would have
done it already with Perl, but I do actually wonder whether that's
true or not.

Or, maybe, the very first suggestion for Step 3 will be something that
needs absolutely everything in Cluster.pm. So be it; I can live
without a pony.

> You would really want to
> call this project done without having an equivalent?

(A cop-out but not-really-cop-out alternative answer to this question
is that this project is not going to be "done" any more than Postgres
will ever be "done", and that's part of what I'm arguing should be
considered natural and okay. I understand that it is easier for me to
take that stance when I am not on the hook for maintaining it, so I
don't expect us to necessarily see eye-to-eye on it.)

> > Can you elaborate on why that's not an okay outcome?
>
> Well, you just argued that it should be an okay outcome, and I do sort
> of see your point, but I refer you to my earlier reply about the
> difficulty of getting anything reverted in the culture as it stands.

The earlier reply was:

> As a community, we're really bad at this. Once something gets
> committed, getting a consensus to revert it is really hard, especially
> if a major release has happened meanwhile, but most of the time even
> if it hasn't. It might be a little easier in this case, since after
> all it's not a directly user-visible feature. But historically what
> happens if somebody says "hey, there are six unfixed problems with
> this feature!" is that everybody says "well, you're free to fix the
> problems if you want, but you're not allowed to revert the feature."
> And that is *exactly* how we end up with stuff like the current TAP
> test framework: ripping that out would mean removing all the TAP tests
> that depend on it, and that wouldn't have achieved consensus two
> months after the feature went in, let alone today.

Well... I don't know how to fix that. Here's a draft proposal after a
few minutes of thought, which may need to be discarded after a few
more minutes of thought.

If there's agreement that New Tests -- not necessarily written in
Python, but I selfishly hope they are -- exist on a probationary
status, then maybe part of that is going to have to be an agreement:
New features have to be able to have some minimum maintainability
level *on the basis of the Perl tests only*, while the probationary
period is in effect. It can't be the equivalent maintainability level,
because that's either proof that the New Tests are giving us nothing,
or proof that everyone is being forced to implement the exact same
tests in both Perl and New Test. Neither is good.

Since we're currently focused on end-to-end acceptance with Perl, that
is probably a lower bar than what we'd maybe prefer, but I think that
is the bar we have right now. It also exists as a forcing function to
make sure that the additional tests are adding value over what we get
with the Perl, which may paradoxically increase the chances of New
Test success. (I can't tell if this is magical thinking or not.)

So if a committer wouldn't want responsibility for the feature with
the new tests deleted, they don't commit. Maybe that's unrealistic
and too painful. It does increase the review requirements of
committers quite a bit. It might disqualify my OAuth work (which is
maybe evidence in its favor?). Maybe it increases the foot-in-the-door
effect too much. Maybe there would have to be some trust-building
where right now there is not? Not sure.

> Now, it has been suggested to me by at least one other person involved
> with the project that we need to be more open to the kind of thing
> that you propose here: add experimental things and take them out if it
> doesn't work out. I can definitely understand that this might be a
> culturally better approach than what we currently do. So maybe that's
> the way forward, but it is hard (at least for me) to get past the fear
> of being the one left holding the bag, and I suspect that other
> committers have similar fears. What exactly we should do about that,
> I'm not sure.

Yeah.

> I have zero desire to write tests in Python. If I could convince
> everyone here to spend their time and energy improving the stuff we
> have in Perl instead of introducing a whole new test framework, I
> would 100% do that. But I'm pretty sure that I can't, and I think the
> project needs to pick from among realistic options rather than
> theoretical ones. Said differently, it's not all about me.

Then, for what it's worth: I really do want to make sure that your
life, and the lives of all the other committers, do not get
significantly harder if this goes in. I don't think they will, but if
I'm wrong, I want it to come back out, and then we can regroup or
pivot entirely and move forward together.

--Jacob


