Thread: Unit tests for SLRU

Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi hackers,

I learned from Peter [1] that SLRU test coverage leaves much to be
desired and makes it difficult to refactor it. Here is a draft of a
patch that tries to address it.

I used src/test/modules/test_* modules as an example. While on it, I
moved the Asserts() outside of SimpleLruInit(). It didn't seem to be a
right place to check the IsUnderPostmaster value, and complicated the
test implementation. I also renamed SlruSyncFileTag() to
SlruSyncSegment() and changed its signature. I think it makes the
interface easier to reason about.

I noticed that SLRU uses int's for slotno, while FileTag->slotno is
uint32. Can't this cause us any grief? Finally, I believe
SimpleLruWritePage() name is confusing, because in fact it works with
a slot, not a page. But I didn't change the name in my patch, yet.

If time permits, please take a quick look at the patch and let me know
if I'm moving the right direction. There will be more tests in the
final version, but I would appreciate an early feedback.

[1]: https://postgr.es/m/220fab30-dff0-b055-f803-4338219f1021%40enterprisedb.com

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Daniel Gustafsson
Date:
> On 31 Mar 2022, at 16:30, Aleksander Alekseev <aleksander@timescale.com> wrote:

Thanks for hacking on increasing test coverage!

> While on it, I moved the Asserts() outside of SimpleLruInit().  It didn't seem
> to be a right place to check the IsUnderPostmaster value, and complicated the
> test implementation.

+ *
+ * Returns false if the cache didn't exist before the call, true otherwise.
  */
-void
+bool
 SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns,

If we're changing the API to make it testable, that should be noted in the
comment and how the return value should be interpreted (and when it can be
ignored).  It also doesn't seem all that appealing that SimpleLruInit can
return false on successful function invocation, it makes for a confusing API.

--
Daniel Gustafsson        https://vmware.com/




Re: Unit tests for SLRU

From
Noah Misch
Date:
On Thu, Mar 31, 2022 at 05:30:41PM +0300, Aleksander Alekseev wrote:
> I used src/test/modules/test_* modules as an example.

> If time permits, please take a quick look at the patch and let me know
> if I'm moving the right direction. There will be more tests in the
> final version, but I would appreciate an early feedback.

The default place for this kind of test is regress.c, with plain "make check"
calling the regress.c function.  src/test/modules is for things requiring an
extension module or otherwise unable to run through regress.c.



Re: Unit tests for SLRU

From
Pavel Borisov
Date:
пт, 1 апр. 2022 г. в 07:47, Noah Misch <noah@leadboat.com>:
On Thu, Mar 31, 2022 at 05:30:41PM +0300, Aleksander Alekseev wrote:
> I used src/test/modules/test_* modules as an example.

> If time permits, please take a quick look at the patch and let me know
> if I'm moving the right direction. There will be more tests in the
> final version, but I would appreciate an early feedback.

The default place for this kind of test is regress.c, with plain "make check"
calling the regress.c function.  src/test/modules is for things requiring an
extension module or otherwise unable to run through regress.c.
+1 for placement c functions into regress.c if it's possible for the aim of simplification.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi Daniel,

> It also doesn't seem all that appealing that SimpleLruInit can
> return false on successful function invocation, it makes for a confusing API.

Agree. I think using an additional `bool *found` argument would be less confusing.

Noah, Pavel,

>> The default place for this kind of test is regress.c, with plain "make check"
>> calling the regress.c function.  src/test/modules is for things requiring an
>> extension module or otherwise unable to run through regress.c.
>
> +1 for placement c functions into regress.c if it's possible for the aim of simplification.

Thanks for your feedback. I'm OK with placing the tests to regress.c, but I would like to double-check if this would be a right place for them.

I'm reading src/test/modules/README and it says:

"""
src/test/modules contains PostgreSQL extensions that are primarily or entirely
intended for testing PostgreSQL and/or to serve as example code. [...]

If you're adding new hooks or other functionality exposed as C-level API this
is where to add the tests for it.
"""

SLRU looks like a quite general-purpose container. I can imagine how someone may decide to use it in an extension. Wouldn't it be more logical to place it near:

src/test/modules/test_rbtree
src/test/modules/test_shm_mq

... etc?

Again, I don't have a strong opinion here. If you insist, I will place the tests to regress.c.
 
--
Best regards,
Aleksander Alekseev

Re: Unit tests for SLRU

From
Maxim Orlov
Date:
Again, I don't have a strong opinion here. If you insist, I will place the tests to regress.c.

It is up to committer to decide, but I think it would be good to place tests in regression.
In my opinion, many things from core may be used by extensions. And then it is up to extension authors to make relevant tests.

For me, it's enough to make a reasonable test coverage for SLRU without investing too much effort into generalization. 

--
Best regards,
Maxim Orlov.

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi hackers,

>> Again, I don't have a strong opinion here. If you insist, I will place the
>> tests to regress.c.
>
> It is up to committer to decide, but I think it would be good to place tests
> in regression. In my opinion, many things from core may be used by extensions.
> And then it is up to extension authors to make relevant tests.
>
> For me, it's enough to make a reasonable test coverage for SLRU without
> investing too much effort into generalization.

OK, here is an updated version of the patch. Changes comparing to v1:

* Tests are moved to regress.c, as asked by the majority;
* SimpleLruInit() returns void as before, per Daniel's feedback;
* Most of the initial refactorings were reverted in order to keep the patch
  as small as possible, per Maxim's feedback.

LCOV indicates 65% of code coverage in terms of lines of code. I'm currently
working on improving this number.

More feedback is always welcome!

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi hackers,

> OK, here is an updated version of the patch. Changes comparing to v1:
>
> * Tests are moved to regress.c, as asked by the majority;
> * SimpleLruInit() returns void as before, per Daniel's feedback;
> * Most of the initial refactorings were reverted in order to keep the patch
>   as small as possible, per Maxim's feedback.

Here is version 3 of the patch.

Test coverage is 92.3% of functions, 73.4% of lines of code. Not all error
handling was covered, and I couldn't cover SimpleLruWaitIO(). The latest
requires writing a concurrent test, which from what I know is not exactly what
unit tests are for. We can make it public if we want to, but considering the
simplicity of the function and the existence of many other tests I didn't find
it necessary.

I think the tests are about as good as they will ever get.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi hackers,

> Here is version 3 of the patch.
> [...]
> I think the tests are about as good as they will ever get.

Here is version 4. Same as v3, but with resolved conflicts against the
current `master` branch.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Pavel Borisov
Date:
Hi hackers,

> Here is version 3 of the patch.
> [...]
> I think the tests are about as good as they will ever get.

Here is version 4. Same as v3, but with resolved conflicts against the
current `master` branch.
Hi, Alexander!
The test seems good enough to be pushed.

Only one thing to note. Maybe it would be good not to copy-paste Assert after every call of SimpleLruInit, putting it into the wrapper function instead. So the test can call calling the inner function (without assert) and all other callers using the wrapper. Not sure about naming though. Maybe  rename current SimpleLruInit -> SimpleLruInitInner and a new wrapper being under the old name (SimpleLruInit).

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com

Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Wed, Apr 13, 2022 at 03:51:30PM +0400, Pavel Borisov wrote:
> Only one thing to note. Maybe it would be good not to copy-paste Assert
> after every call of SimpleLruInit, putting it into the wrapper function
> instead. So the test can call calling the inner function (without assert)
> and all other callers using the wrapper. Not sure about naming though.
> Maybe rename current SimpleLruInit -> SimpleLruInitInner and a new wrapper
> being under the old name (SimpleLruInit).

I have looked at what you have here..

This patch redesigns SimpleLruInit() so as the caller would be now in
charge of checking that the SLRU has been created in the context of
the postmaster (aka when initializing shared memory).  While this
should work as long as the amount of shared memory area is correctly
sized in _PG_init() and that this area is initialized, then attached
later like for autoprewarm.c (this counts for LWLockRegisterTranche(),
for example), I am not really convinced that this is something that a
patch aimed at extending testing coverage should redesign, especially
with a routine as old as that.  If you don't know what you are doing,
it could easily lead to problems with external code.  Note that I
don't object to the addition of a new code path or a routine that
would be able to create a SLRU on-the-fly with less restrictions, but
I am not convinced that this we should change this behavior (well,
there is a new argument that would force a recompilation).  I am not
sure what could be the use cases in favor of a SLRU created outside
the _PG_init() phase, but perhaps you have more imagination than I do
for such matters ;p

FWIW, I'd like to think that the proper way of doing things for this
test facility is to initialize a SLRU through a loading of _PG_init()
when processing shared_preload_libraries, meaning that you'd better
put this facility in src/test/modules/ with a custom configuration
file with shared_preload_libraries set and a NO_INSTALLCHECK, without
touching at SimpleLruInit().
--
Michael

Attachment

Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Thu, Nov 10, 2022 at 04:01:02PM +0900, Michael Paquier wrote:
> I have looked at what you have here..

The comment at the top of SimpleLruInit() for sync_handler has been
fixed as of 5ca3645, and backpatched.  Nice catch.
--
Michael

Attachment

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi Michael,

Thanks for the review and also for committing 5ca3645.

> This patch redesigns SimpleLruInit() [...]
> I am not really convinced that this is something that a
> patch aimed at extending testing coverage should redesign, especially
> with a routine as old as that.
> [...] you'd better put this facility in src/test/modules/

Fair enough. PFA the corrected patch v5.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Thu, Nov 10, 2022 at 06:40:44PM +0300, Aleksander Alekseev wrote:
> Fair enough. PFA the corrected patch v5.

Is there a reason why you need a TAP test here?  It is by design more
expensive than pg_regress and it does not require --enable-tap-tests.
See for example what we do for snapshot_too_old, commit_ts,
worker_spi, etc., where each module uses a custom configuration file.

Hmm.  If I were to write that, I think that I would make the SLRU
directory configurable as a PGC_POSTMASTER, at least, for the purpose
of the exercise, and also split test_slru() into more user-callable
functions so as it would be possible to mix more cross-check scenarios
with the low-level C APIs if need be, with adapted input parameters.
--
Michael

Attachment

Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Fri, Nov 11, 2022 at 02:11:08PM +0900, Michael Paquier wrote:
> Is there a reason why you need a TAP test here?  It is by design more
> expensive than pg_regress and it does not require --enable-tap-tests.
> See for example what we do for snapshot_too_old, commit_ts,
> worker_spi, etc., where each module uses a custom configuration file.

I have put my hands on that, and I found that the tests were a bit
overengineered.  First, SimpleLruDoesPhysicalPageExist() is not that
much necessary before and after each operation, like truncation or
deletion, as the previous pages were doing equal tests.  The hardcoded
page number lacks a bit of flexibility and readability IMO, especially
when combined with the number of pages per segments, as well.

I have reworked that as per the attached, that provides basically the
same coverage, going through a SQL interface for the whole thing.
Like all the other tests of its kind, this does not use a TAP test,
relying on a custom configuration file instead.  This still needs some
polishing, but the basics are here.

What do you think?
--
Michael

Attachment

Re: Unit tests for SLRU

From
Aleksander Alekseev
Date:
Hi Michael,

> I have reworked that as per the attached, that provides basically the
> same coverage, going through a SQL interface for the whole thing.
> Like all the other tests of its kind, this does not use a TAP test,
> relying on a custom configuration file instead.  This still needs some
> polishing, but the basics are here.

Many thanks for the updated patch. I didn't know one can run tests
with a custom postgresql.conf without using TAP tests.

> What do you think?

It looks much better than before. I replaced strcpy() with strncpy()
and pgindent'ed the code. Other than that to me it looks ready to be
committed.

-- 
Best regards,
Aleksander Alekseev

Attachment

Re: Unit tests for SLRU

From
Daniel Gustafsson
Date:
> On 15 Nov 2022, at 11:15, Aleksander Alekseev <aleksander@timescale.com> wrote:

>> What do you think?
>
> It looks much better than before. I replaced strcpy() with strncpy()
> and pgindent'ed the code.

+    /* write given data to the page */
+    strncpy(TestSlruCtl->shared->page_buffer[slotno], data, BLCKSZ - 1);

Would it make sense to instead use pg_pwrite to closer match the code being
tested?

> Other than that to me it looks ready to be committed.

Agreed, reading over it nothing sticks out.

--
Daniel Gustafsson        https://vmware.com/




Re: Unit tests for SLRU

From
Pavel Borisov
Date:
Hi, Alexander!
> > I have reworked that as per the attached, that provides basically the
> > same coverage, going through a SQL interface for the whole thing.
> > Like all the other tests of its kind, this does not use a TAP test,
> > relying on a custom configuration file instead.  This still needs some
> > polishing, but the basics are here.
>
> Many thanks for the updated patch. I didn't know one can run tests
> with a custom postgresql.conf without using TAP tests.
>
> > What do you think?
>
> It looks much better than before. I replaced strcpy() with strncpy()
> and pgindent'ed the code. Other than that to me it looks ready to be
> committed.
I've looked through the patch again. I agree it looks better and can
be committed.
Mark it as RfC now.

Regards,
Pavel Borisov



Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Tue, Nov 15, 2022 at 11:39:20AM +0100, Daniel Gustafsson wrote:
> +    /* write given data to the page */
> +    strncpy(TestSlruCtl->shared->page_buffer[slotno], data, BLCKSZ - 1);
>
> Would it make sense to instead use pg_pwrite to closer match the code being
> tested?

Hmm.  I am not exactly sure what we'd gain with that, as it would
imply that we need to write directly to the file using SlruFileName()
after doing ourselves a OpenTransientFile(), duplicating what
SlruPhysicalWritePage() does to create a fd to feed to a pg_pwrite()?
Or I misunderstood your point.
--
Michael

Attachment

Re: Unit tests for SLRU

From
Michael Paquier
Date:
On Tue, Nov 15, 2022 at 02:43:06PM +0400, Pavel Borisov wrote:
> I've looked through the patch again. I agree it looks better and can
> be committed.
> Mark it as RfC now.

Okay, applied, then.  The SQL function names speak by themselves, even
if some of them refer to pages but they act on segments, but that's
easy enough to see the difference through the code when we do segment
number compilations, as well.
--
Michael

Attachment