Thread: Unit tests for SLRU
Hi hackers, I learned from Peter [1] that SLRU test coverage leaves much to be desired and makes it difficult to refactor it. Here is a draft of a patch that tries to address it. I used src/test/modules/test_* modules as an example. While on it, I moved the Asserts() outside of SimpleLruInit(). It didn't seem to be a right place to check the IsUnderPostmaster value, and complicated the test implementation. I also renamed SlruSyncFileTag() to SlruSyncSegment() and changed its signature. I think it makes the interface easier to reason about. I noticed that SLRU uses int's for slotno, while FileTag->slotno is uint32. Can't this cause us any grief? Finally, I believe SimpleLruWritePage() name is confusing, because in fact it works with a slot, not a page. But I didn't change the name in my patch, yet. If time permits, please take a quick look at the patch and let me know if I'm moving the right direction. There will be more tests in the final version, but I would appreciate an early feedback. [1]: https://postgr.es/m/220fab30-dff0-b055-f803-4338219f1021%40enterprisedb.com -- Best regards, Aleksander Alekseev
Attachment
> On 31 Mar 2022, at 16:30, Aleksander Alekseev <aleksander@timescale.com> wrote: Thanks for hacking on increasing test coverage! > While on it, I moved the Asserts() outside of SimpleLruInit(). It didn't seem > to be a right place to check the IsUnderPostmaster value, and complicated the > test implementation. + * + * Returns false if the cache didn't exist before the call, true otherwise. */ -void +bool SimpleLruInit(SlruCtl ctl, const char *name, int nslots, int nlsns, If we're changing the API to make it testable, that should be noted in the comment and how the return value should be interpreted (and when it can be ignored). It also doesn't seem all that appealing that SimpleLruInit can return false on successful function invocation, it makes for a confusing API. -- Daniel Gustafsson https://vmware.com/
On Thu, Mar 31, 2022 at 05:30:41PM +0300, Aleksander Alekseev wrote: > I used src/test/modules/test_* modules as an example. > If time permits, please take a quick look at the patch and let me know > if I'm moving the right direction. There will be more tests in the > final version, but I would appreciate an early feedback. The default place for this kind of test is regress.c, with plain "make check" calling the regress.c function. src/test/modules is for things requiring an extension module or otherwise unable to run through regress.c.
пт, 1 апр. 2022 г. в 07:47, Noah Misch <noah@leadboat.com>:
On Thu, Mar 31, 2022 at 05:30:41PM +0300, Aleksander Alekseev wrote:
> I used src/test/modules/test_* modules as an example.
> If time permits, please take a quick look at the patch and let me know
> if I'm moving the right direction. There will be more tests in the
> final version, but I would appreciate an early feedback.
The default place for this kind of test is regress.c, with plain "make check"
calling the regress.c function. src/test/modules is for things requiring an
extension module or otherwise unable to run through regress.c.
+1 for placement c functions into regress.c if it's possible for the aim of simplification.
--
Hi Daniel,
> It also doesn't seem all that appealing that SimpleLruInit can
> return false on successful function invocation, it makes for a confusing API.
Agree. I think using an additional `bool *found` argument would be less confusing.
Noah, Pavel,
>> The default place for this kind of test is regress.c, with plain "make check"
>> calling the regress.c function. src/test/modules is for things requiring an
>> extension module or otherwise unable to run through regress.c.
>
> +1 for placement c functions into regress.c if it's possible for the aim of simplification.
Thanks for your feedback. I'm OK with placing the tests to regress.c, but I would like to double-check if this would be a right place for them.
I'm reading src/test/modules/README and it says:
"""
src/test/modules contains PostgreSQL extensions that are primarily or entirely
intended for testing PostgreSQL and/or to serve as example code. [...]
If you're adding new hooks or other functionality exposed as C-level API this
is where to add the tests for it.
"""
SLRU looks like a quite general-purpose container. I can imagine how someone may decide to use it in an extension. Wouldn't it be more logical to place it near:
src/test/modules/test_rbtree
src/test/modules/test_shm_mq
Again, I don't have a strong opinion here. If you insist, I will place the tests to regress.c.
--
Best regards,
Aleksander Alekseev
> It also doesn't seem all that appealing that SimpleLruInit can
> return false on successful function invocation, it makes for a confusing API.
Agree. I think using an additional `bool *found` argument would be less confusing.
Noah, Pavel,
>> The default place for this kind of test is regress.c, with plain "make check"
>> calling the regress.c function. src/test/modules is for things requiring an
>> extension module or otherwise unable to run through regress.c.
>
> +1 for placement c functions into regress.c if it's possible for the aim of simplification.
Thanks for your feedback. I'm OK with placing the tests to regress.c, but I would like to double-check if this would be a right place for them.
I'm reading src/test/modules/README and it says:
"""
src/test/modules contains PostgreSQL extensions that are primarily or entirely
intended for testing PostgreSQL and/or to serve as example code. [...]
If you're adding new hooks or other functionality exposed as C-level API this
is where to add the tests for it.
"""
SLRU looks like a quite general-purpose container. I can imagine how someone may decide to use it in an extension. Wouldn't it be more logical to place it near:
src/test/modules/test_rbtree
src/test/modules/test_shm_mq
... etc?
Again, I don't have a strong opinion here. If you insist, I will place the tests to regress.c.
--
Best regards,
Aleksander Alekseev
Again, I don't have a strong opinion here. If you insist, I will place the tests to regress.c.
It is up to committer to decide, but I think it would be good to place tests in regression.
In my opinion, many things from core may be used by extensions. And then it is up to extension authors to make relevant tests.
For me, it's enough to make a reasonable test coverage for SLRU without investing too much effort into generalization.
Best regards,
Maxim Orlov.
Hi hackers, >> Again, I don't have a strong opinion here. If you insist, I will place the >> tests to regress.c. > > It is up to committer to decide, but I think it would be good to place tests > in regression. In my opinion, many things from core may be used by extensions. > And then it is up to extension authors to make relevant tests. > > For me, it's enough to make a reasonable test coverage for SLRU without > investing too much effort into generalization. OK, here is an updated version of the patch. Changes comparing to v1: * Tests are moved to regress.c, as asked by the majority; * SimpleLruInit() returns void as before, per Daniel's feedback; * Most of the initial refactorings were reverted in order to keep the patch as small as possible, per Maxim's feedback. LCOV indicates 65% of code coverage in terms of lines of code. I'm currently working on improving this number. More feedback is always welcome! -- Best regards, Aleksander Alekseev
Attachment
Hi hackers, > OK, here is an updated version of the patch. Changes comparing to v1: > > * Tests are moved to regress.c, as asked by the majority; > * SimpleLruInit() returns void as before, per Daniel's feedback; > * Most of the initial refactorings were reverted in order to keep the patch > as small as possible, per Maxim's feedback. Here is version 3 of the patch. Test coverage is 92.3% of functions, 73.4% of lines of code. Not all error handling was covered, and I couldn't cover SimpleLruWaitIO(). The latest requires writing a concurrent test, which from what I know is not exactly what unit tests are for. We can make it public if we want to, but considering the simplicity of the function and the existence of many other tests I didn't find it necessary. I think the tests are about as good as they will ever get. -- Best regards, Aleksander Alekseev
Attachment
Hi hackers, > Here is version 3 of the patch. > [...] > I think the tests are about as good as they will ever get. Here is version 4. Same as v3, but with resolved conflicts against the current `master` branch. -- Best regards, Aleksander Alekseev
Attachment
Hi hackers,
> Here is version 3 of the patch.
> [...]
> I think the tests are about as good as they will ever get.
Here is version 4. Same as v3, but with resolved conflicts against the
current `master` branch.
Hi, Alexander!
The test seems good enough to be pushed.
Only one thing to note. Maybe it would be good not to copy-paste Assert after every call of SimpleLruInit, putting it into the wrapper function instead. So the test can call calling the inner function (without assert) and all other callers using the wrapper. Not sure about naming though. Maybe rename current SimpleLruInit -> SimpleLruInitInner and a new wrapper being under the old name (SimpleLruInit).
--
On Wed, Apr 13, 2022 at 03:51:30PM +0400, Pavel Borisov wrote: > Only one thing to note. Maybe it would be good not to copy-paste Assert > after every call of SimpleLruInit, putting it into the wrapper function > instead. So the test can call calling the inner function (without assert) > and all other callers using the wrapper. Not sure about naming though. > Maybe rename current SimpleLruInit -> SimpleLruInitInner and a new wrapper > being under the old name (SimpleLruInit). I have looked at what you have here.. This patch redesigns SimpleLruInit() so as the caller would be now in charge of checking that the SLRU has been created in the context of the postmaster (aka when initializing shared memory). While this should work as long as the amount of shared memory area is correctly sized in _PG_init() and that this area is initialized, then attached later like for autoprewarm.c (this counts for LWLockRegisterTranche(), for example), I am not really convinced that this is something that a patch aimed at extending testing coverage should redesign, especially with a routine as old as that. If you don't know what you are doing, it could easily lead to problems with external code. Note that I don't object to the addition of a new code path or a routine that would be able to create a SLRU on-the-fly with less restrictions, but I am not convinced that this we should change this behavior (well, there is a new argument that would force a recompilation). I am not sure what could be the use cases in favor of a SLRU created outside the _PG_init() phase, but perhaps you have more imagination than I do for such matters ;p FWIW, I'd like to think that the proper way of doing things for this test facility is to initialize a SLRU through a loading of _PG_init() when processing shared_preload_libraries, meaning that you'd better put this facility in src/test/modules/ with a custom configuration file with shared_preload_libraries set and a NO_INSTALLCHECK, without touching at SimpleLruInit(). -- Michael
Attachment
On Thu, Nov 10, 2022 at 04:01:02PM +0900, Michael Paquier wrote: > I have looked at what you have here.. The comment at the top of SimpleLruInit() for sync_handler has been fixed as of 5ca3645, and backpatched. Nice catch. -- Michael
Attachment
Hi Michael, Thanks for the review and also for committing 5ca3645. > This patch redesigns SimpleLruInit() [...] > I am not really convinced that this is something that a > patch aimed at extending testing coverage should redesign, especially > with a routine as old as that. > [...] you'd better put this facility in src/test/modules/ Fair enough. PFA the corrected patch v5. -- Best regards, Aleksander Alekseev
Attachment
On Thu, Nov 10, 2022 at 06:40:44PM +0300, Aleksander Alekseev wrote: > Fair enough. PFA the corrected patch v5. Is there a reason why you need a TAP test here? It is by design more expensive than pg_regress and it does not require --enable-tap-tests. See for example what we do for snapshot_too_old, commit_ts, worker_spi, etc., where each module uses a custom configuration file. Hmm. If I were to write that, I think that I would make the SLRU directory configurable as a PGC_POSTMASTER, at least, for the purpose of the exercise, and also split test_slru() into more user-callable functions so as it would be possible to mix more cross-check scenarios with the low-level C APIs if need be, with adapted input parameters. -- Michael
Attachment
On Fri, Nov 11, 2022 at 02:11:08PM +0900, Michael Paquier wrote: > Is there a reason why you need a TAP test here? It is by design more > expensive than pg_regress and it does not require --enable-tap-tests. > See for example what we do for snapshot_too_old, commit_ts, > worker_spi, etc., where each module uses a custom configuration file. I have put my hands on that, and I found that the tests were a bit overengineered. First, SimpleLruDoesPhysicalPageExist() is not that much necessary before and after each operation, like truncation or deletion, as the previous pages were doing equal tests. The hardcoded page number lacks a bit of flexibility and readability IMO, especially when combined with the number of pages per segments, as well. I have reworked that as per the attached, that provides basically the same coverage, going through a SQL interface for the whole thing. Like all the other tests of its kind, this does not use a TAP test, relying on a custom configuration file instead. This still needs some polishing, but the basics are here. What do you think? -- Michael
Attachment
Hi Michael, > I have reworked that as per the attached, that provides basically the > same coverage, going through a SQL interface for the whole thing. > Like all the other tests of its kind, this does not use a TAP test, > relying on a custom configuration file instead. This still needs some > polishing, but the basics are here. Many thanks for the updated patch. I didn't know one can run tests with a custom postgresql.conf without using TAP tests. > What do you think? It looks much better than before. I replaced strcpy() with strncpy() and pgindent'ed the code. Other than that to me it looks ready to be committed. -- Best regards, Aleksander Alekseev
Attachment
> On 15 Nov 2022, at 11:15, Aleksander Alekseev <aleksander@timescale.com> wrote: >> What do you think? > > It looks much better than before. I replaced strcpy() with strncpy() > and pgindent'ed the code. + /* write given data to the page */ + strncpy(TestSlruCtl->shared->page_buffer[slotno], data, BLCKSZ - 1); Would it make sense to instead use pg_pwrite to closer match the code being tested? > Other than that to me it looks ready to be committed. Agreed, reading over it nothing sticks out. -- Daniel Gustafsson https://vmware.com/
Hi, Alexander! > > I have reworked that as per the attached, that provides basically the > > same coverage, going through a SQL interface for the whole thing. > > Like all the other tests of its kind, this does not use a TAP test, > > relying on a custom configuration file instead. This still needs some > > polishing, but the basics are here. > > Many thanks for the updated patch. I didn't know one can run tests > with a custom postgresql.conf without using TAP tests. > > > What do you think? > > It looks much better than before. I replaced strcpy() with strncpy() > and pgindent'ed the code. Other than that to me it looks ready to be > committed. I've looked through the patch again. I agree it looks better and can be committed. Mark it as RfC now. Regards, Pavel Borisov
On Tue, Nov 15, 2022 at 11:39:20AM +0100, Daniel Gustafsson wrote: > + /* write given data to the page */ > + strncpy(TestSlruCtl->shared->page_buffer[slotno], data, BLCKSZ - 1); > > Would it make sense to instead use pg_pwrite to closer match the code being > tested? Hmm. I am not exactly sure what we'd gain with that, as it would imply that we need to write directly to the file using SlruFileName() after doing ourselves a OpenTransientFile(), duplicating what SlruPhysicalWritePage() does to create a fd to feed to a pg_pwrite()? Or I misunderstood your point. -- Michael
Attachment
On Tue, Nov 15, 2022 at 02:43:06PM +0400, Pavel Borisov wrote: > I've looked through the patch again. I agree it looks better and can > be committed. > Mark it as RfC now. Okay, applied, then. The SQL function names speak by themselves, even if some of them refer to pages but they act on segments, but that's easy enough to see the difference through the code when we do segment number compilations, as well. -- Michael