Re: Default ordering option - Mailing list pgsql-general

From Cyril Champier
Subject Re: Default ordering option
Date
Msg-id CAJaA8Vcemtt4yhJTk-Wtvj94ZpPLziMKDfzv4S_Ma5dGnAfhPw@mail.gmail.com
Whole thread Raw
In response to Re: Default ordering option  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: Default ordering option  (Julien Rouhaud <rjuju123@gmail.com>)
Re: Default ordering option  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general
Adrian:

Are you really looking for a pseudo-random name?

No, the code I pasted was an existing production bug: the last_name should have been unique, so the selected patient would always be the same.
This should have been detected in tests, but since the order was "almost always the same", our test was green 99% of the time, so we discarded it as flaky.

Fuzzy testing could be an option, but this would go too far, as for Peter extension suggestion.
We have huge existing codebase with more than 10K tests, and I do not want to modify our whole testing strategy.

Meanwhile, I went for an ORM patch (ActiveRecord) and forbid usages that can workaround it.
If you are interested, here is a gist:

I wish there was a simple flag to activate in PostgreSQL to do that!

Thanks for your ideas!




On Thu, Jul 25, 2019 at 4:55 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote:
On 7/25/19 12:43 AM, Cyril Champier wrote:
> *Adrian*:
>
>     If order is not an issue in the production code why test for it in the
>     test code?
>
>
> In many cases, it would not be a problem in tests if we had an unordered
> array comparison helper.
> But in other cases, it is a production issue.
> In ruby ActiveRecord for exemple, you can do `Patient.find_by(last_name:
> 'champier')`,
> which translates to `SELECT  "patients".* FROM "patients" WHERE
> "patients"."last_name" = 'champier' LIMIT 1`.
> If last_name is not unique, the returned record will be random.

Are you really looking for a pseudo-random name?

If so would not(warning not a Ruby developer, so below is tentative):

Patient.where(["last_name = :last_name", {last_name:
"champier"}]).order('RANDOM()').first

work better?

If not why not use something that returns all possible matches?

>
> So yes, everything as to be randomized, because the sources are
> multiples and the consequences can vary to a dramatic production bug, a
> failed CI 1% of the time, or to a useless test assertion.

One way I can think of doing this is write a script that walks through
your tables in the test db and does an UPDATE across the rows. It is
going to add time to your tests, but then I believe that is going to be
the case for anything you do. Or you could look at something I have
never tried, fuzzy testing. As a starting point:

https://www.guru99.com/fuzz-testing.html

Maybe other folks have suggestions on tools you could use for fuzzy
testing.


>
>
> *Peter*:
>
>     It might be an interesting exercise to implement this as a post-parsing
>     hook.
>
> I known nothing about that, but that sounds interesting, do you have any
> documentation pointer to help me implement that?
>
>
>
> On Wed, Jul 24, 2019 at 10:36 PM Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com
> <mailto:peter.eisentraut@2ndquadrant.com>> wrote:
>
>     On 2019-07-23 17:43, Cyril Champier wrote:
>      > In this documentation
>      > <https://www.postgresql.org/docs/9.1/queries-order.html>, it is said:
>      >
>      >     If sorting is not chosen, the rows will be returned in an
>      >     unspecified order. The actual order in that case will depend
>     on the
>      >     scan and join plan types and the order on disk, but it must
>     not be
>      >     relied on.
>      >
>      >
>      > I would like to know if there is any way to change that to have a
>     "real"
>      > random behaviour.
>
>     It might be an interesting exercise to implement this as a post-parsing
>     hook.
>
>     --
>     Peter Eisentraut http://www.2ndQuadrant.com/
>     PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


--
Adrian Klaver
adrian.klaver@aklaver.com

pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Too slow to create new schema and their tables, functions,triggers.
Next
From: Luca Ferrari
Date:
Subject: Re: Too slow to create new schema and their tables, functions, triggers.