Re: [PATCH] Generate random dates/times in a specified range - Mailing list pgsql-hackers

From Damien Clochard
Subject Re: [PATCH] Generate random dates/times in a specified range
Date
Msg-id afd24bcbac05f72e009e8e800ec86582@dalibo.info
Whole thread Raw
In response to Re: [PATCH] Generate random dates/times in a specified range  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Le 10.07.2025 00:14, Tom Lane a écrit :
> Damien Clochard <damien@dalibo.info> writes:
>> So this adds 5 new variants of the random() function:
>
>>      random(min date, max date) returns date
>>      random(min time, max time) returns time
>>      random(min time, max time, zone text) returns timetz
>>      random(min timestamp, max timestamp) returns timestamp
>>      random(min timestamptz, max timestamptz) returns timestamptz
>
> I'm a little uncomfortable with this proposal, mainly because it
> overloads the random() function name to the point where I'm afraid
> of "ambiguous function" failures in SQL code that used to be fine.
>

Hi

Thanks for the feedback !

I agree with this, I overloaded the random() function because this is
what was done previously with `random(int,int)` and I did the same like
the good sheep that I am :)
but i'm fine with renaming this functions to daterandom, timerandom or
whatever....

> The traditional way of achieving these results would be something like
>
>     select now() + random() * interval '10 days';
>
> and I'm not convinced that the use-case is so large as to justify
> adding built-in forms of that.
>


 From my experience, when users are writing a set of masking rules, they
tend to anonymize the dates with "a random date between start_date and
end_date"

Which can be trasnlated like this

SELECT start_date+(random()*(end_date-start_date))::interval;

But when you have hundreds of masking rules, the meaning of this one is
not so clear.


Now with PostgreSQL 18, we can write

SELECT random(start_date::int, end_date::int)::date;


Which is more explicit, but we could extend that logic to:

SELECT daterandom(start_date,end_date);


I agree this is merely syntactic sugar for the developers, but I don't
see why it is ok to provide random(int,int) or random(numeric,numeric)
and why random(date,date) is not.

Regards,

--
Damien Clochard



pgsql-hackers by date:

Previous
From: Japin Li
Date:
Subject: Re: track needed attributes in plan nodes for executor use
Next
From: Zhang Mingli
Date:
Subject: [Question] Window Function Results without ORDER BY Clause