Thread: Mimicking Oracle SYSDATE
Hello, We are thinking of building our own version of Oracle's sysdate, in the form of PostgreSQL extension. Consider the behavior of sysdate for multiple inserts inside a function CREATE OR REPLACE FUNCTION fun2 RETURN number IS var1 number(10); var2 number (2); BEGIN insert into t1 select sysdate from dual connect by rownum<=70000; FOR var1 IN 0..1000000 LOOP SELECT 0 iNTO var2 FROM dual; END LOOP; insert into t1 select sysdate from dual connect by rownum<=70000; RETURN var1; END; The result of all first 70000 rows are same and result of all second 70000 row are same. But there is a difference between the values returned by sysdate between the 2 loops. PostgreSQL's statement_timestamp is not a good substitute for sysdate in this case as it returns same value for both loops. However if there are multiple inserts outside a function statement_timestamp(0) it seems to work the same as sysdate. Our implementation sysdate hence needs to figure out the context in which its called i.e. within a function or from outside. Also we have a similar need to figure out if its been called for multiple inserts (in a loop or simple insert statements one after another) as this affects behavior as well. So in short if we have a custom function how do we figure out 1) if its called from within another function 2) called within a loop 3) called multiple times outside a loop since this would alter what it returns Any idea how to implement this? regards Sameer
On Tue, Aug 19, 2014 at 07:41:00PM +0530, Sameer Thakur wrote: > We are thinking of building our own version of Oracle's sysdate, in > the form of PostgreSQL extension. I thought that was the point of the statement_timestamp() function? A -- Andrew Sullivan ajs@crankycanuck.ca
On 08/19/2014 07:11 AM, Sameer Thakur wrote: > Hello, > We are thinking of building our own version of Oracle's sysdate, in > the form of PostgreSQL extension. > Consider the behavior of sysdate for multiple inserts inside a function > > CREATE OR REPLACE FUNCTION fun2 > RETURN number > IS var1 number(10); var2 number (2); > BEGIN > insert into t1 select sysdate from dual connect by rownum<=70000; > FOR var1 IN 0..1000000 > LOOP > SELECT 0 iNTO var2 FROM dual; > END LOOP; > insert into t1 select sysdate from dual connect by rownum<=70000; > RETURN var1; > END; > > The result of all first 70000 rows are same and result of all second > 70000 row are same. But there is a difference between the values > returned by sysdate between the 2 loops. > > PostgreSQL's statement_timestamp is not a good substitute for sysdate > in this case > as it returns same value for both loops. > > However if there are multiple inserts outside a function > statement_timestamp(0) it seems to work the same as sysdate. > > Our implementation sysdate hence needs to figure out the context in > which its called > i.e. within a function or from outside. > > Also we have a similar need to figure out if its been called for > multiple inserts (in a loop or simple insert statements one after > another) as this affects behavior as well. > > So in short if we have a custom function how do we figure out > 1) if its called from within another function > 2) called within a loop > 3) called multiple times outside a loop > since this would alter what it returns > > Any idea how to implement this? Yes, clock_timestamp(): http://www.postgresql.org/docs/9.3/interactive/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT > regards > Sameer > > -- Adrian Klaver adrian.klaver@aklaver.com
Sameer Thakur <samthakur74@gmail.com> writes: > We are thinking of building our own version of Oracle's sysdate, in > the form of PostgreSQL extension. > Consider the behavior of sysdate for multiple inserts inside a function > CREATE OR REPLACE FUNCTION fun2 > RETURN number > IS var1 number(10); var2 number (2); > BEGIN > insert into t1 select sysdate from dual connect by rownum<=70000; > FOR var1 IN 0..1000000 > LOOP > SELECT 0 iNTO var2 FROM dual; > END LOOP; > insert into t1 select sysdate from dual connect by rownum<=70000; > RETURN var1; > END; > The result of all first 70000 rows are same and result of all second > 70000 row are same. But there is a difference between the values > returned by sysdate between the 2 loops. Do you really need to be bug-compatible with Oracle's SYSDATE at that level of detail? Especially seeing that SYSDATE is only precise to 1 second to start with? In most situations you could not tell the difference as to whether SYSDATE had advanced within a function or not; and, very likely, if it did visibly advance that would actually be a bug so far as behavior of the function was concerned, because it would be a case that hadn't been considered or tested. I suspect whoever thinks they have a requirement here hasn't actually thought very hard about it. If you think you do need bug-compatibility then the above is far from a precise specification, eg it doesn't address what should happen in sub-selects that are executed multiple times by the surrounding query. Anyway, the approach I'd think about using is to rely on the ability of C functions to cache query-lifespan data in fn_extra, ie (1) if fn_extra is NULL then read current timestamp and store it at *fn_extra (2) return *fn_extra This would give you one reading per query execution per textual occurrence of "sysdate()", which would satisfy the example above. A possible problem is that once in awhile, something like SELECT sysdate(), sysdate() would give two different answers. I'm not sure if that's possible with SYSDATE. [ thinks for a bit... ] Actually this might not work so well for sysdate() appearing in simple expressions in plpgsql. I think the expression evaluation tree is cached for the whole transaction in such cases. You'd have to test it. On the whole, much the best advice would be to explicitly read clock_timestamp() at the points where you'd like time to advance, eg convert the above to timestampvar := date_trunc('second', clock_timestamp()); insert into t1 select timestampvar from ... regards, tom lane
Hello Tom, >Do you really need to be bug-compatible with Oracle's SYSDATE at that >level of detail? We did a Oracle to PostgreSQL migration recently, and migrating sysdate was an issue. Figuring out whether to use clock_timestamp, timestamp(0), and cases in which both options are incorrect, took time. So we are trying to build an automated solution which can figure out context of sysdate and work it the same in PostgreSQL. Its easier to show the client that stuff works exactly the same before and after migration, bug or no bug. >Anyway, the approach I'd think about using.... Thank you, this is what we are looking for! >On the whole, much the best advice would be to explicitly read >clock_timestamp() at the points where you'd like time to advance, >eg convert the above to..... We understand. Appreciate the thoughts given, will keep you posted regards Sameer On Tue, Aug 19, 2014 at 8:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Sameer Thakur <samthakur74@gmail.com> writes: >> We are thinking of building our own version of Oracle's sysdate, in >> the form of PostgreSQL extension. >> Consider the behavior of sysdate for multiple inserts inside a function > >> CREATE OR REPLACE FUNCTION fun2 >> RETURN number >> IS var1 number(10); var2 number (2); >> BEGIN >> insert into t1 select sysdate from dual connect by rownum<=70000; >> FOR var1 IN 0..1000000 >> LOOP >> SELECT 0 iNTO var2 FROM dual; >> END LOOP; >> insert into t1 select sysdate from dual connect by rownum<=70000; >> RETURN var1; >> END; > >> The result of all first 70000 rows are same and result of all second >> 70000 row are same. But there is a difference between the values >> returned by sysdate between the 2 loops. > > Do you really need to be bug-compatible with Oracle's SYSDATE at that > level of detail? Especially seeing that SYSDATE is only precise to > 1 second to start with? In most situations you could not tell the > difference as to whether SYSDATE had advanced within a function or not; > and, very likely, if it did visibly advance that would actually be a bug > so far as behavior of the function was concerned, because it would be > a case that hadn't been considered or tested. I suspect whoever thinks > they have a requirement here hasn't actually thought very hard about it. > > If you think you do need bug-compatibility then the above is far from > a precise specification, eg it doesn't address what should happen in > sub-selects that are executed multiple times by the surrounding query. > > Anyway, the approach I'd think about using is to rely on the ability of > C functions to cache query-lifespan data in fn_extra, ie > (1) if fn_extra is NULL then read current timestamp and > store it at *fn_extra > (2) return *fn_extra > This would give you one reading per query execution per textual > occurrence of "sysdate()", which would satisfy the example above. > > A possible problem is that once in awhile, something like > SELECT sysdate(), sysdate() > would give two different answers. I'm not sure if that's possible > with SYSDATE. > > [ thinks for a bit... ] Actually this might not work so well for > sysdate() appearing in simple expressions in plpgsql. I think the > expression evaluation tree is cached for the whole transaction in > such cases. You'd have to test it. > > On the whole, much the best advice would be to explicitly read > clock_timestamp() at the points where you'd like time to advance, > eg convert the above to > > timestampvar := date_trunc('second', clock_timestamp()); > insert into t1 select timestampvar from ... > > regards, tom lane