Thread: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

From
"Ryo Matsumura (Fujitsu)"
Date:
Hi hackers,

I detected two problems about ECPG.
I show my opinion. Please comment.
If it's correct, I will prepare a patch.
Thank you.

1.
It is indefinite what PGTYPEStimestamp_from_asc() returns in error.
The following is written in document(36.6.8. Special Constants of pgtypeslib):
  A value of type timestamp representing an invalid time stamp.
  This is returned by the function PGTYPEStimestamp_from_asc on parse error.
  Note that due to the internal representation of the timestamp data type,
  PGTYPESInvalidTimestamp is also a valid timestamp at the same time.
  It is set to 1899-12-31 23:59:59. In order to detect errors,
  make sure that your application does not only test for PGTYPESInvalidTimestamp
  but also for errno != 0 after each call to PGTYPEStimestamp_from_asc.

However, PGTYPESInvalidTimestamp is not defined anywhere.
It no loger exists at REL6_2 that is the oldest branch.
At current implementation, PGTYPEStimestamp_from_asc returns -1.

So we must fix the document so that users write as follows:

  r = PGTYPEStimestamp_from_asc(.....);
  if (r < 0 || errno != 0)
    goto error;


2.
Regression test of pgtypelib is not robust (maybe incorrect).
Our test failed although there is no bug actually.

I think block2 and block3 should be swapped.

---[src/interfaces/ecpg/test/pgtypeslib/dt_test.pgc]---

  // block1 (my comment)
      ts1 = PGTYPEStimestamp_from_asc("96-02-29", NULL);
      text = PGTYPEStimestamp_to_asc(ts1);
      printf("timestamp_to_asc1: %s\n", text);
      PGTYPESchar_free(text);

  // block2
      ts1 = PGTYPEStimestamp_from_asc("1994-02-11 26:10:35", NULL);
      text = PGTYPEStimestamp_to_asc(ts1);
      printf("timestamp_to_asc3: %s\n", text);
      PGTYPESchar_free(text);

  // The following comment is for block1 clearly.
/*     abc-03:10:35-def-02/11/94-gh  */
/*      12345678901234567890123456789 */

  // block3
  // Maybe the following is for 'ts1' returned in block1.
  // In our environment, 'out' is indefinite because PGTYPEStimestamp_fmt_asc()
  // didn't complete and the area is not initialized.
      out = (char*) malloc(32);
      i = PGTYPEStimestamp_fmt_asc(&ts1, out, 31, "abc-%X-def-%x-ghi%%");
      printf("timestamp_fmt_asc: %d: %s\n", i, out);
      free(out);

------------------------------------

Best Regards
Ryo Matsumura


Best Regards
Ryo Matsumura


Re: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

From
Tom Lane
Date:
"Ryo Matsumura (Fujitsu)" <matsumura.ryo@fujitsu.com> writes:
> It is indefinite what PGTYPEStimestamp_from_asc() returns in error.
> The following is written in document(36.6.8. Special Constants of pgtypeslib):
>   A value of type timestamp representing an invalid time stamp.
>   This is returned by the function PGTYPEStimestamp_from_asc on parse error.

> However, PGTYPESInvalidTimestamp is not defined anywhere.

Ugh.

> At current implementation, PGTYPEStimestamp_from_asc returns -1.

It looks to me like it returns 0 ("noresult").  Where are you seeing
-1?

That documentation has more problems too: it claims that "endptr"
is unimplemented, which looks like a lie to me: the code is there
to do it, and there are several callers that depend on it.

> Regression test of pgtypelib is not robust (maybe incorrect).
> Our test failed although there is no bug actually.

> I think block2 and block3 should be swapped.

Hm, block3 seems to be completely nuts.  It looks like the code is
rejecting the input (probably because "26" is out of range) and
returning zero, because what we see in the expected-stdout file is:

timestamp_to_asc3: 2000-01-01 00:00:00

We also see that the expected output from the PGTYPEStimestamp_fmt_asc
step is:

timestamp_fmt_asc: 0: abc-00:00:00-def-01/01/00-ghi%

which is consistent with that, but not very much like what the
comment is expecting.  I'm a bit inclined to just drop "block 3".
If we want to keep some kind of test of the error behavior,
it doesn't belong right there, and it should be showing what errno
gets set to.

As for the "lack of robustness", I'll bet the problem you are
seeing is that the test uses the %X/%x format specifiers which
are locale-dependent.  But how come we haven't noticed that
before?  Have you added a setlocale() call somewhere?

            regards, tom lane



Re: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

From
"Ryo Matsumura (Fujitsu)"
Date:
Hi Tom,
Thank you for comment.

>> At current implementation, PGTYPEStimestamp_from_asc returns -1.
> It looks to me like it returns 0 ("noresult").  Where are you seeing -1?

I took a mistake. Sorry.
PGTYPEStimestamp_from_asc returns 0(noresult).
PGTYPEStimestamp_fmt_asc given 'noresult' returns -1.


> But how come we haven't noticed that
> before?  Have you added a setlocale() call somewhere?

I didn't notice to this point.
I added setlocale() to ECPG in my local branch.
I will test again after removing it.
It looks to me like existing ECPG code does not include setlocale().

So Please ignore about behavior of PGTYPEStimestamp_fmt_asc().
I want to continue to discuss about PGTYPEStimestamp_from_asc.


Current PGTYPEStimestamp_from_asc() returns 0, but should we return -1?
The document claims about return that "It is set to 1899-12-31 23:59:59.".

I wonder.
It is the incompatibility, but it may be allowed.
because I think usual users may check with errno.
Of course, the reason is weak.
Some users may check with 0(noresult) from their experience.


> That documentation has more problems too: it claims that "endptr"
> is unimplemented, which looks like a lie to me: the code is there
> to do it, and there are several callers that depend on it.

I think so too. The followings have the same problem.
PGTYPESdate_from_asc    (ParseDate)
PGTYPESinterval_from_asc    (ParseDate)
PGTYPEStimestamp_from_asc    (ParseDate)
PGTYPESnumeric_from_asc


> which is consistent with that, but not very much like what the
> comment is expecting.  I'm a bit inclined to just drop "block 3".
> If we want to keep some kind of test of the error behavior,
> it doesn't belong right there, and it should be showing what errno
> gets set to.

It is easy to exchange block3 to errno-checking.
However if we just fix there, it is weird because there is
no other errno-checking in dt_tests.

> I'm a bit inclined to just drop "block 3".
Are you concerned at the above point?

Best Regards
Ryo Matsumura



Re: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

From
Tom Lane
Date:
"Ryo Matsumura (Fujitsu)" <matsumura.ryo@fujitsu.com> writes:
>> But how come we haven't noticed that
>> before?  Have you added a setlocale() call somewhere?

> I didn't notice to this point.
> I added setlocale() to ECPG in my local branch.
> I will test again after removing it.
> It looks to me like existing ECPG code does not include setlocale().

> So Please ignore about behavior of PGTYPEStimestamp_fmt_asc().

If that's the only ecpg test case that fails under a non-C locale,
I think it'd be good to change it to use a non-locale-dependent
format string.  Surely there's no compelling reason why it has to
use %x/%X rather than something more stable.

> I want to continue to discuss about PGTYPEStimestamp_from_asc.
> Current PGTYPEStimestamp_from_asc() returns 0, but should we return -1?

No, I think changing its behavior now after so many years would be a
bad idea.  In any case, what's better about -1?  They are both legal
timestamp values.  I think we'd better fix the docs to match what the
code actually does, and to tell people that they *must* check errno
to detect errors.

>> I'm a bit inclined to just drop "block 3".

> Are you concerned at the above point?

Given your point that no other dt_test case is checking error
behavior, I'm not too concerned about dropping this one.  If
we wanted to take the issue seriously, we'd need to add a bunch
of new tests not just tweak this one.  (It's far from obvious that
this was even meant to be a test of an error case --- it looks like
just sloppily-verified testing to me.  "block 3" must have been
dropped in after the tests before and after it were written,
without noticing that it changed their results.)

            regards, tom lane



Re: Bug: PGTYPEStimestamp_from_asc() in ECPG pgtypelib

From
"Ryo Matsumura (Fujitsu)"
Date:
# I'm sorry for my late response.

I confirmed that the error of regression is caused by my code inserting setlocale() into ecpglib of local branch.
No other tests occur error in non-C locale.

The following is about other topics.


1. About regression test

We should test the followings:
- PGTYPEStimestamp_from_asc("1994-02-11 26:10:35", NULL) returns 0.
- PGTYPEStimestamp_fmt_asc() can accept format string including %x and %X.

ecpglib should be affected by only setlocale() called by user application and
dt_test.pgc does not call it. So the following test is the best, I think.
Please see attached patch for detail (fix_pgtypeslib_regress.patch).

    ts1 = PGTYPEStimestamp_from_asc("1994-02-11 3:10:35", NULL);
    text = PGTYPEStimestamp_to_asc(ts1);
    printf("timestamp_to_asc2: %s\n", text);
    PGTYPESchar_free(text);

/*  abc-03:10:35-def-02/11/94-gh  */
/*      12345678901234567890123456789 */

    out = (char*) malloc(32);
    i = PGTYPEStimestamp_fmt_asc(&ts1, out, 31, "abc-%X-def-%x-ghi%%");
    printf("timestamp_fmt_asc: %d: %s\n", i, out);
    free(out);

    ts1 = PGTYPEStimestamp_from_asc("1994-02-11 26:10:35", NULL);
    text = PGTYPEStimestamp_to_asc(ts1);
    printf("timestamp_to_asc3: %s\n", text);
    PGTYPESchar_free(text);

We should also add tests that check PGTYPEStimestamp_*() set errno for invalid input correctly,
but I want to leave the improvement to the next timing when implement for timestamp is changed.
(Maybe the timing will not come.)


2. About document of PGTYPEStimestamp_from_asc() and PGTYPESInvalidTimestamp

0 returned by PGTYPEStimestamp_from_asc () is a valid timestamp as you commented and
we should not break compatibility.
So we should remove the document for PGTYPESInvalidTimestamp and add one for checking errno
to description of PGTYPEStimestamp_from_asc().
Please see attached patch for detail (fix_PGTYPESInvalidTimestamp_doc.patch).


3. About endptr of *_from_asc()
> PGTYPESdate_from_asc    (ParseDate)
> PGTYPEStimestamp_from_asc    (ParseDate)
> PGTYPESinterval_from_asc    (ParseDate)
> PGTYPESnumeric_from_asc

Basically, they return immediately just after detecting invalid format.
However, after passing the narrow parse, they could fails (e.g. failure of DecodeInterval(), DecodeISO8601Interval(), malloc(), and so on).

So we should write as follows:
   If the function detects invalid format,
   then it stores the address of the first invalid character in
   endptr. However, don't assume it successed if
   endptr points to end of input because other
   processing(e.g. memory allocation) could fails.
   Therefore, you should check return value and errno for detecting error.
   You can safely endptr to NULL.

I also found pandora box that description of the followings don't show their behavior when it fails.
I fix doc including them. Please see attached patch(fix_pgtypeslib_funcs_docs.patch).
- PGTYPESdate_from_asc()        # sets errno. (can not check return value)
- PGTYPESdate_defmt_asc()       # returns -1 and sets errno
- PGTYPEStimestamp_to_asc()     # returns NULL and sets errno
- PGTYPEStimestamp_defmt_asc()  # just returns 1 and doesn't set errno!
- PGTYPESinterval_new()         # returns NULL and sets errno
- PGTYPESinterval_from_asc()    # returns NULL and sets errno
- PGTYPESinterval_to_asc()      # returns NULL and sets errno
- PGTYPESinterval_copy          # currently always return 0
- PGTYPESdecimal_new()          # returns NULL and sets errno


4. Bug of PGTYPEStimestamp_defmt_asc()
PGTYPEStimestamp_defmt_asc() doesn't set errno on failure.
I didn't make a patch for it yet.

Best Regards
Ryo Matsumura
Attachment