PostgreSQL bug: zero-min reluctant quantifier
During cross-validation a PostgreSQL bug was discovered involving
reluctant quantifiers whose minimum repetition is 0.
Example pattern:
PATTERN (A*?)
DEFINE A AS val > 0
Result comparison:
pattern PostgreSQL (cnt) Oracle (cnt)
A*? 1,1,1 0,0,0
A?? 1,1,1 0,0,0
A+? 1,1,1 1,1,1
For reluctant quantifiers with min=0 (such as *? and ??), PostgreSQL
always consumes at least one row, while Oracle allows a zero-length
match. When min>=1 (e.g., A+?), both systems behave the same.
This is indeed a bug. Thanks SugJun for finding it. I'll fix this in
the next patch.
Design difference: unused DEFINE variables
Example:
PATTERN (A+)
DEFINE A AS id > 0, B AS id > 5
PostgreSQL executes the query successfully and ignores the unused
variable B.
Oracle raises:
ORA-62503: illegal variable definition
Currently PostgreSQL silently removes unused DEFINE variables during
optimization. Do you think we should raise an error instead, as Oracle
does?
Oracle limitations observed
Bounded quantifier limit
A{200} -> works
A{201} -> ORA-62518
Oracle appears to limit the upper bound of bounded quantifiers to 200,
while PostgreSQL does not impose this restriction.
I don't think we need to impose an artificial limit like Oracle's 200.
What do you think?
Nested nullable quantifiers
Examples:
(A*)*
(A*)+
(((A)*)*)*
(A?|B){1,2}
((A?){2,3}){2,3}
(A?){n,m}
(A? B?){2,3}
Oracle raises:
ORA-62513
when a nullable subpattern is wrapped by an outer quantifier, while
PostgreSQL executes these patterns successfully.
This seems like an Oracle limitation rather than a standard requirement.
Best regards,
Henson