Re: [HACKERS] [PATCH] Improve geometric types - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: [HACKERS] [PATCH] Improve geometric types
Date
Msg-id 20180119.172646.207924292.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Improve geometric types  (Emre Hasegeli <emre@hasegeli.com>)
Responses Re: [HACKERS] [PATCH] Improve geometric types
List pgsql-hackers
Hello,

At Thu, 18 Jan 2018 16:01:01 +0100, Emre Hasegeli <emre@hasegeli.com> wrote in
<CAE2gYzwgcan3w3=-3oHa4Efif=0yvoErn-e_V5KJLUi9pd8ivQ@mail.gmail.com>
> > 0001:
> >
> > - You removed the check of parallelity check of
> >   line_interpt_line(old line_interpt_internal) but line_parallel
> >   is not changed so the consistency between the two functions are
> >   lost. It is better to be another patch file (maybe 0004?).
> 
> I am making line_parallel() use line_interpt_line().  What they do is
> exactly the same.

Thanks.

> > - +     Assert(p1->npts > 0 && p2->npts > 0);
> >
> >   Other path_ functions are allowing path with no points so just
> >   returning false if (p1->npts == 0 || p2->npts == 0) seems
> >   enough. Assertion should be used only for something server
> >   cannot continue working. (division by zero doesn't crash
> >   server) I saw other similar assertions in the patch.
> 
> path_in() and path_recv() reject paths with no points.  I thought we
> shouldn't spend cycles for things that cannot happen.  I can revert
> this part if you find previous practice better.

I'm fine with the current shape. Thanks. Maybe the same
discussion applies to polygons. (cf. poly_overlap)

> > 0003:
> >
> >  close_pl/ps/lseg/pb/ls/sb have changed to return null when
> >  lseg_closept_line returns NAN, but they are defined as strict
> >  and that is reasonable. Anyway pg_hypot returns NaN only when
> >  parameters contain INF or NAN so we should error out when it
> >  returns nan.
> 
> I though strict is only related to the input being NULL.  Tom Lane

Oops. Yes. you're rigtht.

> made close_ps() return NULL with commit
> 278148907a971ec7fa4ffb24248103d8012155d2.  The other functions

Thank you for the pointer. By the way,

https://www.postgresql.org/message-id/1919.1468269494%40sss.pgh.pa.us

| > Is it reasonable to disallow NaN coordinates on the next major
| > release.  Are there any reason to deal with them?
| 
| I don't see how we can do that; what would you do about tables already
| containing NaNs?  Even without that consideration, assuming that there's
| no way a NaN could creep in seems a pretty fragile assumption, considering
| that operations like Infinity/Infinity will produce one.

Ok, it is convicing. The policy is don't do anything that is not
harmful to server. Currently close_* are *strict* so what we
should do is avoid returning '(anything *)NULL' as a result.

> currently fail with elog()s from DirectFunctionCalls.  I don't have
> strong preference for NULL or an error.  Could you please suggest an
> errcode and errmsg, if you have?

=# select close_sb('((nan,0),(1,1))'::lseg, '((-1,-1),(2,2))'::box);
ERROR:  function 0x8fe61c returned NULL

Recosdering on it and I came to think that such usage of SQL null
is the same to "invalid" objects I mentioned upthread. But we
don't actively check component NaNs but if we happen to see an
invalid result, return SQL null instead.

In this criteria close_* functions looks good that return SQL
null.

0003:

 line_closept_point asserts line_interpt_line returns true but it
 is hazardous. It is safer if line_closept_point returns NaN if
 line_interpt_line returns false.

 All other functions looks good in the criteria.

> >  I understand that we don't need to consider NAN is equal to NAN.
> >  circle_same, point_eq_point, line_eq no longer needs such
> >  change?
> 
> I though it would be better to consider NaNs equal only on the
> equality operators.  That is why only they are changed that way.

I'm convinced by that.

> > 0004:
> >
> >  Looking line_recv's change, I became to think that FPxx macros
> >  is provided for coordinate values. I don't think it is applied
> >  to non-coordinate values like coefficients. If some kind of
> >  tolerance is needed here, it is not the same with the EPSILON.
> >
> >  +    if (FPzero(line->A) && FPzero(line->B))
> >  +        ereport(ERROR,
> >
> >  So, the above should be exact comparison with 0.0 and line_in
> >  also should be the same. And maybe the same thing should be done
> >  at many places..
> 
> I agree you.  EPSILON is currently applied prematurely.  I am trying
> to stay away from EPSION related issues to improve the chances to get
> this part committed.

Agreed.

> >  But the following line of line_parallel still requires some kind
> >  of tolerance to work properly. Since this patch is an
> >  imoprovement patch, I think we can change it to the vector method.
> 
> I am making line_parallel() use line_interpt_line() in response to
> your first remark.  Vector based algorithm is probably better for
> every use of line_interpt_line(), but it is a bigger change with more
> user visible effects.

I'm fine with that.

> >  The problem of line_distance is an existing one so we can leave
> >  it alone. It returns 0.0 for the most cases but it is a
> >  long-standing behavior.. (Anyway I don't find a reasonable
> >  definition of the distance between very-nearly parallel lines..)
> 
> Exact comparison with 0.0 instead of FPzero() would also be an
> improvement for line_distance(), but I am not doing it now.


reagrds,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Rangejoin rebased
Next
From: Yuto Hayamizu
Date:
Subject: Re: [HACKERS] [PATCH] Overestimated filter cost and its mitigation