Re: Compatible defaults for LEAD/LAG - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: Compatible defaults for LEAD/LAG |
Date | |
Msg-id | CAFj8pRAFhk_Ogoj7nmotAH-MTEJap35cGRZT9AShXBKUVumLig@mail.gmail.com Whole thread Raw |
In response to | Re: Compatible defaults for LEAD/LAG (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Compatible defaults for LEAD/LAG
|
List | pgsql-hackers |
ne 30. 8. 2020 v 23:59 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> This is nice example of usage of anycompatible type (that is consistent
> with other things in Postgres), but standard says something else.
> It can be easily solved with https://commitfest.postgresql.org/28/2081/ -
> but Tom doesn't like this patch.
> I am more inclined to think so this feature should be implemented
> differently - there is no strong reason to go against standard or against
> the implementations of other databases (and increase the costs of porting).
> Now the implementation is limited, but allowed behaviour is 100% ANSI.
I don't particularly buy this argument. The case at hand is what to do
if we have, say,
select lag(integer_column, 1, 1.2) over ...
The proposed patch would result in the output being of type numeric,
and any rows using the default would show "1.2". The spec says that
the right thing is to return integer, and we should round the default
to "1" to make that work. But
(1) I doubt that anybody actually writes such things;
(2) For anyone who does write it, the spec's behavior fails to meet
the principle of least surprise. It is not normally the case that
any information-losing cast would be applied silently within an
expression.
postgres=# create table foo(a int);
CREATE TABLE
postgres=# insert into foo values(1.1);
INSERT 0 1
CREATE TABLE
postgres=# insert into foo values(1.1);
INSERT 0 1
postgres=# create table foo(a int default 1.1);
CREATE TABLE
postgres=# insert into foo values(default);
INSERT 0 1
postgres=# select * from foo;
┌───┐
│ a │
╞═══╡
│ 1 │
└───┘
(1 row)
CREATE TABLE
postgres=# insert into foo values(default);
INSERT 0 1
postgres=# select * from foo;
┌───┐
│ a │
╞═══╡
│ 1 │
└───┘
(1 row)
It is maybe strange, but it is not an unusual pattern in SQL. I think it can be analogy with default clause
DECLARE a int DEFAULT 1.2;
The default value doesn't change a type of variable. This is maybe too stupid example. There can be other little bit more realistic
CREATE OR REPLACE FUNCTION foo(a numeric, b numeric, ...
DECLARE x int DEFAULT a;
BEGIN
...
I am afraid about performance - if default value can change type, then some other things can stop work - like using index.
For *this* case we don't speak about some operations between two independent operands or function arguments. We are speaking about
the value and about a *default* for the value.
So this deviation from spec doesn't bother me; we have much bigger ones.
ok, if it is acceptable for other people, I can accept it too - I understand well so it is a corner case and there is some consistency with other Postgres features.
Maybe this difference should be mentioned in documentation.
My concern with this patch is what I said upthread: I'm not sure that
we should be adjusting this behavior in a piecemeal fashion. I looked
through pg_proc to find all the functions that have more than one any*
argument, and found these:
oid | prorettype
-----------------------------------------------+------------
lag(anyelement,integer,anyelement) | anyelement
lead(anyelement,integer,anyelement) | anyelement
width_bucket(anyelement,anyarray) | integer
btarraycmp(anyarray,anyarray) | integer
array_eq(anyarray,anyarray) | boolean
array_ne(anyarray,anyarray) | boolean
array_lt(anyarray,anyarray) | boolean
array_gt(anyarray,anyarray) | boolean
array_le(anyarray,anyarray) | boolean
array_ge(anyarray,anyarray) | boolean
array_append(anyarray,anyelement) | anyarray
array_prepend(anyelement,anyarray) | anyarray
array_cat(anyarray,anyarray) | anyarray
array_larger(anyarray,anyarray) | anyarray
array_smaller(anyarray,anyarray) | anyarray
array_position(anyarray,anyelement) | integer
array_position(anyarray,anyelement,integer) | integer
array_positions(anyarray,anyelement) | integer[]
array_remove(anyarray,anyelement) | anyarray
array_replace(anyarray,anyelement,anyelement) | anyarray
arrayoverlap(anyarray,anyarray) | boolean
arraycontains(anyarray,anyarray) | boolean
arraycontained(anyarray,anyarray) | boolean
elem_contained_by_range(anyelement,anyrange) | boolean
range_contains_elem(anyrange,anyelement) | boolean
range_eq(anyrange,anyrange) | boolean
range_ne(anyrange,anyrange) | boolean
range_overlaps(anyrange,anyrange) | boolean
range_contains(anyrange,anyrange) | boolean
range_contained_by(anyrange,anyrange) | boolean
range_adjacent(anyrange,anyrange) | boolean
range_before(anyrange,anyrange) | boolean
range_after(anyrange,anyrange) | boolean
range_overleft(anyrange,anyrange) | boolean
range_overright(anyrange,anyrange) | boolean
range_union(anyrange,anyrange) | anyrange
range_merge(anyrange,anyrange) | anyrange
range_intersect(anyrange,anyrange) | anyrange
range_minus(anyrange,anyrange) | anyrange
range_cmp(anyrange,anyrange) | integer
range_lt(anyrange,anyrange) | boolean
range_le(anyrange,anyrange) | boolean
range_ge(anyrange,anyrange) | boolean
range_gt(anyrange,anyrange) | boolean
range_gist_same(anyrange,anyrange,internal) | internal
(45 rows)
Now, there's no point in changing range_eq and the later entries in this
table (the ones taking two anyrange's); our rather lame definition of
anycompatiblerange means that we'd get no benefit from doing so. But
I think there's a strong case for changing everything before range_eq.
It'd be nice if something like
SELECT array[1] < array[1.1];
would work the same way that "SELECT 1 < 1.1" would.
There it has sense without any discussion. But it is a little bit different than using the default value in the LAG function.
I checked the other concern that I had about whether the more flexible
polymorphic definitions would create any new ambiguities, and I don't
see any problems with this list. As functions, none of these names are
overloaded, except with different numbers of arguments so there's no
ambiguity. Most of the array functions are also operators, but the
competing operators do not take arrays, so it doesn't look like there's
any issue on that side either.
So I think we should be more ambitious and generalize all of these
to use anycompatiblearray etc.
+1
Pavel
regards, tom lane
pgsql-hackers by date: