Re: Compatible defaults for LEAD/LAG - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: Compatible defaults for LEAD/LAG |
Date | |
Msg-id | 3902163.1598824779@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Compatible defaults for LEAD/LAG (Pavel Stehule <pavel.stehule@gmail.com>) |
Responses |
Re: Compatible defaults for LEAD/LAG
|
List | pgsql-hackers |
Pavel Stehule <pavel.stehule@gmail.com> writes: > This is nice example of usage of anycompatible type (that is consistent > with other things in Postgres), but standard says something else. > It can be easily solved with https://commitfest.postgresql.org/28/2081/ - > but Tom doesn't like this patch. > I am more inclined to think so this feature should be implemented > differently - there is no strong reason to go against standard or against > the implementations of other databases (and increase the costs of porting). > Now the implementation is limited, but allowed behaviour is 100% ANSI. I don't particularly buy this argument. The case at hand is what to do if we have, say, select lag(integer_column, 1, 1.2) over ... The proposed patch would result in the output being of type numeric, and any rows using the default would show "1.2". The spec says that the right thing is to return integer, and we should round the default to "1" to make that work. But (1) I doubt that anybody actually writes such things; (2) For anyone who does write it, the spec's behavior fails to meet the principle of least surprise. It is not normally the case that any information-losing cast would be applied silently within an expression. So this deviation from spec doesn't bother me; we have much bigger ones. My concern with this patch is what I said upthread: I'm not sure that we should be adjusting this behavior in a piecemeal fashion. I looked through pg_proc to find all the functions that have more than one any* argument, and found these: oid | prorettype -----------------------------------------------+------------ lag(anyelement,integer,anyelement) | anyelement lead(anyelement,integer,anyelement) | anyelement width_bucket(anyelement,anyarray) | integer btarraycmp(anyarray,anyarray) | integer array_eq(anyarray,anyarray) | boolean array_ne(anyarray,anyarray) | boolean array_lt(anyarray,anyarray) | boolean array_gt(anyarray,anyarray) | boolean array_le(anyarray,anyarray) | boolean array_ge(anyarray,anyarray) | boolean array_append(anyarray,anyelement) | anyarray array_prepend(anyelement,anyarray) | anyarray array_cat(anyarray,anyarray) | anyarray array_larger(anyarray,anyarray) | anyarray array_smaller(anyarray,anyarray) | anyarray array_position(anyarray,anyelement) | integer array_position(anyarray,anyelement,integer) | integer array_positions(anyarray,anyelement) | integer[] array_remove(anyarray,anyelement) | anyarray array_replace(anyarray,anyelement,anyelement) | anyarray arrayoverlap(anyarray,anyarray) | boolean arraycontains(anyarray,anyarray) | boolean arraycontained(anyarray,anyarray) | boolean elem_contained_by_range(anyelement,anyrange) | boolean range_contains_elem(anyrange,anyelement) | boolean range_eq(anyrange,anyrange) | boolean range_ne(anyrange,anyrange) | boolean range_overlaps(anyrange,anyrange) | boolean range_contains(anyrange,anyrange) | boolean range_contained_by(anyrange,anyrange) | boolean range_adjacent(anyrange,anyrange) | boolean range_before(anyrange,anyrange) | boolean range_after(anyrange,anyrange) | boolean range_overleft(anyrange,anyrange) | boolean range_overright(anyrange,anyrange) | boolean range_union(anyrange,anyrange) | anyrange range_merge(anyrange,anyrange) | anyrange range_intersect(anyrange,anyrange) | anyrange range_minus(anyrange,anyrange) | anyrange range_cmp(anyrange,anyrange) | integer range_lt(anyrange,anyrange) | boolean range_le(anyrange,anyrange) | boolean range_ge(anyrange,anyrange) | boolean range_gt(anyrange,anyrange) | boolean range_gist_same(anyrange,anyrange,internal) | internal (45 rows) Now, there's no point in changing range_eq and the later entries in this table (the ones taking two anyrange's); our rather lame definition of anycompatiblerange means that we'd get no benefit from doing so. But I think there's a strong case for changing everything before range_eq. It'd be nice if something like SELECT array[1] < array[1.1]; would work the same way that "SELECT 1 < 1.1" would. I checked the other concern that I had about whether the more flexible polymorphic definitions would create any new ambiguities, and I don't see any problems with this list. As functions, none of these names are overloaded, except with different numbers of arguments so there's no ambiguity. Most of the array functions are also operators, but the competing operators do not take arrays, so it doesn't look like there's any issue on that side either. So I think we should be more ambitious and generalize all of these to use anycompatiblearray etc. regards, tom lane
pgsql-hackers by date: