Thread: SET within a function?
Hi all! I am doing some trigger functions that need to find a tuple in another table. The problem is that this second table is doing some summarization work, and I need nulls to equal each other. Basically, in the trigger I do a: SELECT INTO ... x FROM table1 WHERE ...(some straightforward x = old.x)... AND (x1 = old.x1 OR (x1 is null and old.x1 is null)) AND (x2 = old.x2 OR (x2 is null and old.x2 is null)) AND (x3 = old.x3 OR (x3 is null and old.x3 is null)); The problem is that an index is used to perform the straightforward stuff, and then the x1,x2,x3 is done via an index scan, rather than directly. Unfortunately for the data set I have, it can be clustered pretty badly around the straightforward stuff, and so the scan can take multiple seconds per call. I think if I could do a 'SET TRANSFORM_NULL_EQUALS TO ON' then this might fix the issue (don't know, haven't tried it yet). My question is: can this be done within a function such that at the end of the function, the value is reset back to value upon entering (kind of like 'SET LOCAL' except for just the length of the function call). Is this possible? Thanks! Ed
On Mon, Oct 13, 2003 at 21:16:33 -0400, Edmund Dengler <edmundd@eSentire.com> wrote: > > I think if I could do a 'SET TRANSFORM_NULL_EQUALS TO ON' then this might > fix the issue (don't know, haven't tried it yet). My question is: can this > be done within a function such that at the end of the function, the value > is reset back to value upon entering (kind of like 'SET LOCAL' except for > just the length of the function call). Is this possible? I don't think that will do what you want. That setting is used to rewrite = null as is null, not to change things so that nulls match each other.
Is the rewrite only for the literal 'X = NULL' or will it do a test against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? Is there any way to match NULLS to each other (as I am looking for a literal row, not using NULL as the UNKNOWN). I suppose I could put in a dummy value for the 'Not a valid value', but it seems to be quite awkward when I really do want the NULL. Regards! Ed On Mon, 13 Oct 2003, Bruno Wolff III wrote: > On Mon, Oct 13, 2003 at 21:16:33 -0400, > Edmund Dengler <edmundd@eSentire.com> wrote: > > > > I think if I could do a 'SET TRANSFORM_NULL_EQUALS TO ON' then this might > > fix the issue (don't know, haven't tried it yet). My question is: can this > > be done within a function such that at the end of the function, the value > > is reset back to value upon entering (kind of like 'SET LOCAL' except for > > just the length of the function call). Is this possible? > > I don't think that will do what you want. That setting is used to > rewrite = null as is null, not to change things so that nulls match each > other. > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
Edmund Dengler wrote: > Is the rewrite only for the literal 'X = NULL' or will it do a test > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? It is a parse time transformation: http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=26ef31219ae11442&seekm=3DF52206.5060507%40mascari.com#link6 > Is there any way to match NULLS to each other (as I am looking for a > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > dummy value for the 'Not a valid value', but it seems to be quite awkward > when I really do want the NULL. Normalization would have you eliminate the NULL by having another relation whose candidate key is the same as your original table, but those records whose attribute is NULL would simply not be present in the child table. Another possible solution is to define your own type with an internal status for 'Not a valid value'... HTH, Mike Mascari mascarm@mascari.com
> Is the rewrite only for the literal 'X = NULL' or will it do a test > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? > > Is there any way to match NULLS to each other (as I am looking for a > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > dummy value for the 'Not a valid value', but it seems to be quite awkward > when I really do want the NULL. I ended up writing an "equivalent" function for the project I'm working on. It goes like this in plpgsql: IF $1 IS NULL THEN RETURN $2 IS NULL; ELSIF $2 IS NULL THEN -- We already know $1 is not null. RETURN FALSE; ELSE -- Both args are not null. RETURN $1 = $2; END IF; That's the basic idea. I put a wrapper around this to generate a copy of it for all the data types used in my database.
The problem I would face is that this still needs to be a sequential scan in the table rather than an index lookup. Regards, Ed On Tue, 14 Oct 2003, Arthur Ward wrote: > > Is the rewrite only for the literal 'X = NULL' or will it do a test > > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? > > > > Is there any way to match NULLS to each other (as I am looking for a > > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > > dummy value for the 'Not a valid value', but it seems to be quite awkward > > when I really do want the NULL. > > I ended up writing an "equivalent" function for the project I'm working > on. It goes like this in plpgsql: > > IF $1 IS NULL THEN > RETURN $2 IS NULL; > ELSIF $2 IS NULL THEN > -- We already know $1 is not null. > RETURN FALSE; > ELSE > -- Both args are not null. > RETURN $1 = $2; > END IF; > > That's the basic idea. I put a wrapper around this to generate a copy of > it for all the data types used in my database. >
Edmund Dengler wrote: > The problem I would face is that this still needs to be a sequential scan > in the table rather than an index lookup. IIRC, NULL values aren't indexed, only actual values, which is an implementation detail but yet-another reason why NULL-elimination through normalization is a good idea: http://www.hughdarwen.freeola.com/TheThirdManifesto.web/Missing-info-without-nulls.pdf Mike Mascari mascarm@mascari.com
An issue is that I am trying to avoid having another table (to stop the need for a join (performance reasons)). The NULLs are relatively rare, but since they can appear, and in certain pathological cases the sequential scan can take seconds to run, I was hoping for a work-around. But it looks like I have no real choice in this as there is no way to specify that NULL == NULL. Another question: if I have a multi-column index, and one of the values of a tuple is NULL, is that row not indexed? If it is, how does this jibe with the "NULLs are not indexed" statements? Thanks! Ed On Tue, 14 Oct 2003, Mike Mascari wrote: > Edmund Dengler wrote: > > > Is the rewrite only for the literal 'X = NULL' or will it do a test > > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? > > It is a parse time transformation: > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=26ef31219ae11442&seekm=3DF52206.5060507%40mascari.com#link6 > > > Is there any way to match NULLS to each other (as I am looking for a > > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > > dummy value for the 'Not a valid value', but it seems to be quite awkward > > when I really do want the NULL. > > Normalization would have you eliminate the NULL by having another > relation whose candidate key is the same as your original table, but > those records whose attribute is NULL would simply not be present in > the child table. > > Another possible solution is to define your own type with an internal > status for 'Not a valid value'... > > HTH, > > Mike Mascari > mascarm@mascari.com > > >
You can try COALESCE function where by if you have a null you can make it have a default value and that default value will be a value the field never reaches i.e. COALESCE(NULL,'') = COALESCE(NULL,'') HTH Darren On Tue, 14 Oct 2003, Edmund Dengler wrote: > An issue is that I am trying to avoid having another table (to stop the > need for a join (performance reasons)). The NULLs are relatively rare, but > since they can appear, and in certain pathological cases the sequential > scan can take seconds to run, I was hoping for a work-around. But it looks > like I have no real choice in this as there is no way to specify that > NULL == NULL. > > Another question: if I have a multi-column index, and one of the values of > a tuple is NULL, is that row not indexed? If it is, how does this jibe > with the "NULLs are not indexed" statements? > > Thanks! > Ed > > On Tue, 14 Oct 2003, Mike Mascari wrote: > > > Edmund Dengler wrote: > > > > > Is the rewrite only for the literal 'X = NULL' or will it do a test > > > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? > > > > It is a parse time transformation: > > > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=26ef31219ae11442&seekm=3DF52206.5060507%40mascari.com#link6 > > > > > Is there any way to match NULLS to each other (as I am looking for a > > > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > > > dummy value for the 'Not a valid value', but it seems to be quite awkward > > > when I really do want the NULL. > > > > Normalization would have you eliminate the NULL by having another > > relation whose candidate key is the same as your original table, but > > those records whose attribute is NULL would simply not be present in > > the child table. > > > > Another possible solution is to define your own type with an internal > > status for 'Not a valid value'... > > > > HTH, > > > > Mike Mascari > > mascarm@mascari.com > > > > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org > -- Darren Ferguson
This just returns us to the problem that the use of a function causes a sequential scan (as the select has to essentially apply the function to each row). I would need to store a dummy value into the field (it is an int, so I could store -1, but it breaks my sense of aesthetics to do this simply to get around the sequential scan). Could I use a functional index, maybe? Regards, Ed On Tue, 14 Oct 2003 darren@crystalballinc.com wrote: > You can try COALESCE function where by if you have a null you can make it > have a default value and that default value will be a value the field > never reaches > > i.e. COALESCE(NULL,'') = COALESCE(NULL,'') > > HTH > Darren > > On Tue, 14 Oct 2003, Edmund Dengler wrote: > > > An issue is that I am trying to avoid having another table (to stop the > > need for a join (performance reasons)). The NULLs are relatively rare, but > > since they can appear, and in certain pathological cases the sequential > > scan can take seconds to run, I was hoping for a work-around. But it looks > > like I have no real choice in this as there is no way to specify that > > NULL == NULL. > > > > Another question: if I have a multi-column index, and one of the values of > > a tuple is NULL, is that row not indexed? If it is, how does this jibe > > with the "NULLs are not indexed" statements? > > > > Thanks! > > Ed > > > > On Tue, 14 Oct 2003, Mike Mascari wrote: > > > > > Edmund Dengler wrote: > > > > > > > Is the rewrite only for the literal 'X = NULL' or will it do a test > > > > against a value such as 'X = OLD.X' (and rewrite is OLD.X is NULL)? > > > > > > It is a parse time transformation: > > > > > > http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=26ef31219ae11442&seekm=3DF52206.5060507%40mascari.com#link6 > > > > > > > Is there any way to match NULLS to each other (as I am looking for a > > > > literal row, not using NULL as the UNKNOWN). I suppose I could put in a > > > > dummy value for the 'Not a valid value', but it seems to be quite awkward > > > > when I really do want the NULL. > > > > > > Normalization would have you eliminate the NULL by having another > > > relation whose candidate key is the same as your original table, but > > > those records whose attribute is NULL would simply not be present in > > > the child table. > > > > > > Another possible solution is to define your own type with an internal > > > status for 'Not a valid value'... > > > > > > HTH, > > > > > > Mike Mascari > > > mascarm@mascari.com > > > > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 6: Have you searched our list archives? > > > > http://archives.postgresql.org > > > > -- > Darren Ferguson > >
Edmund Dengler wrote: > This just returns us to the problem that the use of a function causes a > sequential scan (as the select has to essentially apply the function to > each row). I would need to store a dummy value into the field (it is an > int, so I could store -1, but it breaks my sense of aesthetics to do this > simply to get around the sequential scan). > > Could I use a functional index, maybe? Yes, but I think you have to write a little wrapper: CREATE TABLE foo ( key integer not null, value text); CREATE FUNCTION toValue(text) RETURNS text AS ' SELECT COALESCE($1, ''''); ' LANGUAGE 'SQL' IMMUTABLE; CREATE INDEX i_foo1 ON foo(toValue(value)); And always be sure to use the function in the query: SELECT * FROM foo WHERE toValue(value) = ''; For fun: SET enable_seqscan to off; EXPLAIN SELECT * FROM foo WHERE toValue(value) = 'Mike'; should produce an Index Scan.... HTH, Mike Mascari mascarm@mascari.com
Edmund Dengler <edmundd@eSentire.com> writes: > ... I have no real choice in this as there is no way to specify that > NULL == NULL. The conventional wisdom on this is that if you think you need NULL == NULL to yield true, then you are misusing NULL, and you'd better reconsider your data representation. The standard semantics for NULL really do not support any other interpretation of NULL than "I don't know what this value is". If you are trying to use NULL to mean something else, you will face nothing but misery. Choose another representation for whatever you do mean. regards, tom lane
I guess it comes back to the semantics of NULL. As has been pointed out in many a database course, what we mean by NULL changes, and how we want to use NULL changes on circumstances. Normally, when I am comparing rows, I do want NULL <> NULL. In this specific instance, no value has been assigned to the specific column for this row, so NULL is appropriate. However, there are cases where I am trying to explicitely test for existence of a specific row in the table, and in this case, I _do_ want a NULL == NULL type of comparison. I could try and specify a dummy value (in this case, I could put in -1), but then I am trying to create a second class of NULLs, and this is usually not considered good design. Note that as a prime example of how postgresql itself is not "consistent" (in the strictest sense) is GROUP BY which treats NULL == NULL (interesting side bar, is there a way to cause GROUP BY to treat NULLs as not equal to each other?). In a theoretical question, how is this justified if NULL should not equal to NULL (other than "it is in the spec")? Also, is there a particular reason for not having a strict equality operator (or is it simply because it is not in the specification)? Performance? No support from the back-end? Something else? Regards, Ed On Wed, 15 Oct 2003, Tom Lane wrote: > Edmund Dengler <edmundd@eSentire.com> writes: > > ... I have no real choice in this as there is no way to specify that > > NULL == NULL. > > The conventional wisdom on this is that if you think you need NULL == > NULL to yield true, then you are misusing NULL, and you'd better > reconsider your data representation. The standard semantics for NULL > really do not support any other interpretation of NULL than "I don't > know what this value is". If you are trying to use NULL to mean > something else, you will face nothing but misery. Choose another > representation for whatever you do mean. > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
Edmund Dengler <edmundd@eSentire.com> writes: > Normally, when I am comparing rows, I do want NULL <> NULL. No, you still haven't got the point. NULL is not equal to NULL, and it is not not-equal-to NULL either. The result of the comparison is NULL, not true or false. This is consistent with the interpretation of NULL as "I don't know the value". If you don't know what the value is, you also don't know whether it is equal to some other value. > Note that as a prime example of how postgresql itself is not "consistent" > (in the strictest sense) is GROUP BY which treats NULL == NULL Shrug ... the standard tells us to do that. SQL has never been held up as a model of consistency. > Also, is there a particular reason for not having a strict equality > operator (or is it simply because it is not in the specification)? The existing operators *are* strict (which is defined as NULL in yields NULL out). You could build a set of non-strict comparison operators if you had a mind to. IIRC you would lose some potential hashtable optimizations, but in the main it would work. regards, tom lane
Tom Lane wrote: >Edmund Dengler <edmundd@eSentire.com> writes: > > >>Normally, when I am comparing rows, I do want NULL <> NULL. >> >> > >No, you still haven't got the point. NULL is not equal to NULL, and >it is not not-equal-to NULL either. The result of the comparison is >NULL, not true or false. This is consistent with the interpretation >of NULL as "I don't know the value". If you don't know what the value >is, you also don't know whether it is equal to some other value. > > In these cases, it is recommended to either find a value which is out of range, normally, and use that in place of NULL. For examples: -1 10^32-1 "." the_oldest_possible_date BC the_furthest_away_date AD Another way is to put an additional column in, but I think this still has problems if you are trying to get a query to return values in a column that has NULLs and you are querying against the column that has the NULLs.
On Wed, 15 Oct 2003, Edmund Dengler wrote: > Note that as a prime example of how postgresql itself is not "consistent" > (in the strictest sense) is GROUP BY which treats NULL == NULL > (interesting side bar, is there a way to cause GROUP BY to treat NULLs as > not equal to each other?). In a theoretical question, how is this > justified if NULL should not equal to NULL (other than "it is in the > spec")? Because it's not defined in terms of equality. ;) GROUP BY is defined by value "distinct"ness, where distinct has a very specific definition in the spec (which treats two NULL values as not distinct). You might actually be able to find some way to use that to your advantage, but I'm not sure how.