Thread: add, subtract bool type

add, subtract bool type

From
Jeff Davis
Date:
I made some simple functions that create the operators + and - for any
combination of int4's and bool's.

The reasoning that propted this operator was that I wanted to use the
following syntax in a query:
SELECT *,((col_a == 'foo') + (col_b == 'bar')) as relevance WHERE
relevance > 0;

The idea is that the attribute 'relevance' counts the number of matching
criteria. Such functionality would be most useful in a search engine,
but may have other uses.

My question is two-part:
1) Is this a sane approach?
2) Might enough other people find a use that I should make a
contribution somewhere (and would it be appropriate to send it to
pgsql-hackers or pgsql-patches)? Is there a chance it will get
incorporated into the main  source?

Regards,
Jeff Davis


Re: add, subtract bool type

From
Tom Lane
Date:
Jeff Davis <list-pgsql-general@dynworks.com> writes:
> I made some simple functions that create the operators + and - for any
> combination of int4's and bool's.

> My question is two-part:
> 1) Is this a sane approach?

It'd be less work if you made a bool->int conversion function and relied
on the regular integer operators.  Safer, too: this way is likely to
cause the system to accept queries that do things you didn't intend.

In fact, I'd personally want the conversion to not be applied by
default, which means you *shouldn't* call the conversion function int4().
Functions named after the destination type are assumed to represent
implicit type coercions, and will be applied without being mentioned
explicitly.  OTOH, if you don't care about type safety, you could
achieve the same result as your operator set with just such an implicit
coercion function:

test71=# select 't'::bool + 44;
ERROR:  Unable to identify an operator '+' for types 'bool' and 'int4'
        You will have to retype this query using an explicit cast

test71=# select 't'::bool + 't'::bool;
ERROR:  Unable to identify an operator '+' for types 'bool' and 'bool'
        You will have to retype this query using an explicit cast

test71=# create function int4(bool) returns int4 as '
test71'# select case when $1 then 1 else 0 end;' language 'sql';
CREATE

test71=# select 't'::bool + 44;
 ?column?
----------
       45
(1 row)

test71=# select 't'::bool + 't'::bool;
 ?column?
----------
        2
(1 row)

test71=#

But as I said, I'd prefer to name the function something else (maybe
"integerize") and have to invoke it explicitly.  I'm an old Pascal
programmer and believe strongly that bool and int ought not be
considered interchangeable: that masks too many programming errors.

> 2) Might enough other people find a use that I should make a
> contribution somewhere (and would it be appropriate to send it to
> pgsql-hackers or pgsql-patches)?

As you can see, there's not a lot to it, at least not for low-volume
applications.  If you expected to invoke integerize() zillions of
times, it might be worth the trouble to prepare a C-coded version of
it.  That would be worth contributing, since other people have asked
for this same functionality.  But then you'd have to get people to
agree on a name for the function, and believe me that'll be the
hardest part ;-)

            regards, tom lane

Re: add, subtract bool type

From
Peter Eisentraut
Date:
Jeff Davis writes:

> I made some simple functions that create the operators + and - for any
> combination of int4's and bool's.
>
> The reasoning that propted this operator was that I wanted to use the
> following syntax in a query:
> SELECT *,((col_a == 'foo') + (col_b == 'bar')) as relevance WHERE
> relevance > 0;

You could also try

case when col_a = 'foo' then 1 else 0 end + case when col_b = 'bar' then 1
else 0 end

which is the usual way to "cast" booleans.

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter


Re: add, subtract bool type

From
Jeff Davis
Date:
On Wednesday 22 August 2001 07:21 am, you wrote:
> Jeff Davis <list-pgsql-general@dynworks.com> writes:
> > I made some simple functions that create the operators + and - for any
> > combination of int4's and bool's.
> >
> > My question is two-part:
> > 1) Is this a sane approach?
>
> It'd be less work if you made a bool->int conversion function and relied
> on the regular integer operators.  Safer, too: this way is likely to
> cause the system to accept queries that do things you didn't intend.
>
> In fact, I'd personally want the conversion to not be applied by
> default, which means you *shouldn't* call the conversion function int4().
> Functions named after the destination type are assumed to represent
> implicit type coercions, and will be applied without being mentioned
> explicitly.  OTOH, if you don't care about type safety, you could
> achieve the same result as your operator set with just such an implicit
> coercion function:
>
> test71=# select 't'::bool + 44;
> ERROR:  Unable to identify an operator '+' for types 'bool' and 'int4'
>         You will have to retype this query using an explicit cast
>
> test71=# select 't'::bool + 't'::bool;
> ERROR:  Unable to identify an operator '+' for types 'bool' and 'bool'
>         You will have to retype this query using an explicit cast
>
> test71=# create function int4(bool) returns int4 as '
> test71'# select case when $1 then 1 else 0 end;' language 'sql';
> CREATE
>
> test71=# select 't'::bool + 44;
>  ?column?
> ----------
>        45
> (1 row)
>
> test71=# select 't'::bool + 't'::bool;
>  ?column?
> ----------
>         2
> (1 row)
>
> test71=#
>
> But as I said, I'd prefer to name the function something else (maybe
> "integerize") and have to invoke it explicitly.  I'm an old Pascal
> programmer and believe strongly that bool and int ought not be
> considered interchangeable: that masks too many programming errors.
>
> > 2) Might enough other people find a use that I should make a
> > contribution somewhere (and would it be appropriate to send it to
> > pgsql-hackers or pgsql-patches)?
>
> As you can see, there's not a lot to it, at least not for low-volume
> applications.  If you expected to invoke integerize() zillions of
> times, it might be worth the trouble to prepare a C-coded version of
> it.  That would be worth contributing, since other people have asked
> for this same functionality.  But then you'd have to get people to
> agree on a name for the function, and believe me that'll be the
> hardest part ;-)
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

I have come to the conclusion that such a C function is about the same speed
as an inlined SQL "CASE" clause, but both are much faster than a SQL
function. Basically, I ran a test in which I created 3 views:

create view test_view_sql_fcn as select
*,(as_int_sql(col_a='foo')+as_int_sql(col_b='bar')) as relevance FROM
test_table;

create view test_view_c as select *,(as_int(col_a='foo')+as_int(col_b='bar'))
as relevance FROM test_table;

create view test_view_sql as select *,(case when col_a = 'foo' then 1 else 0
end + case when col_b = 'bar' then 1
else 0 end) as relevance FROM test_table;

test_table(col_a varchar(50),col_b varchar(50)) is a table with 14044 tuples,
mostly repetitive (test data).

as_int is a C function:
int4 bool_as_int(bool b) { return (int4)(b); }

as_int_sql is a SQL function:
create function int4(bool) returns int4 as 'select case when $1 then 1 else 0
end;' language 'sql';

I ran my client (written in C) which iterated 100 times through 3 loops, each
time executing a query on one of those views:
SELECT * FROM <aforementioned view> WHERE relevance > 0;

so a total of 300 queries, 100 for each view, each loop of 100 queries in
it's own transaction.

results:
SQL function: 59 seconds
C function: 17 seconds
SQL inline: 17 seconds
Note that I was not aiming for super-accuracy; just for a general estimate.


So, I concluded:
1) using a function might be cleaner to write in some situations, although
views will ususally eliminate any strong need.
2) SQL function is out of the question if performance is a concern
3) I see little reason to not add the function, since it should be simple
enough not to cause problems/confusion.
4) I agree with Tom about implicit casting, I actually didn't know PostgreSQL
would do that (I assumed a "::int4" suffix would be required when using "+").
I named the function "as_int" for lack of a better term (I am not sure
"integerize" would flow as well from the keyboard). However, I really don't
care what the name is.

I hope this infomation helps someone.

Regards,
Jeff Davis

Re: add, subtract bool type

From
Tom Lane
Date:
Jeff Davis <list-pgsql-general@dynworks.com> writes:
> SQL function: 59 seconds
> C function: 17 seconds
> SQL inline: 17 seconds

Very interesting.  I've been thinking for awhile that it'd be nice if
simple-SELECT SQL functions could be inlined by the planner/optimizer.
This is good evidence that that's worth doing.

            regards, tom lane