Thread: PG_RETURN_INT64 vs PointerGetDatum & ANALYZE

PG_RETURN_INT64 vs PointerGetDatum & ANALYZE

From
"Sergey E. Koposov"
Date:
Hello, Hackers

I recently discovered that some my tables with the functional indices
have the wrong pg_stats records.

wsdb=# \d q3c          Table "public.q3c" Column |       Type       | Modifiers 
--------+------------------+----------- ipix   | bigint           | ra     | double precision | dec    | double
precision| 
 
Indexes:    "ipix_idx" btree (ipix)    "q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER

wsdb=# select * from pg_stats where tablename ~'q3c' and attname ~'pg'; schemaname |     tablename      |     attname
 | null_frac | avg_width | n_distinct |   most_common_vals    | most_common_freqs | histogram_bounds | correlation 
 

------------+--------------------+-----------------+-----------+-----------+------------+-----------------------+-------------------+------------------+-------------
public    | q3c_func_idx       | pg_expression_1 |         0 |         8 |          1 | {6000}                | {1}
         |                  |           1
 

You see that the most_common_freqs=1 which is certainly not true.

Starting to dig into the problem and checking my C function  I saw that when 
my C function (returning bigint) use the  return PointerGetDatum((&ipix_buf));
instead of PG_RETURN_INT64 ,the ANALYZE command produces the wrong statistics. 
But I think that's wrong, isn't it ?

So the short toy example:

CREATE OR REPLACE FUNCTION q3c_ang2ipix(double precision, double precision)        RETURNS bigint        AS
'$libdir/q3c','pgq3c_ang2ipix'        LANGUAGE C IMMUTABLE STRICT;
 

C function definition:

PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
{    static q3c_ipix_t ipix_buf;    ipix_buf ++;    elog(WARNING,"%lld",ipix_buf);    return
PointerGetDatum((&ipix_buf));
}

And I have the table q3c 
wsdb=# \d q3c          Table "public.q3c" Column |       Type       | Modifiers 
--------+------------------+----------- ipix   | bigint           | ra     | double precision | dec    | double
precision| 
 
Indexes:    "ipix_idx" btree (ipix)    "q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER


Now I do 
wsdb=# ANALYZE q3c;
WARNING:  1
WARNING:  2
WARNING:  3
WARNING:  4
....
WARNING:  2998
WARNING:  2999
WARNING:  3000
ANALYZE
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg'; schemaname |  tablename   |     attname     |
null_frac| avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation 
 

------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
public    | q3c_func_idx | pg_expression_1 |         0 |         8 |          1 | {3000}           | {1}
|                 |           1
 
(1 row)

So the values of most_common_vals, most_common_freqs are wrong

But if I replace the     return PointerGetDatum((&ipix_buf));
by the  PG_RETURN_INT64(ipix_buf)
the analyze works fine

wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg'; schemaname |  tablename   |     attname     |
null_frac| avg_width | n_distinct | most_common_vals | most_common_freqs |                  histogram_bounds
     | correlation 
 

------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+----------------------------------------------------+-------------
public    | q3c_func_idx | pg_expression_1 |         0 |         8 |         -1 |                  |
|{1,300,600,900,1200,1500,1800,2100,2400,2700,3000} |           1
 
(1 row)

Is it incorrect to use PointerGetDatum in those circumstances ?
Because I  always used it in that function to return bigints (to not have 
the additional overhead of PG_RETURN_INT64), and that's  the first time a 
see the bug due to that.

Thank you in advance,

Regards,    Sergey

*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: math@sai.msu.ru


Re: PG_RETURN_INT64 vs PointerGetDatum & ANALYZE

From
Tom Lane
Date:
"Sergey E. Koposov" <math@sai.msu.ru> writes:
> C function definition:

> PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
> Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
> {
>      static q3c_ipix_t ipix_buf;
>      ipix_buf ++;
>      elog(WARNING,"%lld",ipix_buf);
>      return PointerGetDatum((&ipix_buf));
> }

This code is wrong on its face: it can't support multiple calls to the
function within a single query, because each call will damage the
previous call's result.  Try something like
select pgq3c_ang2ipix(...), pgq3c_ang2ipix(...) from ...

and you'll get bizarre behavior.

You need to return a palloc'd result, rather than returning pointers to
the same static variable on successive calls.
        regards, tom lane