Hello, Hackers
I recently discovered that some my tables with the functional indices
have the wrong pg_stats records.
wsdb=# \d q3c Table "public.q3c" Column | Type | Modifiers
--------+------------------+----------- ipix | bigint | ra | double precision | dec | double
precision|
Indexes: "ipix_idx" btree (ipix) "q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER
wsdb=# select * from pg_stats where tablename ~'q3c' and attname ~'pg'; schemaname | tablename | attname
| null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------------+-----------------+-----------+-----------+------------+-----------------------+-------------------+------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 | 1 | {6000} | {1}
| | 1
You see that the most_common_freqs=1 which is certainly not true.
Starting to dig into the problem and checking my C function I saw that when
my C function (returning bigint) use the return PointerGetDatum((&ipix_buf));
instead of PG_RETURN_INT64 ,the ANALYZE command produces the wrong statistics.
But I think that's wrong, isn't it ?
So the short toy example:
CREATE OR REPLACE FUNCTION q3c_ang2ipix(double precision, double precision) RETURNS bigint AS
'$libdir/q3c','pgq3c_ang2ipix' LANGUAGE C IMMUTABLE STRICT;
C function definition:
PG_FUNCTION_INFO_V1(pgq3c_ang2ipix);
Datum pgq3c_ang2ipix(PG_FUNCTION_ARGS)
{ static q3c_ipix_t ipix_buf; ipix_buf ++; elog(WARNING,"%lld",ipix_buf); return
PointerGetDatum((&ipix_buf));
}
And I have the table q3c
wsdb=# \d q3c Table "public.q3c" Column | Type | Modifiers
--------+------------------+----------- ipix | bigint | ra | double precision | dec | double
precision|
Indexes: "ipix_idx" btree (ipix) "q3c_func_idx" btree (q3c_ang2ipix(ra, "dec")) CLUSTER
Now I do
wsdb=# ANALYZE q3c;
WARNING: 1
WARNING: 2
WARNING: 3
WARNING: 4
....
WARNING: 2998
WARNING: 2999
WARNING: 3000
ANALYZE
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg'; schemaname | tablename | attname |
null_frac| avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 | 1 | {3000} | {1}
| | 1
(1 row)
So the values of most_common_vals, most_common_freqs are wrong
But if I replace the return PointerGetDatum((&ipix_buf));
by the PG_RETURN_INT64(ipix_buf)
the analyze works fine
wsdb=# select * from pg_stats where tablename ~'q3c_f' and attname ~'pg'; schemaname | tablename | attname |
null_frac| avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds
| correlation
------------+--------------+-----------------+-----------+-----------+------------+------------------+-------------------+----------------------------------------------------+-------------
public | q3c_func_idx | pg_expression_1 | 0 | 8 | -1 | |
|{1,300,600,900,1200,1500,1800,2100,2400,2700,3000} | 1
(1 row)
Is it incorrect to use PointerGetDatum in those circumstances ?
Because I always used it in that function to return bigints (to not have
the additional overhead of PG_RETURN_INT64), and that's the first time a
see the bug due to that.
Thank you in advance,
Regards, Sergey
*******************************************************************
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: math@sai.msu.ru