Thread: [GENERAL] Planner statistics usage for composite type
Hello. I am trying to find workaround for cross-column statistics. For example, I have tags with similarity: select version(); version ------------------------------------------------------------------------------------------------------------------- PostgreSQL 10beta2 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609, 64-bit CREATE TABLE tags( t varchar(30) NOT NULL, s double precision ); INSERT INTO tags (SELECT 'tag1'::text as t, 0.7 as s from generate_series(0, 10000)); I think i can create index for cross-column statistics: CREATE TYPE tag_sim AS ( t varchar(30), s double precision ); CREATE INDEX tags_composite ON tags USING btree ((ROW(t, s)::tag_sim)); ANALYZE tags; SELECT * FROM pg_stats WHERE tablename = 'tags_composite'; -[ RECORD 1 ]----------+--------------- schemaname | public tablename | tags_composite attname | row inherited | f null_frac | 0 avg_width | 40 n_distinct | 1 most_common_vals | {"(tag1,0.7)"} most_common_freqs | {1} histogram_bounds | (null) correlation | 1 most_common_elems | (null) most_common_elem_freqs | (null) elem_count_histogram | (null) OK, I have statistics. Search: EXPLAIN SELECT * FROM tags WHERE ROW(t, s)::tag_sim >= ROW('tag1', 0.9)::tag_sim AND ROW(t, s)::tag_sim <= ROW('tag1', 1.0)::tag_sim; QUERY PLAN ------------------------------------------------------------------------------------------------------- Index Scan using tags_composite on tags (cost=0.29..9.29 rows=50 width=13) Index Cond: ((ROW(t, s)::tag_sim >= ROW('tag1', 0.9)) AND (ROW(t, s)::tag_sim <= ROW('tag1', 1.0))) (2 rows) Worn estimate. Planner doesn't use statistics. In code I see usage of function scalargtsel which returns default selectivity because ROW('tag1', 0.9)::tag_sim is not Const. May be someone known how to fix this issue? Thanks.
Dmitry Lazurkin <dilaz03@gmail.com> writes: > I am trying to find workaround for cross-column statistics. > ... > Worn estimate. Planner doesn't use statistics. In code I see usage of > function scalargtsel which returns default selectivity because > ROW('tag1', 0.9)::tag_sim is not Const. Yeah, that's because eval_const_expressions doesn't know how to fold a constant RowExpr to a simple Const. I have a patch laying about someplace to improve that, but I keep forgetting about it until we're in beta phase :-(. In the meantime you could perhaps do the folding by hand: regression=# EXPLAIN ANALYZE SELECT * FROM tags WHERE ROW(t, s)::tag_sim >= '(tag1,0.9)'::tag_sim AND ROW(t, s)::tag_sim <= '(tag1,1.0)'::tag_sim; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------- Index Scan using tags_composite on tags (cost=0.29..8.30 rows=1 width=13) (actual time=0.051..0.051 rows=0 loops=1) Index Cond: ((ROW(t, s)::tag_sim >= '(tag1,0.9)'::tag_sim) AND (ROW(t, s)::tag_sim <= '(tag1,1)'::tag_sim)) Planning time: 0.230 ms Execution time: 0.110 ms (4 rows) regards, tom lane
Tom, * Tom Lane (tgl@sss.pgh.pa.us) wrote: > Dmitry Lazurkin <dilaz03@gmail.com> writes: > > I am trying to find workaround for cross-column statistics. > > ... > > Worn estimate. Planner doesn't use statistics. In code I see usage of > > function scalargtsel which returns default selectivity because > > ROW('tag1', 0.9)::tag_sim is not Const. > > Yeah, that's because eval_const_expressions doesn't know how to fold > a constant RowExpr to a simple Const. I have a patch laying about > someplace to improve that, but I keep forgetting about it until > we're in beta phase :-(. In the meantime you could perhaps do the > folding by hand: Put it into the CF..? Thanks! Stephen
Attachment
Stephen Frost <sfrost@snowman.net> writes: > * Tom Lane (tgl@sss.pgh.pa.us) wrote: >> Yeah, that's because eval_const_expressions doesn't know how to fold >> a constant RowExpr to a simple Const. I have a patch laying about >> someplace to improve that, but I keep forgetting about it until >> we're in beta phase :-(. In the meantime you could perhaps do the >> folding by hand: > Put it into the CF..? Actually, I did already: https://commitfest.postgresql.org/14/1136/ regards, tom lane
On 07/19/2017 06:37 PM, Tom Lane wrote: > Dmitry Lazurkin <dilaz03@gmail.com> writes: >> I am trying to find workaround for cross-column statistics. >> ... >> Worn estimate. Planner doesn't use statistics. In code I see usage of >> function scalargtsel which returns default selectivity because >> ROW('tag1', 0.9)::tag_sim is not Const. > regression=# EXPLAIN ANALYZE SELECT * FROM tags WHERE ROW(t, s)::tag_sim >= '(tag1,0.9)'::tag_sim AND > ROW(t, s)::tag_sim <= '(tag1,1.0)'::tag_sim; > QUERY PLAN > ---------------------------------------------------------------------------------------------------------------------- > Index Scan using tags_composite on tags (cost=0.29..8.30 rows=1 width=13) (actual time=0.051..0.051 rows=0 loops=1) > Index Cond: ((ROW(t, s)::tag_sim >= '(tag1,0.9)'::tag_sim) AND (ROW(t, s)::tag_sim <= '(tag1,1)'::tag_sim)) > Planning time: 0.230 ms > Execution time: 0.110 ms > (4 rows) Thank you. (: