Query Planner Issue - View and regexp_replace? - Mailing list pgsql-novice
From | Brandon Kane |
---|---|
Subject | Query Planner Issue - View and regexp_replace? |
Date | |
Msg-id | 792368.9791.qm@web51703.mail.re2.yahoo.com Whole thread Raw |
Responses |
Re: Query Planner Issue - View and regexp_replace?
(Tom Lane <tgl@sss.pgh.pa.us>)
|
List | pgsql-novice |
I'm not entirely sure where to ask this question, or what to include, so I'm hoping folks here can help me out with that. Basically, I have a query that joins a table to a view. The view contains a CASE statement with a regexp_replace call. The query plan used involves a sequential scan on the table used in the view. However, when I run directly against the table, or include the view's SQL in my query directly, I (correctly) get a nested loop scan with index scans. When I remove the regexp_replace clause from my CASE statement in the view, then the original query begins to use the "correct" query plan. I'm able to reproduce this behavior with simple test tables and absolutely no test data loaded, so I'm including the simple reproduction case below, with all the SQL. Any help in directing this into the correct format and to the correct place is appreciated. Thanks, -Brandon -- Server Version String -- PostgreSQL 8.2.5 on i386-redhat-linux-gnu, -- compiled by GCC gcc (GCC) 4.1.2 20070502 (Red Hat 4.1.2-12) -- -- Main Table -- CREATE TABLE test_table ( id character(12) NOT NULL, data character varying(20), CONSTRAINT "PK_test_table" PRIMARY KEY (id) ) WITH (OIDS=FALSE); ALTER TABLE test_table OWNER TO postgres; -- -- Join Table - Filtering done here, then joined to main table -- CREATE TABLE test_join_table ( joinid integer NOT NULL, refid character(12) NOT NULL, data integer NOT NULL, CONSTRAINT "PK_join_test" PRIMARY KEY (joinid), CONSTRAINT "FK_join_test_table" FOREIGN KEY (refid) REFERENCES test_table (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION ) WITH (OIDS=FALSE); ALTER TABLE test_join_table OWNER TO postgres; CREATE INDEX "IDX_join_data" ON test_join_table USING btree (data); -- -- View on Main Table. Used in join -- CREATE OR REPLACE VIEW test_view AS SELECT test_table.id, test_table.data, CASE WHEN btrim(test_table.data::text) = ''::text THEN '0'::text::bpchar::character varying -- Removing this condition from CASE statement in view corrects query planner problem. WHEN NOT isnumeric(test_table.data::text) THEN regexp_replace( test_table.data::text, '[^0-9]'::text, ''::text, 'g'::text)::bpchar::character varying -- ELSE test_table.data END AS datafixed FROM test_table; ALTER TABLE test_view OWNER TO postgres; -- -- Query #1 - This returns the correct query plan. View is not involved. -- select tt.* FROM test_table tt INNER JOIN test_join_table jt ON tt.id = jt.refid AND jt.joinid = 3 AND jt.data = 4 -- -- Query #2 - This is the problem query. It fails to return a plan involving a nested loop -- with the two tables. If the view is modified as mentioned above, this does return the -- correct query plan. -- select tt.* FROM test_view tt INNER JOIN test_join_table jt ON tt.id = jt.refid AND jt.joinid = 3 AND jt.data = 4 -- -- Query #3 - Interestingly, this returns the correct query plan. I've taken the view SQL, and -- merged it into the larger query, and it seems to return the correct query plan, even with -- the problematic line in the CASE statement. -- SELECT tt.id, tt.data, CASE WHEN btrim(tt.data::text) = ''::text THEN '0'::text::bpchar::character varying WHEN NOT isnumeric(tt.data::text) THEN regexp_replace( tt.data::text, '[^0-9]'::text, ''::text, 'g'::text)::bpchar::character varying ELSE tt.data END AS datafixed FROM test_table tt INNER JOIN test_join_table jt ON tt.id = jt.refid AND jt.joinid = 3 AND jt.data = 4 -- This query plan discrepancy occurs with absolutely no data loaded in the tables, as well as in -- the actual tables with actual data loaded. This suggests that the problem is not related to -- statistics. Since the field being operated in the case statement is not an index/search or -- join column, I can't figure out why it affects the query plan. I can't seem to find anything -- on this through Google. -- I expect that the query planner will return a nested loop scan with the join table being on -- the outer loop. In my practical scenario, this is the most efficient query, since the join -- table filtering will remove most of the rows. What I find in query #2 is that instead a -- hash/hash join is performed, which does a sequential scan on the main table. This is what -- I'm trying to avoid. ____________________________________________________________________________________ Never miss a thing. Make Yahoo your home page. http://www.yahoo.com/r/hs
pgsql-novice by date: