Re: Index usage for BYTEA column in OR/IN clause - Mailing list pgsql-general
From | David Garamond |
---|---|
Subject | Re: Index usage for BYTEA column in OR/IN clause |
Date | |
Msg-id | 40671D3E.3040806@zara.6.isreserved.com Whole thread Raw |
In response to | Re: Index usage for BYTEA column in OR/IN clause (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Index usage for BYTEA column in OR/IN clause
|
List | pgsql-general |
Tom Lane wrote: > David Garamond <lists@zara.6.isreserved.com> writes: > >>The table contain +- 1 mil records, all of the actual version of the >>queries below return < 10 rows, so an index should be used. Using an >>index scan, the query ran < 100ms. Using seq scan, 2-3 secs. So there is >>no good reason why a seq scan should be used, especially in a case of >>b='foo' or b='bar'. > > [shrug...] We can't possibly diagnose a bad-plan-choice problem with > the amount of information you've provided. See > http://techdocs.postgresql.org/guides/SlowQueryPostingGuidelines Okay, my bad. How about this: 1. script to create the test table (a 1mil-record table; each record contains 1-40 random bytes): #!/usr/bin/ruby -rpostgres ROWS = 1_000_000 conn = PGconn.connect("/tmp", 5432, "", "", "...", "...", "...") conn.exec("CREATE TABLE t (b BYTEA NOT NULL)") conn.exec("CREATE INDEX i_t_b ON t(b)") (1..ROWS).each { |i| b = (1..rand(40)+1).collect{"\\\\"+rand(256).to_s(8).rjust(3,"0")} if i % 1000 == 1; conn.exec("BEGIN"); end conn.exec("INSERT INTO t VALUES ('#{b}')") if i % 1000 == 0; conn.exec("COMMIT"); puts "#{i}/1000000..."; end } 2. output of explain analyze: => explain analyze select * from t where b='aa'; QUERY PLAN ----------------------------------------------------------------------------------------------------------- Index Scan using i_t_v on t (cost=0.00..17.07 rows=6 width=32) (actual time=0.062..0.062 rows=0 loops=1) Index Cond: (b = 'aa'::bytea) Total runtime: 0.166 ms (3 rows) Time: 19.372 ms => explain analyze select * from t where b='aa' or b='ab'; QUERY PLAN ---------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..25.00 rows=10 width=32) (actual time=6857.151..6857.151 rows=0 loops=1) Filter: ((b = 'aa'::bytea) OR (b = 'ab'::bytea)) Total runtime: 6857.345 ms (3 rows) Time: 6864.526 ms => explain analyze select * from t where b like 'aa%'; QUERY PLAN ------------------------------------------------------------------------------------------------------------ Index Scan using i_t_v on t (cost=0.00..17.07 rows=6 width=32) (actual time=0.682..15.763 rows=2 loops=1) Index Cond: ((b >= 'aa'::bytea) AND (b < 'ab'::bytea)) Filter: (b ~~ 'aa%'::bytea) Total runtime: 15.935 ms (4 rows) Time: 29.432 ms => explain analyze select * from t where b like 'aa%' or b like 'ab%'; QUERY PLAN --------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..25.00 rows=10 width=32) (actual time=636.738..7239.460 rows=3 loops=1) Filter: ((b ~~ 'aa%'::bytea) OR (b ~~ 'ab%'::bytea)) Total runtime: 7239.758 ms (3 rows) Time: 7251.326 ms => explain analyze select * from t where b in ('aa','ab'); QUERY PLAN ---------------------------------------------------------------------------------------------------- Seq Scan on t (cost=0.00..25.00 rows=10 width=32) (actual time=7055.390..7055.390 rows=0 loops=1) Filter: ((b = 'aa'::bytea) OR (b = 'ab'::bytea)) Total runtime: 7055.574 ms (3 rows) Time: 7063.942 ms -- dave
pgsql-general by date: