Poor performance on seq scan - Mailing list pgsql-performance

From Laszlo Nagy
Subject Poor performance on seq scan
Date
Msg-id 450684EC.4090304@designaproduct.biz
Whole thread Raw
Responses Re: Poor performance on seq scan  (Heikki Linnakangas <heikki@enterprisedb.com>)
Re: Poor performance on seq scan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
  Hello,

I have a big table called products. Table size: 1123MB. Toast table
size: 32MB. Indexes size: 380MB.
I try to do a query like this:

select id,name from products where name like '%Mug%';

Yes, I know that tsearch2 is better for this, but please read on. The
above query gives this plan:

Seq Scan on product  (cost=0.00..153489.52 rows=31390 width=40)
  Filter: (name ~~ '%Mug%'::text)

When I use this with explain analyze:

"Seq Scan on product  (cost=0.00..153489.52 rows=31390 width=40) (actual
time=878.873..38300.588 rows=72567 loops=1)"
"  Filter: (name ~~ '%Mug%'::text)"
"Total runtime: 38339.026 ms"

Meanwhile, "iostat 5" gives something like this:

 tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   1   14 128.00   1  0.10  128.00   1  0.10   5  0 94  1  0
   0   12 123.98 104 12.56  123.74 104 12.56   8  0 90  2  0
   0   12 125.66 128 15.75  125.26 128 15.68  10  0 85  6  0
   0   12 124.66 129 15.67  124.39 129 15.64  12  0 85  3  0
   0   12 117.13 121 13.87  117.95 121 13.96  12  0 84  5  0
   0   12 104.84 118 12.05  105.84 118 12.19  10  0 87  2  0

130 transfers per second with 12-15MB/sec transfer speed. (FreeBSD 6.1
with two STATA150 drives in gmirror RAID1)

I made another test. I create a file with the identifiers and names of
the products:

psql#\o products.txt
psql#select id,name from product;

Then I can search using grep:

grep "Mug" products.txt | cut -f1 -d\|

There is a huge difference. This command runs within 0.5 seconds. That
is, at least 76 times faster than the seq scan. It is the same if I
vacuum, backup and restore the database. I thought that the table is
stored in one file, and the seq scan will be actually faster than
grepping the file. Can you please tell me what am I doing wrong? I'm not
sure if I can increase the performance of a seq scan by adjusting the
values in postgresql.conf. I do not like the idea of exporting the
product table periodically into a txt file, and search with grep. :-)

Another question: I have a btree index on product(name). It contains all
product names and the identifiers of the products. Wouldn't it be easier
to seq scan the index instead of seq scan the table? The index is only
66MB, the table is 1123MB.

I'm new to this list and also I just recently started to tune postgresql
so please forgive me if this is a dumb question.

Regards,

   Laszlo

pgsql-performance by date:

Previous
From: Piñeiro
Date:
Subject: Re: Performance problem with Sarge compared with Woody
Next
From: Heikki Linnakangas
Date:
Subject: Re: Poor performance on seq scan