Re: postgres slower than grep? - Mailing list pgsql-general
From | Allan Engelhardt
Subject | Re: postgres slower than grep?
Date |
Msg-id | 3B4A2AAE.87C480BF@cybaea.com
List | pgsql-general
I could only be bothered to try a million lines, but my results from 7.1.2 are below. Basically:

1. I find about 50% database storage overhead in this case. That's not completely silly, considering this is structured data, but it seems a little high. I don't know the internal structures well enough to really see what's happening.

2. Why would it be faster than grep? It has to match structured data, in this case varchar, not just bytes, and it has to worry about transactions and logs, not just a stream of data. Besides, in my tests it is not *that* slow (3 sec, compared with 1/2 sec for grep). Dunno what's up with your system.

3. As you said: with an index it rocks, easily beating grep. Use an index - it's your friend :-)

Allan.

[playpen]$ wc -l /tmp/input.txt
1000001 /tmp/input.txt
[playpen]$ ls -lh /tmp/input.txt
-rw-rw-r--   1 allane   allane   124M Jul  9 22:12 input.txt

[playpen]$ psql -f create_table.sql test
CREATE

[playpen]$ time psql -c "COPY clients FROM '/tmp/input.txt';" test
COPY

real    1m26.543s
user    0m0.000s
sys     0m0.010s

[playpen]$ du -sh $PGDATA/base/963549/
179M    /var/lib/pgsql/data/base/963549

[playpen]$ time psql -c 'SELECT * FROM clients WHERE tel=1234567;' test
 clientid | firstname | surname | area | neigh | dimos | tel
----------+-----------+---------+------+-------+-------+-----
(0 rows)

real    0m24.281s
user    0m0.010s
sys     0m0.000s

[playpen]$ vacuumdb -a; vacuumdb -a --analyze
...

[playpen]$ time psql -c 'SELECT * FROM clients WHERE tel=1234567;' test
 clientid | firstname | surname | area | neigh | dimos | tel
----------+-----------+---------+------+-------+-------+-----
(0 rows)

real    0m3.808s
user    0m0.010s
sys     0m0.000s

[playpen]$ time grep '1234567' /tmp/input.txt
...

real    0m0.505s
user    0m0.150s
sys     0m0.360s

[playpen]$ dropdb test
DROP DATABASE
[playpen]$ createdb -E LATIN1 test
CREATE DATABASE
[playpen]$ psql -f create_table.sql test
CREATE

[playpen]$ time psql -c "COPY clients FROM '/tmp/input.txt';" test
COPY

real    1m23.927s
user    0m0.010s
sys     0m0.000s

[playpen]$ time psql -c 'SELECT * FROM clients WHERE tel=1234567;' test
 clientid | firstname | surname | area | neigh | dimos | tel
----------+-----------+---------+------+-------+-------+-----
(0 rows)

real    0m23.934s
user    0m0.010s
sys     0m0.010s

[playpen]$ vacuumdb -a; vacuumdb -a --analyze
...

[playpen]$ time psql -c 'SELECT * FROM clients WHERE tel=1234567;' test
 clientid | firstname | surname | area | neigh | dimos | tel
----------+-----------+---------+------+-------+-------+-----
(0 rows)

real    0m3.796s
user    0m0.010s
sys     0m0.000s

[playpen]$ psql -c 'CREATE INDEX clients_idx_tel ON clients (tel);' test
CREATE
[playpen]$ vacuumdb --analyze test
VACUUM

[playpen]$ time psql -c 'SELECT * FROM clients WHERE tel=1234567;' test
 clientid | firstname | surname | area | neigh | dimos | tel
----------+-----------+---------+------+-------+-------+-----
(0 rows)

real    0m0.189s
user    0m0.010s
sys     0m0.000s
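(The create_table.sql script is not included in the post. A plausible reconstruction, with column names taken from the SELECT output above and types guessed from the schema quoted below, might look like this; the unquoted tel=1234567 in the transcript suggests tel was declared numeric in this test.)

-- Hypothetical reconstruction of create_table.sql; not part of the original post.
-- Column names match the SELECT output above; types are guesses from the quoted
-- schema, except tel, which the unquoted comparison suggests is an integer.
CREATE TABLE clients (
    clientid  integer,
    firstname varchar(5),
    surname   varchar(22),
    area      varchar(3),
    neigh     varchar(27),
    dimos     varchar(50),
    tel       integer
);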
Spiros Ioannou wrote:

> Hello all,
>
> before you start reading, have in mind that this is not a post to advertise
> one db over another, I just want to know what happens, that's why I did
> the comparisons with other db.
> to the point:
>
> This is my table:
>
>  Table "table1"
>  Attribute |    Type     | Modifier
> -----------+-------------+----------
>  ClientID  | integer     |
>  firstname | varchar(5)  |
>  surname   | varchar(22) |
>  area      | varchar(3)  |
>  neigh     | varchar(27) |
>  dimos     | varchar(50) |
>  tel       | varchar(7)  |
>
> The Facts:
> ----------
> The table exported to a text file is about 330MB, and contains about 5
> million lines.
>
> The same table takes 670MB in postgres. Why?
> (I tried mysql, and it uses 340MB).
>
> issuing the following query:
> select * from table1 where tel='7485842';
> takes about 1min and 40 seconds to complete. (query has one result)
>
> explain of the above returns:
> Seq Scan on table1 (cost=0.00..147835.01 rows=23 width=76)
>
> issuing a grep in the text file takes about 25 seconds! *(for the whole grep
> to finish - worst case)*
>
> issuing the same query in mysql takes about 25 seconds.
>
> -to test the disk speed, I doubled the size of the text file by copying
> it twice and the same grep query took 51 seconds (as expected)
>
> when creating an index the query completes of course in no time
>
> yes, I tried vacuum
>
> Postgres is version 7.0
> system is solaris 2.7
> hard disk is 10000rpm, ultraSCSI
> cpu is UltraSparcIIi 333MHz
> physical memory is 384MB
>
> and now the questions
> ---------------------
> -shouldn't postgres be at least as fast as the grep?
> -shouldn't the table data file be smaller to reduce disk activity? Why is it
> double the size of the same mysql data file or the text file? I also noticed that
> an index file for column "tel" is about 130MB, while mysql's was 64MB -
> is this normal?
>
> Thank you for your time,
>
> Spiros Ioannou
> e-mail: sivann@kill-9.gr
> ---------------------------------------
> Image Video & Multimedia Systems Lab.
> Department of Electrical & Computer Eng.
> National Technical University of Athens
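(For the quoted table1 schema, the advice above boils down to something like the sketch below; the index name is illustrative, not taken from the post. After analyzing, EXPLAIN should report an Index Scan instead of the Seq Scan quoted above.)

-- Sketch of the suggested fix for the quoted table1 schema (index name is made up).
CREATE INDEX table1_tel_idx ON table1 (tel);
VACUUM ANALYZE table1;
-- Should now show an Index Scan rather than a Seq Scan.
EXPLAIN SELECT * FROM table1 WHERE tel = '7485842';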