Thread: Postgres is too slow?

Postgres is too slow?

From
Paul Mamin
Date:
I created the same table - 0.5M records, ~200 bytes each - and decided to
compare its speed under Postgres 7.1.2 on Linux (kernel 2.4.1) and under MS
SQL 7.0 (SP3) on Win2K Server, on the same PC: PIII/500 MHz, 128 MB RAM.
Under W2K the OS took about 64 MB, while Linux used almost
nothing, since X Windows wasn't running. The raw speed of data exchange with
the HDD was almost the same under Linux and Win2K.

Under Postgres I filled this table with the COPY FROM command.

Then I ran a SELECT that scans the whole table:
-----
select field1, sum(field2-field3)/count(*)
from table1
group by field1;
-----
1) Its speed hardly depends on whether there is an index on field1.
2) I ran VACUUM ANALYZE first.

And ... MSSQL 7.0 ran 2-2.5 times faster than Postgres :((

Why so?

--
Best regards,
 Paul                          mailto:magamos@mail.ru



Re: Postgres is too slow?

From
"Richard Huxton"
Date:
From: "Paul Mamin" <magamos@mail.ru>

> Under Postgres I filled this table with the COPY FROM command.
>
> Then I ran a SELECT that scans the whole table:
> -----
> select field1, sum(field2-field3)/count(*)
> from table1
> group by field1;
> -----
> 1) Its speed hardly depends on whether there is an index on field1.

Since you'll be accessing all the records anyway, I'm not sure use of an
index would make sense. Try EXPLAIN SELECT ... to see what PostgreSQL thinks
is happening.

> 2) I ran VACUUM ANALYZE first.

Well - PG knows the shape of the tables then.

> And ... MSSQL 7.0 ran 2-2.5 times faster than Postgres :((

Can't really say - could be that count(*) isn't being cached, or MSSQL may
just be more efficient at this query. Bear in mind that MS may not be popular
with everyone, but their developers aren't idiots.

Post the output of the EXPLAIN and we'll see if PG is making any odd
assumptions.

- Richard Huxton


Re[2]: Postgres is too slow?

From
Paul Mamin
Date:
The details of the SQL query that runs too slowly on Postgres
follow.

>> Under Postgres I filled this table with the COPY FROM command.
...
>> And ... MSSQL 7.0 ran 2-2.5 times faster than Postgres :((

RH> Post the output of the EXPLAIN and we'll see if PG is making any odd
RH> assumptions.

THE CREATE TABLE DEFINITION (about 200 bytes per record):
----------------------------------------------------------------
CREATE TABLE CallBase (
        f28 smallint NULL ,
        NumberID int NULL ,
        f29 varchar (18) NULL ,
        f30 varchar (18) NULL ,
        f10 int NULL ,
        f11 smallint NULL ,
        f12 smallint NULL ,
        f13 smallint NULL ,
        f14 smallint NULL ,
        f31 datetime NULL ,
        CallDuration int NULL ,
        f32 int NULL ,
        f33 float8 NULL ,
        f34 float8 NULL ,
        f35 float8 NULL ,
        f36 float8 NULL ,
        TarifDurationAir int NULL ,
        f15 int NULL ,
        f16 int NULL ,
        f17 int NULL ,
        f18 int NULL ,
        f19 real NULL ,
        f20 real NULL ,
        f21 real NULL ,
        f22 real NULL ,
        f23 smallint NULL ,
        f24 datetime NULL ,
        f25 int NULL ,
        f26 int NULL ,
        f27 int NULL ,
        f37 float8 NULL ,
        int1 int NULL ,
        int2 smallint NULL ,
        int3 smallint NULL ,
        bool1 bool NOT NULL ,
        bool2 bool NOT NULL ,
        bool3 bool NOT NULL ,
        bool4 bool NOT NULL ,
        bool5 bool NOT NULL ,
        bool6 bool NOT NULL ,
        bool7 bool NOT NULL ,
        bool8 bool NOT NULL ,
        f38 int NULL ,
        f39 varchar (2) NULL ,
        f40 varchar (2) NULL ,
        f41 varchar (2) NULL ,
        f42 int NOT NULL ,
        f43 int NOT NULL ,
        f44 smallint NOT NULL
);
----------------------------------------------------------------


I filled this table with the COPY FROM command - 500K records exported
from the same table in MS SQL 7.0 (SP3) - and created an index on the field
"numberid".

postmaster was started with the option -B 8000 (~64 Kb for buffers),
and the output of the Linux "top" command during the query looks OK:
----------------------------------------------------------------
  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
  253 sql       16   0 68320  66M 66424 R       0 80.0 53.4   0:33 postmaster
----------------------------------------------------------------

the result of the SELECT COUNT(*) query:
----------------------------------------------------------------
Billing=# select now(); select count(*) from callbase; select now();
          now
------------------------
 2001-06-21 16:52:02+06
(1 row)

 count
--------
 500000
(1 row)

          now
------------------------
 2001-06-21 16:52:44+06
(1 row)
----------------------------------------------------------------
Note: that's far, far too long - 32 seconds :(


the result of the SELECT SUM(x) query:
----------------------------------------------------------------
Billing=# select now(); select sum(CallDuration) from callbase; select now();
          now
------------------------
 2001-06-21 17:11:09+06
(1 row)

   sum
----------
 26249970
(1 row)

          now
------------------------
 2001-06-21 17:11:59+06
(1 row)
----------------------------------------------------------------
Note: that's also too long - 50 seconds


the result of the SELECT SUM(x-y) query:
----------------------------------------------------------------
Billing=# select now(); select sum(TarifDurationAir-CallDuration) from callbase; select now();
          now
------------------------
 2001-06-21 17:13:36+06
(1 row)

   sum
----------
 12318973
(1 row)

          now
------------------------
 2001-06-21 17:14:28+06
(1 row)
----------------------------------------------------------------
Note: 52 seconds


The SQL query I actually need:
----------------------------------------------------------------
select numberid, sum(TarifDurationAir-CallDuration)/count(*)
from callbase
group by numberid;
----------------------------------------------------------------


The result of EXPLAIN on this query (run after the query itself and VACUUM
ANALYZE):
----------------------------------------------------------------
NOTICE:  QUERY PLAN:

Aggregate  (cost=85493.92..89243.92 rows=50000 width=12)
  ->  Group  (cost=85493.92..86743.92 rows=500000 width=12)
        ->  Sort  (cost=85493.92..85493.92 rows=500000 width=12)
              ->  Seq Scan on callbase  (cost=0.00..20185.00 rows=500000 width=12)

EXPLAIN
----------------------------------------------------------------


The result of the previous SQL query (with SELECT NOW() before and after
it):
----------------------------------------------------------------
          now
------------------------
 2001-06-21 16:59:05+06
(1 row)

 numberid |   ?column?
----------+---------------
       56 | 19.7777777778
 ........................
    10676 | 27.5357142857
(3562 rows)

          now
------------------------
 2001-06-21 17:00:58+06
(1 row)
----------------------------------------------------------------
Note: far too long - 113 seconds.


MS SQL 7.0 ran the same query in 24-25 seconds, which makes Postgres 4.5
times slower :(((

--
Best regards,
 Paul                            mailto:magamos@mail.ru



Re: Postgres is too slow?

From
Richard Huxton
Date:
Paul Mamin wrote:
>
[table definition snipped]

> I fill this table by COPY FROM command with 500K records, exported
> from the same table from MS SQL 7.0 (SP3), and created index on field
> "numberid".
>
> postmaster was started with the option -B 8000 (~64 Kb for buffers),
> and the output of the Linux "top" command during the query looks OK:
> ----------------------------------------------------------------
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
>   253 sql       16   0 68320  66M 66424 R       0 80.0 53.4   0:33 postmaster
> ----------------------------------------------------------------

This is 64MB not 64kB - look at the "top" output. Probably too big for a
128MB machine - you're limiting Linux's ability to manage your memory.
I'm assuming you're not planning to have thousands of clients connecting simultaneously.
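As a quick back-of-the-envelope check (assuming PostgreSQL's default 8kB block size; -B counts buffers, not kilobytes):

```shell
# -B sets the NUMBER of shared buffers; each buffer is one 8kB disk block.
buffers=8000
block_kb=8
echo "-B ${buffers} allocates $((buffers * block_kb)) kB of shared memory"
# prints 64000 kB, i.e. ~64MB - half the RAM on a 128MB machine
```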

> the result of SELECT COUNT(*) request:
> ----------------------------------------------------------------
> Billing=# select now(); select count(*) from callbase; select now();
>           now
> ------------------------
>  2001-06-21 16:52:02+06
> (1 row)
>
>  count
> --------
>  500000
> (1 row)
>
>           now
> ------------------------
>  2001-06-21 16:52:44+06
> (1 row)
> ----------------------------------------------------------------
> Note: that's far, far too long - 32 seconds :(

Are you getting a lot of disk activity with this?

> The SQL command I need to request:
> ----------------------------------------------------------------
> select numberid, sum(TarifDurationAir-CallDuration)/count(*)
> from callbase
> group by numberid;
> ----------------------------------------------------------------
>
> The result of EXPLAIN on this request (after this request and VACUUM
> ANALYZE):
> ----------------------------------------------------------------
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=85493.92..89243.92 rows=50000 width=12)
>   ->  Group  (cost=85493.92..86743.92 rows=500000 width=12)
>         ->  Sort  (cost=85493.92..85493.92 rows=500000 width=12)
>               ->  Seq Scan on callbase  (cost=0.00..20185.00 rows=500000 width=12)

OK - a sequential scan, which is what you'd expect since you need to access
all the entries anyway. The cost estimates are the same for the
aggregate, group and sort.

I'm no expert on performance tuning, but I'd refer you to ch 3.4 / 3.5
of the Administrator's manual - runtime config and kernel resources. I'd
leave the -B option alone and rerun - I'd guess performance won't get
any worse. Then try increasing the sort memory and/or buffers gradually
until you get a feel for how these affect the system.
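For reference, a sketch of where those knobs lived in 7.1; the values below are illustrative guesses to start from, not recommendations:

```shell
# Hypothetical starting point - raise the numbers gradually and re-time.
# Either edit $PGDATA/postgresql.conf (new in 7.1):
#   shared_buffers = 1000    # 1000 x 8kB = ~8MB; the equivalent of -B 1000
#   sort_mem = 4096          # kB each sort may use before spilling to disk
# or pass the same settings when starting the server (-S goes to the backends):
postmaster -D /sql/data -B 1000 -o '-S 4096'
```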

I'd use "vmstat" to monitor the system while the query is running. You
shouldn't be seeing a lot of disk activity.

If nothing leaps out at you, and you can get me details of the
distribution of values for numberid by lunchtime (it's 8am here) I'll
try your query on similar hardware here.

HTH

- Richard Huxton

Re: Re[2]: Postgres is too slow?

From
"Yasuo Ohgaki"
Date:
I was curious about this result, so I tested on my PC and got results 4.3
to 14 times faster than Mamin's.

*SNIP*
>
> I filled this table with the COPY FROM command - 500K records exported
> from the same table in MS SQL 7.0 (SP3) - and created an index on the field
> "numberid".
*SNIP*
> the result of SELECT COUNT(*) request:
> ----------------------------------------------------------------
> Billing=# select now(); select count(*) from callbase; select now();
>           now
> ------------------------
>  2001-06-21 16:52:02+06
> (1 row)
>
>  count
> --------
>  500000
> (1 row)
>
>           now
> ------------------------
>  2001-06-21 16:52:44+06
> (1 row)
> ----------------------------------------------------------------
> Note: that's far, far too long - 32 seconds :(

It surely is slow. (Though by your timestamps it took 42 seconds, not 32 :)
I created the table from your schema and loaded 500k records.
The query took 3 seconds on my system. My test PC is not a
high-end machine but an old, slow desktop: Celeron 466, 192MB RAM,
UDMA33 HDD.

Took 3 seconds. (14 times faster - note I have a primary key)

yohgaki=# select now(); select count(id) from callbase; select now();
          now
------------------------
 2001-06-22 16:39:53+09
(1 row)

 count
--------
 500001
(1 row)

          now
------------------------
 2001-06-22 16:39:56+09
(1 row)

yohgaki=#


>
> the result of SELECT SUM(x) request:
> ----------------------------------------------------------------
> Billing=# select now(); select sum(CallDuration) from callbase; select now();
>           now
> ------------------------
>  2001-06-21 17:11:09+06
> (1 row)
>
>    sum
> ----------
>  26249970
> (1 row)
>
>           now
> ------------------------
>  2001-06-21 17:11:59+06
> (1 row)
> ----------------------------------------------------------------
> Note: that's also too long - 50 seconds

Took 10 seconds. (5 times faster)

yohgaki=# select now(); select sum(CallDuration) from callbase; select now();
          now
------------------------
 2001-06-22 16:42:44+09
(1 row)

    sum
-----------
 617001234
(1 row)

          now
------------------------
 2001-06-22 16:42:54+09
(1 row)

yohgaki=#

>
>
> the result of SELECT SUM(x-y) request:
> ----------------------------------------------------------------
> Billing=# select now(); select sum(TarifDurationAir-CallDuration) from callbase; select now();
>           now
> ------------------------
>  2001-06-21 17:13:36+06
> (1 row)
>
>    sum
> ----------
>  12318973
> (1 row)
>
>           now
> ------------------------
>  2001-06-21 17:14:28+06
> (1 row)
> ----------------------------------------------------------------
> Note: it's 52 seconds

Took 12 seconds. (About 4.3 times faster)

yohgaki=# select now(); select sum(TarifDurationAir-CallDuration) from callbase;
select now();
          now
------------------------
 2001-06-22 16:43:45+09
(1 row)

    sum
------------
 2222004444
(1 row)

          now
------------------------
 2001-06-22 16:43:57+09
(1 row)

yohgaki=#

>
>
> The SQL command I need to request:
> ----------------------------------------------------------------
> select numberid, sum(TarifDurationAir-CallDuration)/count(*)
> from callbase
> group by numberid;
> ----------------------------------------------------------------
>
>
> The result of EXPLAIN on this request (after this request and VACUUM
> ANALYZE):
> ----------------------------------------------------------------
> NOTICE:  QUERY PLAN:
>
> Aggregate  (cost=85493.92..89243.92 rows=50000 width=12)
>   ->  Group  (cost=85493.92..86743.92 rows=500000 width=12)
>         ->  Sort  (cost=85493.92..85493.92 rows=500000 width=12)
>               ->  Seq Scan on callbase  (cost=0.00..20185.00 rows=500000 width=12)
>
> EXPLAIN
> ----------------------------------------------------------------

NOTICE:  QUERY PLAN:

Aggregate  (cost=74982.03..78732.04 rows=50000 width=12)
  ->  Group  (cost=74982.03..76232.04 rows=500001 width=12)
        ->  Sort  (cost=74982.03..74982.03 rows=500001 width=12)
              ->  Seq Scan on callbase  (cost=0.00..9673.01 rows=500001 width=12)

EXPLAIN
yohgaki=#

I didn't define any indexes (other than the primary key).

>
>
> The result of previous SQL command (with SELECT NOW() before and after
> it):
> ----------------------------------------------------------------
>           now
> ------------------------
>  2001-06-21 16:59:05+06
> (1 row)
>
>  numberid |   ?column?
> ----------+---------------
>        56 | 19.7777777778
>  ........................
>     10676 | 27.5357142857
> (3562 rows)
>
>           now
> ------------------------
>  2001-06-21 17:00:58+06
> (1 row)
> ----------------------------------------------------------------
> Note: too long time - 113 seconds.

Took 18 seconds. (About 6.3 times faster)

          now
------------------------
 2001-06-22 16:46:52+09
(1 row)

 numberid |    ?column?
----------+-----------------
          | 4444.0000000000
(1 row)

          now
------------------------
 2001-06-22 16:47:10+09
(1 row)

yohgaki=#

However, I don't think this result is a fair comparison, since my
table contains meaningless data and produces only one result row (Mamin's
query writes more than 3000 lines of output to the terminal).

Do you really have enough memory and good options for Postgres?
Are your kernel and its options tuned well enough?
Did you test this on a busy server?
My test PC runs many servers (httpd, smbd, named, qmail, etc.), but
nobody uses it except me.

BTW, I always have a primary key in a table, so I added one to this
table. I have no index other than the primary key; the query would finish
faster if I added some indexes.
If you have a primary key, I think you'll get better results.

Regards,
--
Yasuo Ohgaki


Re: Postgres is too slow?

From
"Richard Huxton"
Date:
From: "Richard Huxton" <dev@archonet.com>

> Paul Mamin wrote:
> >
> > The SQL command I need to request:
> > ----------------------------------------------------------------
> > select numberid, sum(TarifDurationAir-CallDuration)/count(*)
> > from callbase
> > group by numberid;
> > ----------------------------------------------------------------

Made up some test data (3491 values of numberid, 140 entries for each).
Timings are from an old AMD K6-II 400MHz w/256 MB (and a lot of
quiescent apps):

For your query: 54secs

By setting sort_mem to 8192 (8MB) I get a time of 49s but at the cost of a
much larger backend process.

Just doing:

SELECT numberid FROM callbase GROUP BY numberid;

Gives a timing of 34secs.

Disk activity is minimal - the CPU is maxed out in user processing. I'd
expect your setup to be at least 10% faster.

HTH

- Richard Huxton


Re: Re[2]: Postgres is too slow?

From
Alex Pilosov
Date:
Paul,

Questions:
a) Which Linux kernel version is it?
b) What kind of disk is it - IDE or SCSI?
c) What raw speed do you get from the disk? Do:
'time dd if=/dev/hdxxx of=/dev/null bs=64k count=10000'

On Fri, 22 Jun 2001, Paul Mamin wrote:

> [Paul's previous message quoted in full - snipped]


Re[4]: Postgres is too slow?

From
Paul
Date:
AP> Questions:
AP> a) Which Linux kernel version is it?
AP> b) What kind of disk is it - IDE or SCSI?
AP> c) What raw speed do you get from the disk? Do:
AP> 'time dd if=/dev/hdxxx of=/dev/null bs=64k count=10000'

HDD is IDE - Quantum lct10, 5400 rpm - yes, it's slow.
But I compared with MSSQL on the same PC with the same HDD...

# cat </proc/version
Linux version 2.2.16 (root@pm) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #5 Thu Dec 7 15:34:44 YEKT 2000

# time dd if=/dev/hda3 of=/dev/null bs=64k count=10000
 10000+0 records in
 10000+0 records out

 real    3m33.731s
 user    0m0.020s
 sys     2m37.960s

#  hdparm /dev/hda

/dev/hda:
 multcount    =  0 (off)
 I/O support  =  0 (default 16-bit)
 unmaskirq    =  0 (off)
 using_dma    =  0 (off)
 keepsettings =  0 (off)
 nowerr       =  0 (off)
 readonly     =  0 (off)
 readahead    =  8 (on)
 geometry     = 2482/255/63, sectors = 39876480, start = 0

# cat /proc/sys/fs/file-max
4096

# ipcs -l -s

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 100
semaphore max value = 32767

# ipcs -l -m

------ Shared Memory Limits --------
max number of segments = 128
max seg size (kbytes) = 131072
max total shared memory (kbytes) = 16777216
min seg size (bytes) = 1


The postmaster was started as:
$ postmaster -B 8000 -D /sql/data

vmstat's output during my SQL query:
# vmstat
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 1  0  0      0   2036  68216  23652   0   0   199     2 1712    44   1  22  78

--
Best regards,
 Paul                            mailto:magamos@mail.ru



Re: Re[4]: Postgres is too slow?

From
Alex Pilosov
Date:
On Fri, 22 Jun 2001, Paul wrote:

> AP> Questions:
> AP> a) which linux kernel version is it?
> AP> b) what kind of a disk is it? IDE or SCSI?
> AP> c) what raw speed do you from the disk? do
> AP> 'time dd if=/dev/hdxxx of=/dev/null bs=64k count=10000'
>
> HDD is IDE - Quantum lct10, 5400 rpm - yes, it's slow.
> But I compared with MSSQL on the same PC with the same HDD...
>
> # cat </proc/version
> Linux version 2.2.16 (root@pm) (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)) #5 Thu Dec 7 15:34:44
YEKT2000 
>
> # time dd if=/dev/hda3 of=/dev/null bs=64k count=10000
>  10000+0 records in
>  10000+0 records out
>
>  real    3m33.731s
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>  user    0m0.020s
>  sys     2m37.960s

Whoa! Your drive is only managing a 3MB/s transfer (640MB/213 sec)! Your
data set is 500K*200 bytes = 100MB. No surprise your queries take 30+ seconds
to run, since that's exactly how long it takes just to transfer 100MB of data
from your drive.

Please read the Linux IDE tuning documentation, or upgrade to a more recent
kernel. You must be using DMA to get any sort of performance from an IDE
drive under Linux.

> #  hdparm /dev/hda
>
> /dev/hda:
>  multcount    =  0 (off)
>  I/O support  =  0 (default 16-bit)
This should be 1
>  unmaskirq    =  0 (off)
This should be 1
>  using_dma    =  0 (off)
This should be 1
>  keepsettings =  0 (off)
>  nowerr       =  0 (off)
>  readonly     =  0 (off)
>  readahead    =  8 (on)
>  geometry     = 2482/255/63, sectors = 39876480, start = 0
You must at LEAST enable DMA.
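Presumably the fix Alex has in mind looks something like this (standard hdparm flags; run as root, and be careful - wrong settings can lock up some IDE controllers):

```shell
# Turn on the three settings flagged above, then re-measure.
hdparm -d1 /dev/hda    # using_dma   = 1 (the big win)
hdparm -c1 /dev/hda    # I/O support = 32-bit
hdparm -u1 /dev/hda    # unmaskirq   = 1
hdparm -m16 /dev/hda   # multcount   = 16 (multiple sector I/O)
hdparm -t /dev/hda     # benchmark raw reads again afterwards
```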


Re: Re[4]: Postgres is too slow?

From
"Richard Huxton"
Date:
From: "Alex Pilosov" <alex@pilosoft.com>

> On Fri, 22 Jun 2001, Paul wrote:

> > # time dd if=/dev/hda3 of=/dev/null bs=64k count=10000
> >  10000+0 records in
> >  10000+0 records out
> >
> >  real    3m33.731s
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >  user    0m0.020s
> >  sys     2m37.960s
>
> Whoa! Your drive is only capable of 3M/s (640M/213 sec) transfer! Your
> data set is 500K*200 bytes=100M. No surprise your queries take 30 seconds
> to run, since its exactly how much it'd take to transfer 100M of data from
> your drive.

Possibly, but my tests saturate the CPU anyway (granted, only a 400MHz one),
even with the whole dataset cached.

$ vmstat 10
   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0  29692  80908  86312   0   0     0     2   24    56   2   0   3
 1  0  0      0  27712  80908  86312   0   0     0    43  112    26  88   5   7
 1  0  0      0  27528  80908  86312   0   0     0   301  127    30  94   6   0
 1  0  0      0   4380  80952 109400   0   0     0   650  153    32  78  22   0
 1  0  0      0   2072  80972 111640   0   0     0   450  141    27  87  13   0
 2  0  0      0   3024  80244 111356   0   0     0     1  108    25  99   1   0
 0  0  0      0  30492  80244  86020   0   0     0   409  135    80  48   2  50

- Richard Huxton


Re: Re[4]: Postgres is too slow?

From
"Thalis A. Kalfigopoulos"
Date:
On Fri, 22 Jun 2001, Alex Pilosov wrote:

> Whoa! Your drive is only capable of 3M/s (640M/213 sec) transfer! Your
> data set is 500K*200 bytes=100M. No surprise your queries take 30 seconds
> to run, since its exactly how much it'd take to transfer 100M of data from
> your drive.
>
> Please read Linux IDE tuning documentation, or upgrade to more recent
> kernel. You must be using DMA to get any sort of performance with
> Linux on IDE drive.

Had the same problem with a disk and boosted its performance with hdparm
(4->35MB/s). I don't know if this was the reason I started getting the message
"-- MARK --" in my syslog files (at 20-minute intervals), or whether I just
hadn't noticed it earlier. Any idea what it means?

TIA,
thalis


Re: Re[4]: Postgres is too slow?

From
Alex Pilosov
Date:
On Fri, 22 Jun 2001, Thalis A. Kalfigopoulos wrote:

> Had the same problem with a disk and boosted its performance with
> hdparm (4->35Mb/s). I don't know if this was the reason I started
> getting the message "-- MARK --" in my syslog's files (20min
> intervals) or I just hadn't noticed them earlier. Any idea what it
> means?
Syslogd has an option to put a 'MARK' line in the log every so often.
'man syslogd' should tell you more.

-alex


Re: Re[4]: Postgres is too slow?

From
Daniel Åkerud
Date:
man syslogd

-m interval
    The syslogd logs a mark timestamp regularly; the default is 20 minutes.
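So, assuming a standard sysklogd setup, the lines can be silenced by restarting syslogd with a zero mark interval:

```shell
# -m 0 means "never log MARK lines" (see syslogd(8)); adjust your init
# script so the option survives a reboot.
syslogd -m 0
```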

No virus *sigh* :)

Daniel Åkerud

----- Original Message -----
From: "Thalis A. Kalfigopoulos" <thalis@cs.pitt.edu>
To: "Alex Pilosov" <alex@pilosoft.com>
Cc: "Paul" <magamos@mail.ru>; <pgsql-general@postgresql.org>
Sent: Friday, June 22, 2001 10:34 PM
Subject: Re: Re[4]: [GENERAL] Postgres is too slow?


> [full quote of Thalis's message and the list footer snipped]


Re: Re[4]: Postgres is too slow?

From
Martin Weinberg
Date:
"Thalis A. Kalfigopoulos" wrote on Fri, 22 Jun 2001 16:34:56 EDT
>On Fri, 22 Jun 2001, Alex Pilosov wrote:
>
> [SNIP]
>
>Had the same problem with a disk and boosted its performance with hdparm
>(4->35Mb/s). I don't know if this was the reason I started getting the message
>"-- MARK --" in my syslog's files (20min intervals) or I just hadn't noticed
>them earlier. Any idea what it means?

That is simply the syslogd timestamp (see man syslogd), telling you that
syslogd is alive. It has nothing to do with your hdparm tweaking.

--Martin



Re[6]: Postgres is too slow?

From
Paul
Date:
I'm sorry about this whole thread - the Linux distribution I had was
very, very bad (it came from the CD bundled with an ABit motherboard). I
ran my SQL query on another PC running Slackware, and I got the
same speed as MSSQL...

--
Best regards,
 Paul                            mailto:magamos@mail.ru