Thread: PostgreSQL is slow with a larger table even when it is in RAM

PostgreSQL is slow with a larger table even when it is in RAM

From: "sathiya psql"
Dear Friends,
     I have a table with 32 lakh (3.2 million) records in it. The table size is nearly 700 MB, and my machine has 1 GB + 256 MB of RAM. I created a tablespace in RAM and then created this table in that tablespace.

    So now everything is in RAM. If I do a count(*) on this table it returns 327600 in 3 seconds. Why is it taking 3 seconds? I am sure that no disk I/O is happening (using vmstat I confirmed there is no disk I/O, and swap is not used).

Any idea on this?

I searched a lot in newsgroups but couldn't find anything relevant (everywhere they are talking about disk access speed; here I don't want to worry about disk access).

If required, I will give more information on this.
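Roughly, the setup was like this (the mount point and the tablespace and table names here are illustrative, not the real ones):

    -- the RAM disk is a tmpfs mount, created beforehand with something like:
    --   mount -t tmpfs -o size=800M tmpfs /mnt/pgram
    CREATE TABLESPACE ramspace LOCATION '/mnt/pgram';
    CREATE TABLE big_table (id integer, payload text) TABLESPACE ramspace;
    SELECT count(*) FROM big_table;  -- this is the query that takes 3 seconds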


Re: PostgreSQL is slow with a larger table even when it is in RAM

From: "sathiya psql"
On Tue, Mar 25, 2008 at 2:09 PM, jose javier parra sanchez <jojapasa@gmail.com> wrote:
> It's been said zillions of times on the mailing list. Using a select
> count(*) in Postgres is slow, and will probably be slow for a long
> time. So that function is not a good way to measure performance.

Yes, if the data were on the HDD we could say this, but now the data is in RAM.

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: hubert depesz lubaczewski
On Tue, Mar 25, 2008 at 02:05:20PM +0530, sathiya psql wrote:
> Any idea on this?

yes. don't use count(*).

if you want a whole-table row count, use triggers to store the count.

it will be slow regardless of whether it's in ram or on hdd.
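a minimal sketch of such a trigger-maintained counter (all names here are illustrative, and "mytable" stands in for the real table; on older versions you may first need CREATE LANGUAGE plpgsql):

    -- one-row table holding the current count; seed it once
    CREATE TABLE mytable_count (cnt bigint NOT NULL);
    INSERT INTO mytable_count SELECT count(*) FROM mytable;

    -- keep the counter in sync on every insert and delete
    CREATE FUNCTION update_mytable_count() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'INSERT' THEN
            UPDATE mytable_count SET cnt = cnt + 1;
        ELSIF TG_OP = 'DELETE' THEN
            UPDATE mytable_count SET cnt = cnt - 1;
        END IF;
        RETURN NULL;  -- return value is ignored for AFTER row triggers
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER mytable_count_trig
        AFTER INSERT OR DELETE ON mytable
        FOR EACH ROW EXECUTE PROCEDURE update_mytable_count();

    -- the count is now a single-row read
    SELECT cnt FROM mytable_count;

note the trade-off: every writer updates the same counter row, so concurrent inserts/deletes serialize on it, and TRUNCATE is not covered.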

depesz

--
quicksil1er: "postgres is excellent, but like any DB it requires a
highly paid DBA.  here's my CV!" :)
http://www.depesz.com/ - a blog for you (and my CV)

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Craig Ringer
hubert depesz lubaczewski wrote:
> On Tue, Mar 25, 2008 at 02:05:20PM +0530, sathiya psql wrote:
>> Any idea on this?
>
> yes. don't use count(*).
>
> if you want a whole-table row count, use triggers to store the count.
>
> it will be slow regardless of whether it's in ram or on hdd.

In other words, if you're having performance problems, please provide
EXPLAIN ANALYZE output from a more useful query that does real work,
rather than something like count(*).

COUNT(*) can be slow due to some MVCC limitations; it's been discussed
frequently here, so you should search the archives for information.
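For example (the table and column here are only placeholders for whatever you actually run):

    -- run the real query under EXPLAIN ANALYZE and post the output
    EXPLAIN ANALYZE
    SELECT * FROM mytable WHERE some_column = 42;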

--
Craig Ringer

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Alvaro Herrera
sathiya psql wrote:

>     So now everything is in RAM. If I do a count(*) on this table it
> returns 327600 in 3 seconds. Why is it taking 3 seconds? I am sure that
> no disk I/O is happening.

It has to scan every page and examine visibility for every record.  Even
if there's no I/O involved, there's a lot of work to do.  I am not sure
whether 3 seconds is expected with your hardware, though.  Do you see
high CPU usage during that period?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Bill Moran
In response to "sathiya psql" <sathiya.psql@gmail.com>:

> Dear Friends,
>      I have a table with 32 lakh (3.2 million) records in it. The table
> size is nearly 700 MB, and my machine has 1 GB + 256 MB of RAM. I created
> a tablespace in RAM and then created this table in that tablespace.
>
>     So now everything is in RAM. If I do a count(*) on this table it
> returns 327600 in 3 seconds. Why is it taking 3 seconds? I am sure that
> no disk I/O is happening (using vmstat I confirmed there is no disk I/O,
> and swap is not used).
>
> Any idea on this?

Yes.  It takes your hardware about 3 seconds to read through 700 MB of RAM.

Keep in mind that you're not just reading RAM.  You're pushing system
requests through the VFS layer of your operating system, which is treating
the RAM like a disk (with cylinder groups, inodes, blocks, etc.), so you
have all of that processing overhead as well.  What filesystem did you
format the RAM disk with?

Why are you doing this?  If you have enough RAM to store the table, why
not just allocate it to shared buffers?
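As a sketch only (the value is hypothetical; the right size depends on your total RAM and workload, and the server must be restarted for the setting to take effect):

    # postgresql.conf -- on 8.1 the value is a number of 8 kB buffers;
    # unit suffixes like '800MB' only arrived in later releases
    shared_buffers = 102400    # 102400 x 8 kB = 800 MB, room for a ~700 MB table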

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: "sathiya psql"
> Yes.  It takes your hardware about 3 seconds to read through 700 MB of RAM.
>
> Keep in mind that you're not just reading RAM.  You're pushing system
> requests through the VFS layer of your operating system, which is treating
> the RAM like a disk (with cylinder groups, inodes, blocks, etc.), so you
> have all of that processing overhead as well.  What filesystem did you
> format the RAM disk with?

tmpfs

> Why are you doing this?  If you have enough RAM to store the table, why
> not just allocate it to shared buffers?

Just allocating shared buffers still means the table is read from HDD into
RAM the first time; I wanted to eliminate that.

Are you saying it will surely take 3 seconds if I have 50 lakh (5 million)
records?

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Bill Moran
In response to "sathiya psql" <sathiya.psql@gmail.com>:
> >
> > Yes.  It takes your hardware about 3 seconds to read through 700 MB of RAM.
> >
> > Keep in mind that you're not just reading RAM.  You're pushing system
> > requests through the VFS layer of your operating system, which is treating
> > the RAM like a disk (with cylinder groups, inodes, blocks, etc.), so you
> > have all of that processing overhead as well.  What filesystem did you
> > format the RAM disk with?
>
> tmpfs

I'm not an expert, but according to Wikipedia:

"tmpfs (previously known as shmfs) distinguishes itself from the Linux
ramdisk device by allocating memory dynamically and by allowing less-used
pages to be moved onto swap space."

Both dynamic allocation and swapping are potential problems, but I don't
know how to tell you to determine whether they're issues here.

> > Why are you doing this?  If you have enough RAM to store the table, why
> > not just allocate it to shared buffers?
>
> Just allocating shared buffers still means the table is read from HDD into
> RAM the first time; I wanted to eliminate that.

PostgreSQL is still going to copy the data from your RAM disk into shared
buffers before working with it, so you still have that overhead.  All
you're escaping is the time involved in physical disk activity, which is
what shared_buffers are designed to avoid.

> Are you saying it will surely take 3 seconds if I have 50 lakh (5 million)
> records?

No.  That is dependent on your hardware and other factors.

You are trying to use the system in a non-standard configuration.  If it
doesn't work that way, don't be surprised.

Also, what are your expectations?  Honestly, I don't see any problems with
the results you're getting; they're about what I would expect.  Are you
trying to compare PostgreSQL to MySQL/MyISAM?  More directly, what is
your purpose in starting this email conversation?  What are you hoping
to accomplish?

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Luke Lonergan
Hello Sathiya,

1st: you should not use a ramdisk for this; it will slow things down compared to simply having the table on disk.  Scanning it the first time when on disk will load it into the OS I/O cache, after which you will get memory speed.

2nd: you should expect "SELECT COUNT(*)" to run at a maximum of about 350-600 MB/s (depending on PG version and CPU speed).  It is CPU-limited to that rate of counting rows no matter how fast your I/O is.

So, for your 700 MB table, you should expect a COUNT(*) to run in about 1-2 seconds best case.  This approximates the speed at which other queries can run against the table.
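(As a quick check: 700 MB / 600 MB/s ≈ 1.2 s and 700 MB / 350 MB/s ≈ 2.0 s, which brackets that 1-2 second estimate.)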

- Luke



Re: PostgreSQL is slow with a larger table even when it is in RAM

From: "sathiya psql"

> 1st: you should not use a ramdisk for this; it will slow things down
> compared to simply having the table on disk.  Scanning it the first time
> when on disk will load it into the OS I/O cache, after which you will get
> memory speed.

Absolutely...

After getting some replies, I dropped the table from the ramdisk and moved
it back to disk.

> 2nd: you should expect "SELECT COUNT(*)" to run at a maximum of about
> 350-600 MB/s (depending on PG version and CPU speed).  It is CPU-limited
> to that rate of counting rows no matter how fast your I/O is.

I am using 8.1 on a Pentium dual-core.

> So, for your 700 MB table, you should expect a COUNT(*) to run in about
> 1-2 seconds best case.  This approximates the speed at which other queries
> can run against the table.

OK as far as count(*) goes, but other queries are taking a lot of time too...

OK, I will do more experiments and will be back...

Many thanks for all of your replies, guys.





Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Chris Browne
sathiya.psql@gmail.com ("sathiya psql") writes:
> On Tue, Mar 25, 2008 at 2:09 PM, jose javier parra sanchez
> <jojapasa@gmail.com> wrote:
>
>> It's been said zillions of times on the mailing list. Using a select
>> count(*) in Postgres is slow, and will probably be slow for a long
>> time. So that function is not a good way to measure performance.
>
> Yes, if the data were on the HDD we could say this, but now the data
> is in RAM.

Even if all the data is in RAM, it will still take some definitely
non-zero time to examine all of the pages, find the tuples, and then
determine which of those tuples are visible from the perspective of
your DB connection.

If 500 MB of relevant data is sitting on disk, it will take whatever
time it takes to pull it from disk; if it is in memory, there is still
work to be done...
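A rough way to see how much of that work there is ("mytable" is a
placeholder; the figures are maintained by VACUUM/ANALYZE, so they are
estimates):

    -- pages a sequential scan must touch, and the estimated tuple count
    SELECT relpages, reltuples
    FROM pg_class
    WHERE relname = 'mytable';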
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://linuxdatabases.info/info/finances.html
Rules of  the Evil Overlord #76.  "If the hero  runs up to my  roof, I
will not run up after him and  struggle with him in an attempt to push
him  over the  edge. I  will also  not  engage him  at the  edge of  a
cliff. (In the middle of a  rope-bridge over a river of molten lava is
not even worth considering.)" <http://www.eviloverlord.com/>

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: "Peter Koczan"
On Tue, Mar 25, 2008 at 3:35 AM, sathiya psql <sathiya.psql@gmail.com> wrote:
> Dear Friends,
>      I have a table with 32 lakh (3.2 million) records in it. The table
> size is nearly 700 MB, and my machine has 1 GB + 256 MB of RAM. I created
> a tablespace in RAM and then created this table in that tablespace.
>
>     So now everything is in RAM. If I do a count(*) on this table it
> returns 327600 in 3 seconds. Why is it taking 3 seconds? I am sure that
> no disk I/O is happening (using vmstat I confirmed there is no disk I/O,
> and swap is not used).
>
> Any idea on this?
>
> I searched a lot in newsgroups but couldn't find anything relevant
> (everywhere they are talking about disk access speed; here I don't want
> to worry about disk access).
>
> If required, I will give more information on this.

Two things:

- Are you VACUUMing regularly? It could be that you have a lot of
dead rows and the table is spread out over a lot of pages of mostly
dead space. That would cause *very* slow seq scans.

- What is your shared_buffers set to? If it's really low, then Postgres
could be constantly copying from the RAM disk into memory. Not much would
be cached, and performance would suffer. (Both points are easy to check;
see the sketch below.)
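A sketch of how to check both ("mytable" is a placeholder):

    -- reports dead row versions and the pages they occupy
    VACUUM VERBOSE mytable;

    -- shows the current shared buffer setting
    SHOW shared_buffers;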

FWIW, I did a select count(*) on a table with just over 300000 rows,
and it only took 0.28 sec.

Peter

Re: PostgreSQL is slow with a larger table even when it is in RAM

From: Luke Lonergan
So your table is about 80 MB in size, or perhaps 120 MB if it fits in shared_buffers.  You can check it using "SELECT pg_size_pretty(pg_relation_size('mytable'))".

- Luke  


On 3/26/08 4:48 PM, "Peter Koczan" <pjkoczan@gmail.com> wrote:

> FWIW, I did a select count(*) on a table with just over 300000 rows,
> and it only took 0.28 sec.