Thread: getting count for a specific query
Per a thread a while back, the discussion was along the lines of serving data up to the web quickly.
Our app currently pulls a bunch of data to several query pages.
I have tried and not gotten the queries to return as fast as they do now, which is a huge disappointment since the new hardware is twice as powerful as our current production box. I can get pretty close on some things, but others are just plain slower.
My idea is to use LIMIT and OFFSET to return just the first 50 records; if they hit next, I can set the offset.
My understanding was this gets slower as you move further into the data, but we have several options to modify the search, and I do not believe our clients will page very far into a dataset.
One problem I think I will have, though, is that they currently get a count of the records matching their request, and I would like to keep that as a display field.
So given a table of associates, my default query will be something like:
select * from tblassoc where clientnum = 'WAZ' and isdeleted is false
The user could add conditions like: and firstname like 'Joel'
Currently it returns all records, with a count and a display of the records you're viewing, like 1-50 of 470, next page 51-100, etc.
Is there a fast way to get the count? Will this concept fly? Also, I am getting heat that my search is now case-sensitive. What is the best way to get a case-insensitive search? I could use ~* or perhaps do an UPPER(firstname) in the select, etc.?
Thanks for any ideas here. I have tried playing with various settings and have not seen my times change much; I will pursue this on the performance mailing list.
Joel Fradkin
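
For reference, a minimal sketch of the paging Joel describes, using the column names from his post; the ORDER BY column and the ILIKE spelling of the case-insensitive filter are illustrative assumptions, not part of the original:

    -- page 1 (the "next" button re-runs this with OFFSET 50, 100, ...)
    SELECT *
      FROM tblassoc
     WHERE clientnum = 'WAZ'
       AND isdeleted IS FALSE
       AND firstname ILIKE 'Joel'     -- case-insensitive form of his LIKE 'Joel' filter
     ORDER BY firstname               -- some stable ordering is assumed
     LIMIT 50 OFFSET 0;

    -- the "1-50 of 470" total still needs a separate count over the same condition
    SELECT count(*)
      FROM tblassoc
     WHERE clientnum = 'WAZ'
       AND isdeleted IS FALSE
       AND firstname ILIKE 'Joel';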
On Fri, Apr 08, 2005 at 09:29:13AM -0400, Joel Fradkin wrote:
> My understanding was this gets slower as you move further into the data, but we have several options to modify the search, and I do not believe our clients will page very far into a dataset.

It gets slower because when you do an offset of 50, you have to pass through the first 50 before picking up the ones you want. If you offset 100, you scan through the first 100. &c. If you don't want to pay that, you need to use a cursor, but this causes the problem that you have to keep your cursor open across page views, which is a tricky issue on the Web.

> Currently it returns all records with a count and a display of the records you're viewing like 1-50 of 470, next page is 51-100 etc.
>
> Is there a fast way to get the count?

Not really, no. You have to perform a count() to get it, which is possibly expensive. One way to do it, though, is to do

    SELECT count(*) FROM tablename WHERE condition LIMIT n;

or something like that. Assuming the condition is reasonably limited (i.e. it's not going to cost you a fortune to run this), you'll get the right number back if the number is < n or else you'll get n. If you have n, your application can say "viewing 1-50 of at least n records". This is something you see from time to time in this sort of application.

> getting heat that my search is now case sensitive. What is the best way to get a case insensitive search? I could use ~* or perhaps do an UPPER(firstname) in the select etc?

The upper() (or lower() -- whatever) strategy is what I'd use. In any case, you want to make sure you put functional indexes on all such columns, because otherwise you'll never get an index scan.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-garde will probably become the textbook definition of Postmodernism. --Brad Holland
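
As a sketch of that functional-index suggestion (the index name is made up; the query compares lower(firstname) so the planner can consider the expression index):

    -- expression index so a case-insensitive match can use an index scan
    CREATE INDEX tblassoc_lower_firstname_idx ON tblassoc (lower(firstname));

    SELECT *
      FROM tblassoc
     WHERE clientnum = 'WAZ'
       AND isdeleted IS FALSE
       AND lower(firstname) = lower('Joel');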
On Fri, 2005-04-08 at 11:07 -0400, Andrew Sullivan wrote:
> On Fri, Apr 08, 2005 at 09:29:13AM -0400, Joel Fradkin wrote:
> > Is there a fast way to get the count?
>
> Not really, no. You have to perform a count() to get it, which is possibly expensive. One way to do it, though, is to do
>
>     SELECT count(*) FROM tablename WHERE condition LIMIT n;
>
> or something like that. Assuming the condition is reasonably limited (i.e. it's not going to cost you a fortune to run this), you'll get the right number back if the number is < n or else you'll get n.

come again?

    test=# select count(*) from a;
     count
    -------
         3
    (1 row)

    test=# select count(*) from a limit 2;
     count
    -------
         3
    (1 row)

the LIMIT clause limits the number of rows returned by the select, in this case 1 row.

maybe you mean something like:

    test=# select count(*) from (select * from a limit 2) as foo;
     count
    -------
         2
    (1 row)

gnari
On Fri, 2005-04-08 at 09:29 -0400, Joel Fradkin wrote:
> Our app currently pulls a bunch of data to several query pages.
>
> My idea is to use the limit and offset to return just the first 50 records, if they hit next I can set the offset.
>
> My understanding was this gets slower as you move further into the data, but we have several options to modify the search, and I do not believe our clients will page very far into a dataset.

you might reduce the performance loss if your dataset is ordered by a UNIQUE index.

    select * from mytable where somecondition
    ORDER by uniquecol limit 50;

and next:

    select * from mytable where somecondition AND uniquecol>?
    ORDER by uniquecol limit 50 OFFSET 50;

where the ? is a placeholder for the last value returned by the previous query.

if your unique index is a multi-column one, the method is slightly more complicated, but the same idea.

gnari
On Fri, Apr 08, 2005 at 04:17:45PM +0000, Ragnar Hafstað wrote:
> On Fri, 2005-04-08 at 11:07 -0400, Andrew Sullivan wrote:
> >
> >     SELECT count(*) FROM tablename WHERE condition LIMIT n;
>
> the LIMIT clause limits the number of rows returned by the select, in this case 1 row.
>
> maybe you mean something like:
>
>     test=# select count(*) from (select * from a limit 2) as foo;

Yes, that was stupid of me. That's what I meant, though.

A

--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do, sir? --attr. John Maynard Keynes
Ragnar Hafstað <gnari@simnet.is> writes:
> you might reduce the performance loss if your dataset is ordered by a UNIQUE index.
>
>     select * from mytable where somecondition
>     ORDER by uniquecol limit 50;
>
> and next:
>
>     select * from mytable where somecondition AND uniquecol>?
>     ORDER by uniquecol limit 50 OFFSET 50;
>
> where the ? is placeholder for last value returned by last query.

Uh, you don't want the OFFSET there do you? But otherwise, yeah, this is a popular solution for paging through some rows. Doesn't really help with the problem of counting the total dataset though ...

regards, tom lane
> > select * from mytable where somecondition AND uniquecol>?
> > ORDER by uniquecol limit 50 OFFSET 50;
> >
> > where the ? is placeholder for last value returned by last query.
>
> Uh, you don't want the OFFSET there do you? But otherwise, yeah, this is a popular solution for paging through some rows. Doesn't really help with the problem of counting the total dataset though ...

In the past I've done an EXPLAIN and parsed the plan to see what PostgreSQL estimated for the number of rows matching <somecondition>. If less than $threshold, I would do a count(*). If more than $threshold, I would display to the user "approx N records". This seemed to be good enough for most cases.

--
On Fri, 2005-04-08 at 12:32 -0400, Tom Lane wrote:
> Ragnar Hafstað <gnari@simnet.is> writes:
> > you might reduce the performance loss if your dataset is ordered by a UNIQUE index.
> >
> >     select * from mytable where somecondition
> >     ORDER by uniquecol limit 50;
> >
> > and next:
> >
> >     select * from mytable where somecondition AND uniquecol>?
> >     ORDER by uniquecol limit 50 OFFSET 50;
> >
> > where the ? is placeholder for last value returned by last query.
>
> Uh, you don't want the OFFSET there do you?

ooops! of course not. the uniquecol>? is meant to REPLACE the OFFSET.

gnari
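
Putting the correction together, a sketch of the keyset scheme with no OFFSET at all (the table, condition, and the value 12345 are placeholders; 12345 stands for whatever uniquecol value ended the previous page):

    -- first page
    SELECT *
      FROM mytable
     WHERE somecondition
     ORDER BY uniquecol
     LIMIT 50;

    -- next page: restart just past the last key seen, instead of skipping rows
    SELECT *
      FROM mytable
     WHERE somecondition
       AND uniquecol > 12345
     ORDER BY uniquecol
     LIMIT 50;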
Thanks all.

I might have to add a button to do the count on command so they don't get the hit. I would want it to return the count of the condition, not the currently displayed number of rows.

Are there any other database engines that provide better performance? (We just spent 2 months moving to postgres and it is not live yet, but if I am going to get results back slower than my 2 proc box running MSSQL with 2 gig and 2 processors, I can't see any reason to move to it.) The Postgres is on a 4 proc Dell with 8 gigs of memory. I thought I could analyze our queries and our config to optimize.

Joel Fradkin
On Fri, 2005-04-08 at 12:08, Joel Fradkin wrote:
> Thanks all.
> I might have to add a button to do the count on command so they don't get
> the hit.
> I would want it to return the count of the condition, not the currently
> displayed number of rows.
>
> Is there any other database engines that provide better performance?
> (We just spent 2 months moving to postgres and it is not live yet, but if I am
> going to get results back slower than my 2 proc box running MSSQL in 2 gig
> and 2 processors I can't see any reason to move to it)
> The Postgres is on a 4 proc Dell with 8 gigs of memory.
> I thought I could analyze our queries and our config to optimize.
Judging postgresql on one single data point (count(*) performance) is
quite unfair. Unless your system only operates on static data and is
used to mostly do things like counting, in which case, why are you using
a database?
PostgreSQL is a great fit for certain loads, and a poor fit for others.
Are you going to have lots of people updating the database WHILE the
select count(*) queries are running? Are you going to be doing other,
more interesting things than simply counting? If so, you really should
build a test case that emulates what you're really going to be doing
with the system.
I've found that the poor performance of aggregates in PostgreSQL is
generally more than made up for by the outstanding behaviour it exhibits
when under heavy parallel load.
Note that the basic design of PostgreSQL's MVCC system is such that
without using some kind of trigger to maintain pre-calculated aggregate
information, it will NEVER be as fast as most other databases at doing
aggregates across large chunks of your data.
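
A very rough sketch of the trigger-maintained count Scott alludes to follows; every name here is invented, a real version would also have to seed the counts table and handle UPDATEs, and the single counter row per client becomes a contention point under concurrent writes:

    CREATE TABLE tblassoc_counts (
        clientnum varchar(10) PRIMARY KEY,
        live_rows bigint NOT NULL
    );

    CREATE OR REPLACE FUNCTION assoc_count_trig() RETURNS trigger AS '
    BEGIN
        IF TG_OP = ''INSERT'' THEN
            UPDATE tblassoc_counts SET live_rows = live_rows + 1
             WHERE clientnum = NEW.clientnum;
        ELSIF TG_OP = ''DELETE'' THEN
            UPDATE tblassoc_counts SET live_rows = live_rows - 1
             WHERE clientnum = OLD.clientnum;
        END IF;
        RETURN NULL;  -- AFTER trigger: return value is ignored
    END;
    ' LANGUAGE plpgsql;

    CREATE TRIGGER assoc_count_maint
        AFTER INSERT OR DELETE ON tblassoc
        FOR EACH ROW EXECUTE PROCEDURE assoc_count_trig();

    -- the "470" for the default screen then becomes a single-row lookup:
    SELECT live_rows FROM tblassoc_counts WHERE clientnum = 'WAZ';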
Believe me, I just spent two months converting our app; I do not wish to give up on that work. We do a great deal more than count. Specifically, many of our queries run much slower on postgres. As mentioned, I purchased a 4 proc box with 8 gigs of memory for this upgrade (Dell may have been a poor choice based on comments I have received). Even when I could see a query like select * from tblassoc where clientnum = 'WAZ' using indexed joins on location and jobtitle, it is still taking 22 seconds to run compared to the 9 seconds on MSSQL on a 2 proc 2 gig box. I got one of my queries to run faster using a page cost of .2, but then the assoc query was running 50 seconds, so I adjusted to a cost of 2 (tried 1.2, 2, 3, and 4 and did not see huge changes in the assoc query, except that it did not like .2).

I have placed a call to commandprompt.com and am going to pay for some support to see if they have anything meaningful to add.

It could be something with my hardware, my hardware config, or my postgres config. I am just not sure. I know I have worked diligently to try to learn all I can, and I used to think I was kinda smart. I set up the data on 4 10k SCSI drives in a powervault and my WAL on 2 15k drives. I am using links to those from the install directory. It starts and stops ok this way, but maybe it should be different.

I can tell you I am very happy to have this forum, as I could not have gotten to the point I am at without the many useful comments from folks on the list. I greatly appreciate everyone who has helped. But truth is, if I can't get it to work better than I have, I may have to ditch the effort and bite the 70K bullet. It's compounded by using 3 developers' time for two months to yield an answer that my boss may just fire me for. I figured since my first test showed I could get data faster on the postgres box that I could, with enough study, get all our data to go faster, but I am afraid I have not been very successful.

My failure is not a reflection on postgres; as you mentioned, it is definitely great at some things. I have 90 some views, not to mention as many stored procedures, that have been converted. I wrote an app to move the data and it works great. But if it is too slow I just will not be able to use it for production.

Joel

> Judging postgresql on one single data point (count(*) performance) is quite unfair. Unless your system only operates on static data and is used to mostly do things like counting, in which case, why are you using a database?
>
> PostgreSQL is a great fit for certain loads, and a poor fit for others. Are you going to have lots of people updating the database WHILE the select count(*) queries are running? Are you going to be doing other, more interesting things than simply counting? If so, you really should build a test case that emulates what you're really going to be doing with the system.
>
> I've found that the poor performance of aggregates in PostgreSQL is generally more than made up for by the outstanding behaviour it exhibits when under heavy parallel load.
>
> Note that the basic design of PostgreSQL's MVCC system is such that without using some kind of trigger to maintain pre-calculated aggregate information, it will NEVER be as fast as most other databases at doing aggregates across large chunks of your data.
Bob Henkel <luckyratfoot@gmail.com> writes:
> From a simple/high level perspective why is this? That is, why can't PostgreSQL do aggregates as well across large chunks of data. I'm assuming it's extremely complicated. Otherwise the folks around here would have churned out a fix in a month or less and made this issue a past story.

You can find very detailed discussions of this in the archives, but the basic reason is that we have a very general/extensible view of aggregates (which is how come we can support custom aggregates). An aggregate is a function that you feed all the input rows to, one at a time, and then it produces the answer. Nice, general, extensible, and not at all optimizable :-(

Now in general that is the only way to do it, and so Scott's implication that we always suck compared to other databases is really an overstatement. Ask another database to do a standard deviation calculation, for instance, and it'll be just as slow. However there are special cases that other DBs can optimize that we don't even try to. The big ones are:

* COUNT(*) across a whole table --- most non-MVCC databases keep tabs of the physical number of rows in the table, and so they can answer this very quickly. Postgres doesn't keep such a count, and under MVCC rules it wouldn't necessarily be the right answer if we had it. (BTW, a count of rows satisfying a particular condition is a different ballgame entirely; in most cases that can't be optimized at all, AFAIK.) If you are willing to accept approximate answers there are various tricks you can use --- see the archives --- but we don't get to fudge on COUNT(*) itself because it's in the SQL standard.

* MIN or MAX of an indexed column --- most DBs can use an index scan to find such a row relatively quickly, although whether this trick works or not depends a whole lot on whether you have WHERE or GROUP BY and just what those conditions look like. You can fake the min/max answer in Postgres by doing the transformation to an indexable query by hand; for instance, instead of MAX(col) do

    SELECT col FROM tab ORDER BY col DESC LIMIT 1;

There are periodic discussions in the hackers list about teaching the planner to do that automatically, and it will probably happen someday; but it's a complicated task and not exceedingly high on the priority list.

regards, tom lane
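
For instance, the hand-done min/max rewrite might look like this (column and index names are invented):

    -- an index the ORDER BY ... LIMIT 1 form can use
    CREATE INDEX tblassoc_hiredate_idx ON tblassoc (hiredate);

    -- instead of:  SELECT max(hiredate) FROM tblassoc;
    SELECT hiredate
      FROM tblassoc
     WHERE hiredate IS NOT NULL   -- mimic max(), which ignores NULLs
     ORDER BY hiredate DESC
     LIMIT 1;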
On Fri, Apr 08, 2005 at 03:23:25PM -0400, Joel Fradkin wrote:
> Believe me I just spent two months converting our app, I do not wish to give
> our queries run much slower on postgres. As mentioned I purchased a 4 proc

I suspect you want the -performance list. And it'd be real handy to get some EXPLAIN ANALYSE results for the offending queries in order to help you (where "handy" is read as "necessary").

A

--
Andrew Sullivan | ajs@crankycanuck.ca
Information security isn't a technological problem. It's an economics problem. --Bruce Schneier
As always thanks Tom, I will definitely look at what I can do.

Since it is a count of matched-condition records I may not have a way around it. I don't think my clients would like me to approximate, as it is a count of their records. What I plan on doing, assuming I can get all my other problems fixed (as mentioned, I am going to try and get paid help to see if I goofed it up somewhere), is make the count a button, so they don't wait every time but can choose to wait if need be. Maybe I can store the last count (with the day it was counted) for the generic search it defaults to, and just have them do a count on demand if they have a specific query. Our screens have several criteria fields in each application.

Joel Fradkin

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Friday, April 08, 2005 2:28 PM
To: Bob Henkel
Cc: Scott Marlowe; Joel Fradkin; Andrew Sullivan; pgsql-sql@postgresql.org
Subject: Re: [SQL] getting count for a specific query

Bob Henkel <luckyratfoot@gmail.com> writes:
> From a simple/high level perspective why is this? That is, why can't PostgreSQL do aggregates as well across large chunks of data. I'm assuming it's extremely complicated. Otherwise the folks around here would have churned out a fix in a month or less and made this issue a past story.

You can find very detailed discussions of this in the archives, but the basic reason is that we have a very general/extensible view of aggregates (which is how come we can support custom aggregates). An aggregate is a function that you feed all the input rows to, one at a time, and then it produces the answer. Nice, general, extensible, and not at all optimizable :-(

Now in general that is the only way to do it, and so Scott's implication that we always suck compared to other databases is really an overstatement. Ask another database to do a standard deviation calculation, for instance, and it'll be just as slow. However there are special cases that other DBs can optimize that we don't even try to. The big ones are:

* COUNT(*) across a whole table --- most non-MVCC databases keep tabs of the physical number of rows in the table, and so they can answer this very quickly. Postgres doesn't keep such a count, and under MVCC rules it wouldn't necessarily be the right answer if we had it. (BTW, a count of rows satisfying a particular condition is a different ballgame entirely; in most cases that can't be optimized at all, AFAIK.) If you are willing to accept approximate answers there are various tricks you can use --- see the archives --- but we don't get to fudge on COUNT(*) itself because it's in the SQL standard.

* MIN or MAX of an indexed column --- most DBs can use an index scan to find such a row relatively quickly, although whether this trick works or not depends a whole lot on whether you have WHERE or GROUP BY and just what those conditions look like. You can fake the min/max answer in Postgres by doing the transformation to an indexable query by hand; for instance, instead of MAX(col) do

    SELECT col FROM tab ORDER BY col DESC LIMIT 1;

There are periodic discussions in the hackers list about teaching the planner to do that automatically, and it will probably happen someday; but it's a complicated task and not exceedingly high on the priority list.

regards, tom lane
Have you posted your postgresql config files for the folks here to review? I can't say I can help you with that, because I too can only read the docs and go from there. But for specific situations you need specific configs. I would think you can get more out of postgresql with some time and help from the people around here. Though count(*) looks like it may be slow.
Quoting Scott Marlowe <smarlowe@g2switchworks.com>:
> On Fri, 2005-04-08 at 12:08, Joel Fradkin wrote:
> > I might have to add a button to do the count on command so they don't get the hit. I would want it to return the count of the condition, not the currently displayed number of rows.
>
> Judging postgresql on one single data point (count(*) performance) is quite unfair. Unless your system only operates on static data and is used to mostly do things like counting, in which case, why are you using a database?

For the general discussion of slowness of count(*), and given no entry on the subject in http://www.postgresql.org/docs/faqs.FAQ.html ... I guess everyone has to be pointed at:
http://archives.postgresql.org/pgsql-hackers/2005-01/msg00247.php

However, the gist of this person's problem is that an adhoc query, NOT just a 'select count(*) from table', can take remarkably long. Again, the problem is that PG can't just scan an index.

----------------------

One workaround for this is to use EXPLAIN. THIS APPLIES TO CASES WHERE THE USER HAS WIDE LIBERTY IN QUERIES. It's pointless overhead, otherwise.

default_statistics_target is cranked up to 200 on all such tables, and pg_autovacuum is running. (If there were anything to improve, it would be refining the thresholds on this.)

If the "(cost...rows=nnnn" string returns a number higher than the QUERY row limit, the user is derailed ("That's not specific enough to answer immediately; do you want an emailed report?").

Otherwise, it runs EXPLAIN ANALYZE, which is still faster than the query itself. If the "(actual...rows=nnnn...)" is higher than the RESULT row limit (PAGE limit), it then runs the query with the PAGE rows offset and limit --- and happily, practically everything that query needs is now in shared_buffers. The count from the EXPLAIN ANALYZE is displayed in the web page.

--
"Dreams come true, not free." -- S.Sondheim, ITW
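
A sketch of what that gating looks like in practice; the plan text shown is made up, and the application simply pulls the rows= figure out of the EXPLAIN output:

    EXPLAIN SELECT * FROM tblassoc WHERE clientnum = 'WAZ' AND isdeleted IS FALSE;
    --                     QUERY PLAN (illustrative output)
    -- Seq Scan on tblassoc  (cost=0.00..1042.50 rows=470 width=120)
    --   Filter: ((clientnum = 'WAZ') AND (isdeleted IS FALSE))
    --
    -- rows=470 is the planner's estimate: if it exceeds the query threshold,
    -- refuse or queue the request; otherwise run EXPLAIN ANALYZE (or count(*))
    -- and display the actual row count.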
I have asked specific questions and paid attention to the various threads on configuration.
I will take my config files and post them on the performance list; that is a good suggestion (personally I have more faith in this forum than in a paid consultant, but at this point I am willing to try both).
Thanks again.
The count thing I can get around using stored results and on demand counting, but some of my statistical reporting is just a must have.
I enclosed one of my views, I realize to get help I should also include tables and indexes etc, and maybe I will do that.
It is just there are so many of them. This one in particular did not run at all when I first got my data loaded.
I ended up adding a few indexes and not sure what else and got it to run faster on postgres.
Now it is running horribly, so I am back to the drawing board; I change one thing and something else breaks.
I am just frustrated, maybe Monday I will have better strength to figure it all out.
Joel Fradkin
Attachment
I will also look at doing it the way you describe; they do have wide liberty. Thanks so much for the ideas. Sorry I did not do a perusal of the archives first (I normally try that, but I think I am brain dead today).

Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
jfradkin@wazagua.com
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments.

-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Mischa Sandberg
Sent: Friday, April 08, 2005 2:40 PM
To: Scott Marlowe
Cc: pgsql-sql@postgresql.org
Subject: Re: [SQL] getting count for a specific query

Quoting Scott Marlowe <smarlowe@g2switchworks.com>:
> On Fri, 2005-04-08 at 12:08, Joel Fradkin wrote:
> > I might have to add a button to do the count on command so they don't get the hit. I would want it to return the count of the condition, not the currently displayed number of rows.
>
> Judging postgresql on one single data point (count(*) performance) is quite unfair. Unless your system only operates on static data and is used to mostly do things like counting, in which case, why are you using a database?

For the general discussion of slowness of count(*), and given no entry on the subject in http://www.postgresql.org/docs/faqs.FAQ.html ... I guess everyone has to be pointed at:
http://archives.postgresql.org/pgsql-hackers/2005-01/msg00247.php

However, the gist of this person's problem is that an adhoc query, NOT just a 'select count(*) from table', can take remarkably long. Again, the problem is that PG can't just scan an index.

----------------------

One workaround for this is to use EXPLAIN. THIS APPLIES TO CASES WHERE THE USER HAS WIDE LIBERTY IN QUERIES. It's pointless overhead, otherwise.

default_statistics_target is cranked up to 200 on all such tables, and pg_autovacuum is running. (If there were anything to improve, it would be refining the thresholds on this.)

If the "(cost...rows=nnnn" string returns a number higher than the QUERY row limit, the user is derailed ("That's not specific enough to answer immediately; do you want an emailed report?").

Otherwise, it runs EXPLAIN ANALYZE, which is still faster than the query itself. If the "(actual...rows=nnnn...)" is higher than the RESULT row limit (PAGE limit), it then runs the query with the PAGE rows offset and limit --- and happily, practically everything that query needs is now in shared_buffers. The count from the EXPLAIN ANALYZE is displayed in the web page.

--
"Dreams come true, not free." -- S.Sondheim, ITW
On Apr 8, 2005, at 3:23 PM, Joel Fradkin wrote:
> I set up the data on 4 10k scsi drives in a powervault and my wal on 2 15k drives. I am using links to those from the install directory. It starts and stops ok this way, but maybe it should be different.

Your problem might just be the choice of using a Dell RAID controller. I have a 1 year old box connected to a 14 disk powervault (PowerEdge 2650) and it is dog slow compared to a dual opteron with 8 disks that is replacing it. It is all I/O for me, and the Dells just are not known for speedy I/O.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
On Fri, 2005-04-08 at 15:23, Vivek Khera wrote:
> On Apr 8, 2005, at 3:23 PM, Joel Fradkin wrote:
> > I set up the data on 4 10k scsi drives in a powervault and my wal on 2 15k drives. I am using links to those from the install directory. It starts and stops ok this way, but maybe it should be different.
>
> Your problem might just be the choice of using a Dell RAID controller. I have a 1 year old box connected to a 14 disk powervault (PowerEdge 2650) and it is dog slow compared to a dual opteron with 8 disks that is replacing it. It is all I/O for me, and the dell's just are not known for speedy I/O.

Note that there are several different RAID controllers you can get with a DELL. I had good luck with the PERC 4C (AMI MegaRAID based) at my last job. In a dual 2400 MHz machine with 2 gigs ram, it handily outran a 4-way (about 1200 MHz CPUs) windows / MSSQL box with 4 gigs of ram at most tasks, especially inserts / updates. The windows machine had the more generic PERC 3i type controller in it.
On Apr 8, 2005, at 4:31 PM, Scott Marlowe wrote:
> Note that there are several different RAID controllers you can get with a DELL. I had good luck with the PERC 4C (AMI MegaRAID based) at my

I've had bad luck regarding speed with *all* of them, AMI MegaRAID and Adaptec based ones, under high load. Under moderate to low load they're acceptable.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
On Apr 8, 2005, at 4:35 PM, Bob Henkel wrote:
> desktop SATA drive with no RAID? I'm by any means as knowledgeable about I/O setup as many of you are but my 2 cents wonders if the Dell RAID is really that much slower than a competitively priced/speced alternative? Would Joel's problems just fade away if he wasn't using a Dell RAID?

"Dell RAID" is not one thing. They sell "altered" RAID cards from Adaptec and LSI. Whatever alteration they do to them tends to make them run not so fast.

I have a Dell SATA RAID (adaptec based) on the office server and it is OK, though not something I'd buy again.

I have various PERC 3 and PERC 4 RAID controllers on my servers (SCSI based) and they suck under heavy I/O load.

I wonder why the name-brand LSI cards work so much faster... perhaps it is the motherboard? I don't know, and I don't care... :-) For me, high performance DB and Dell servers are mutually exclusive.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
On Fri, 2005-04-08 at 15:35, Bob Henkel wrote:
> On Apr 8, 2005 3:23 PM, Vivek Khera <vivek@khera.org> wrote:
> > On Apr 8, 2005, at 3:23 PM, Joel Fradkin wrote:
> > > I set up the data on 4 10k scsi drives in a powervault and my wal on 2 15k drives. I am using links to those from the install directory. It starts and stops ok this way, but maybe it should be different.
> >
> > Your problem might just be the choice of using a Dell RAID controller. I have a 1 year old box connected to a 14 disk powervault (PowerEdge 2650) and it is dog slow compared to a dual opteron with 8 disks that is replacing it. It is all I/O for me, and the dell's just are not known for speedy I/O.
> >
> > Vivek Khera, Ph.D.
> > +1-301-869-4449 x806
>
> But that is relative I would think. Is the Dell RAID much faster than my desktop SATA drive with no RAID? I'm by any means as knowledgeable about I/O setup as many of you are but my 2 cents wonders if the Dell RAID is really that much slower than a competitively priced/speced alternative? Would Joel's problems just fade away if he wasn't using a Dell RAID?

My experience with the 3i controllers (see my earlier post) was that my old Pentium Pro 200x2 machine with 512 meg ram and a generic Ultra Wide SCSI card and a half dozen drives running software RAID 5 was faster. Seriously. So was my P-II-350 desktop with the same controller, and an older dual P-III-750 with only UltraSCSI running in a RAID-1 mirror set.

The 3i is REALLY slow (or at least WAS slow) under linux.
On Fri, 2005-04-08 at 15:36, Vivek Khera wrote:
> On Apr 8, 2005, at 4:31 PM, Scott Marlowe wrote:
> > Note that there are several different RAID controllers you can get with a DELL. I had good luck with the PERC 4C (AMI MegaRAID based) at my
>
> I've had bad luck regarding speed with *all* of them, AMI MegaRAID and Adaptec based ones, under high load. Under moderate to low load they're acceptable.

Were you using battery backed cache with write-back enabled on the 4C? The 3C is also megaraid based, but it's a pretty old design and not very fast.
Interesting... Maybe after a weekend of rest Joel can try it on a different setup, even if that different setup is just a power user's development machine, to see if he gets the same or worse timing results. It would be wonderful if it magically sped up.
On Fri, 2005-04-08 at 15:41, Vivek Khera wrote:
> On Apr 8, 2005, at 4:35 PM, Bob Henkel wrote:
> > desktop SATA drive with no RAID? I'm by any means as knowledgeable about I/O setup as many of you are but my 2 cents wonders if the Dell RAID is really that much slower than a competitively priced/speced alternative? Would Joel's problems just fade away if he wasn't using a Dell RAID?
>
> "Dell RAID" is not one thing. They sell "altered" RAID cards from Adaptec and LSI. Whatever alteration they do to them tends to make them run not so fast.
>
> I have a Dell SATA RAID (adaptec based) on the office server and it is OK, though not something I'd buy again.
>
> I have various PERC 3 and PERC 4 RAID controllers on my servers (SCSI based) and they suck under heavy I/O load.
>
> I wonder why the name-brand LSI cards work so much faster... perhaps it is the motherboard? I don't know, and I don't care... :-) For me, high performance DB and Dell servers are mutually exclusive.

It would be nice to be able to put a stock AMI MegaRAID in one and see.

Do you run your 2650s with hyperthreading on? I found that slowed mine down under load, but we never had more than a couple dozen users hitting the db at once, so we may well have had a different load profile than what you're seeing.
I turned off hyperthreading (I saw on the list that it did not help on Linux). I am using a pretty lightweight windows box: an Optiplex with IDE, 750-meg internal, 2.4 GHz CPU. My desktop has 2 gig, so it might not be a bad idea to try it locally (I have it installed), but methinks it's not totally a hardware issue for us.

Joel Fradkin

> Do you run your 2650s with hyperthreading on? I found that slowed mine down under load, but we never had more than a couple dozen users hitting the db at once, so we may well have had a different load profile than what you're seeing.
> Since it is a count of matched condition records I may not have a way around.

What you could do is cache the search results (just caching the ids of the rows to display is enough and uses little space) in a cache table, numbering them with your sort order using a temporary sequence, so that you can:

    SELECT ... FROM cache WHERE row_position BETWEEN page_no*per_page AND (page_no+1)*per_page-1

and to get the count:

    SELECT row_position FROM cache ORDER BY row_position DESC LIMIT 1

Add a session_id referencing your sessions table with an ON DELETE CASCADE, and the cache will be auto-purged when sessions expire.
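
A sketch of that cache-table idea with invented names (the sessions table, session 42, associateid as the cached row id, and the sort column are all assumptions):

    CREATE TABLE search_cache (
        session_id   integer REFERENCES sessions(session_id) ON DELETE CASCADE,
        row_position integer NOT NULL,
        assoc_id     integer NOT NULL,
        PRIMARY KEY (session_id, row_position)
    );

    CREATE TEMPORARY SEQUENCE cache_pos;

    -- run the expensive search once, numbering rows in display order
    INSERT INTO search_cache (session_id, row_position, assoc_id)
    SELECT 42, nextval('cache_pos'), associateid
      FROM (SELECT associateid
              FROM tblassoc
             WHERE clientnum = 'WAZ' AND isdeleted IS FALSE
             ORDER BY firstname) AS sorted;

    -- page 2 (rows 51-100) comes straight out of the cache
    SELECT assoc_id FROM search_cache
     WHERE session_id = 42 AND row_position BETWEEN 51 AND 100
     ORDER BY row_position;

    -- the total count is simply the highest position cached
    SELECT row_position FROM search_cache
     WHERE session_id = 42
     ORDER BY row_position DESC LIMIT 1;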
Please run this disk throughput test on your system:
http://boutiquenumerique.com/pf/multi_io.py

It just spawns N threads which will write a lot of data simultaneously to the disk, then measures the total time. Same for read. Modify the parameters in the source... it's set to generate 10G of files in the current directory and re-read them, all with 8 threads.

How much I/O do you get?

Also hdparm -t /dev/hd? would be interesting.

On Fri, 08 Apr 2005 21:51:02 +0200, Joel Fradkin <jfradkin@wazagua.com> wrote:
> I will also look at doing it the way you describe, they do have wide liberty. Thanks so much for the ideas. Sorry I did not do a perusal of the archives first (I normally try that, but think I am brain dead today).
>
> Joel Fradkin
> Wazagua, Inc.
> 2520 Trailmate Dr
> Sarasota, Florida 34243
> Tel. 941-753-7111 ext 305
> jfradkin@wazagua.com
> www.wazagua.com
> Powered by Wazagua
> Providing you with the latest Web-based technology & advanced tools.
> C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
> This email message is for the use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and delete and destroy all copies of the original message, including attachments.
>
> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]
> On Behalf Of Mischa Sandberg
> Sent: Friday, April 08, 2005 2:40 PM
> To: Scott Marlowe
> Cc: pgsql-sql@postgresql.org
> Subject: Re: [SQL] getting count for a specific query
>
> Quoting Scott Marlowe <smarlowe@g2switchworks.com>:
>
>> On Fri, 2005-04-08 at 12:08, Joel Fradkin wrote:
>> > I might have to add a button to do the count on command so they don't get the hit. I would want it to return the count of the condition, not the currently displayed number of rows.
>>
>> Judging postgresql on one single data point (count(*) performance) is quite unfair. Unless your system only operates on static data and is used to mostly do things like counting, in which case, why are you using a database?
>
> For the general discussion of slowness of count(*), and given no entry on the subject in http://www.postgresql.org/docs/faqs.FAQ.html ... I guess everyone has to be pointed at:
> http://archives.postgresql.org/pgsql-hackers/2005-01/msg00247.php
>
> However, the gist of this person's problem is that an adhoc query, NOT just a 'select count(*) from table', can take remarkably long. Again, the problem is that PG can't just scan an index.
> ----------------------
> One workaround for this is to use EXPLAIN. THIS APPLIES TO CASES WHERE THE USER HAS WIDE LIBERTY IN QUERIES. It's pointless overhead, otherwise.
>
> default_statistics_target is cranked up to 200 on all such tables, and pg_autovacuum is running. (If there were anything to improve, it would be refining the thresholds on this).
>
> If the "(cost...rows=nnnn" string returns a number higher than the QUERY row limit, the user is derailed ("That's not specific enough to answer immediately; do you want an emailed report?").
>
> Otherwise, it runs EXPLAIN ANALYZE, which is still faster than the query itself. If the "(actual...rows=nnnn...)" is higher than the RESULT row limit (PAGE limit), it then runs the query, with the PAGE rows offset and limit --- and happily, practically everything that query needs is now in shared_buffers. The count from the EXPLAIN ANALYZE is displayed in the web page.
On Apr 8, 2005, at 3:37 PM, Joel Fradkin wrote:
> I don't think my clients would like me to aprox as it is a count of
> their
> records. What I plan on doing assuming I can get all my other problems
> fixed
> (as mentioned I am going to try and get paid help to see if I goofed
> it up
> some where) is make the count a button, so they don't wait everytime,
> but
> can choose to wait if need be, maybe I can store the last count with a
> count
> on day for the generic search it defaults to, and just have them do a
> count
> on demand if they have a specific query. Our screens have several
> criteria
> fields in each application.
Here is an interface idea I'm working on for displaying query results
in PostgreSQL. Maybe it will work for you if your connection method
does not prevent you from using cursors. I create a cursor and then
fetch the first 1000 rows. The status display has 4 paging buttons,
something like this:
|< < rows 1 - 1000 of ? > >|
The user can hit the "next" button to get the next 1000. If less than
1000 are fetched the ? is replaced with the actual count. They can
press the "last" button to move to the end of the cursor and get the
actual count if they need it. So here the initial query should be fast,
the user can get the count if they need it, and you don't have to
re-query using limit and offset.
The problem I'm looking into now (which I just posted on the general
list) is I don't see a way to get the table and column information from
a cursor. If I fetch from a cursor, the table OID and column number
values are 0 in the row description. If I execute the same query
directly without a cursor, the row description has the correct values
for table OID and column number.
John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
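
Roughly, in SQL terms, the paging John describes could look like the sketch below; the cursor name, ordering, and 1000-row page are assumptions, and keeping the cursor alive between web requests (WITH HOLD or a held-open transaction) is the part left unsolved here:

    BEGIN;
    DECLARE assoc_cur SCROLL CURSOR FOR
        SELECT * FROM tblassoc
         WHERE clientnum = 'WAZ' AND isdeleted IS FALSE
         ORDER BY firstname;

    FETCH FORWARD 1000 FROM assoc_cur;   -- rows 1-1000; fewer rows back means that IS the count
    FETCH FORWARD 1000 FROM assoc_cur;   -- "next" button: rows 1001-2000

    MOVE FORWARD ALL IN assoc_cur;       -- "last" button: the MOVE command tag reports how many
                                         -- rows were skipped, which yields the total count
    COMMIT;                              -- closes the cursor unless it was declared WITH HOLD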
Oracle Forms uses a similar method to the one you described, and it works just fine. It will say Record 1 of ? (but I think the developer can set the amount of records cached, so that if you set it to 10 and queried 5 records it would say Record 1 of 5, because it would be under the cache amount). Forms also offers a button that says get hit count. So if you really need to know the record count you can get it without moving off the current record.
On Apr 9, 2005, at 11:43 AM, Bob Henkel wrote:
> Forms also offers a button that says get hit count. So if you really need to know the record count you can get it without moving off the current record.

That's a good idea too. Maybe in my interface you could click on the ? to get the count without changing the rows you are viewing.

John DeSoi, Ph.D.
http://pgedit.com/
Power Tools for PostgreSQL
On Apr 8, 2005, at 4:50 PM, Scott Marlowe wrote:
> Do you run your 2650s with hyperthreading on? I found that slowed mine down under load, but we never had more than a couple dozen users hitting the db at once, so we may well have had a different load profile than what you're seeing.

Yep. Turned off as per various recommendations on this list. The RAID card on this box is a PERC 3/DC. It is a very big disappointment. The Opteron based generic system totally outperforms this Dell box.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
On Tue, 2005-04-12 at 14:29, Vivek Khera wrote:
> On Apr 8, 2005, at 4:50 PM, Scott Marlowe wrote:
> > Do you run your 2650s with hyperthreading on? I found that slowed mine down under load, but we never had more than a couple dozen users hitting the db at once, so we may well have had a different load profile than what you're seeing.
>
> Yep. Turned off as per various recommendations on this list. The RAID card on this box is a PERC 3/DC. It is a very big disappointment. The Opteron based generic system totally outperforms this Dell box.

How much memory is in the box? I've heard horror stories about performance with >2 gigs of ram, which is why I made them order mine with 2 gigs. Does the 3/DC have battery backed cache set to write back?
On Apr 12, 2005, at 4:23 PM, Scott Marlowe wrote:
> How much memory is in the box? I've heard horror stories about performance with >2 gigs of ram, which is why I made them order mine with 2 gigs. Does the 3/DC have battery backed cache set to write back?

4GB RAM and battery backed cache set to write-back mode. FreeBSD 4.11.

Vivek Khera, Ph.D.
+1-301-869-4449 x806
On Tue, 2005-04-12 at 15:32, Vivek Khera wrote:
> On Apr 12, 2005, at 4:23 PM, Scott Marlowe wrote:
> > How much memory is in the box? I've heard horror stories about performance with >2 gigs of ram, which is why I made them order mine with 2 gigs. Does the 3/DC have battery backed cache set to write back?
>
> 4GB RAM and battery backed cache set to write-back mode. FreeBSD 4.11.

If you've got the time, try running it with only 2 gigs and compare the speed. I never really got a chance to run mine with >2 gigs, but I know that I read plenty of posts at the time that the chipset in the 2650 was REALLY slow at using memory over 2 gig.