Thread: Very slow queries on 8.1

Very slow queries on 8.1

From: David Rysdam
I'm porting an application from Sybase and I've noticed that similar
application functions take 2 to 3 times longer on postgres than they
used to on the same machine running under Sybase.  I've tried changing
various "performance tuning" parameters, such as shared_buffers,
effective_cache_size, etc., but there's little or no effect.  I'm
beginning to think there's a deeper root cause to the slowness.

Right now, I'm working on a test case that involves a table with ~360k
rows called "nb.sigs".  My sample query is:

select * from nb.sigs where signum > 250000

With no index, explain says this query costs 11341.  After CREATE INDEX
on the signum field, along with an ANALYZE for nb.sigs, the query costs
3456 and takes around 4 seconds to return the first row.  This seems
extremely slow to me, but I can't figure out what I might be doing
wrong.  Any ideas?

(If necessary, I can write an entire script that creates and populates a
table and then give my performance on that sample for someone else to
check against.)

Re: Very slow queries on 8.1

From: "codeWarrior"
What is the data type for "signum"?



"David Rysdam" <drysdam@ll.mit.edu> wrote in message
news:437CA40A.8020507@ll.mit.edu...
> I'm porting an application from Sybase and I've noticed that similar
> application functions take 2 to 3 times longer on postgres than they used
> to on the same machine running under Sybase.  I've tried changing various
> "performance tuning" parameters, such as shared_buffers,
> effective_cache_size, etc but there's little or no effect.  I'm beginning
> to think there's a deeper root cause to the slowness.
>
> Right now, I'm working on a test case that involves a table with ~360k
> rows called "nb.sigs".  My sample query is:
>
> select * from nb.sigs where signum > 250000
>
> With no index, explain says this query costs 11341.  After CREATE INDEX on
> the signum field, along with an ANALYZE for nb.sigs, the query costs 3456
> and takes around 4 seconds to return the first row.  This seems extremely
> slow to me, but I can't figure out what I might be doing wrong.  Any
> ideas?
>
> (If necessary, I can write an entire script that creates and populates a
> table and then give my performance on that sample for someone else to
> check against.)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>



Re: Very slow queries on 8.1

From: Tom Lane
David Rysdam <drysdam@ll.mit.edu> writes:
> Right now, I'm working on a test case that involves a table with ~360k
> rows called "nb.sigs".  My sample query is:
> select * from nb.sigs where signum > 250000
> With no index, explain says this query costs 11341.  After CREATE INDEX
> on the signum field, along with an ANALYZE for nb.sigs, the query costs
> 3456 and takes around 4 seconds to return the first row.  This seems
> extremely slow to me, but I can't figure out what I might be doing
> wrong.  Any ideas?

How many rows does that actually return, and what client interface are
you fetching it with?  libpq, at least, likes to fetch the entire query
result before it gives it to you --- so you're talking about 4 sec to
get all the rows, not only the first one.  That might be reasonable if
you're fetching 100k rows via an indexscan...

            regards, tom lane

Re: Very slow queries on 8.1

From: David Rysdam
It's int4, not null, and the index is unique.  I even tried clustering
on it, to no avail.

codeWarrior wrote:

>What is the data type for "signum"?


Re: Very slow queries on 8.1

From: David Rysdam
Tom Lane wrote:

>How many rows does that actually return, and what client interface are
>you fetching it with?  libpq, at least, likes to fetch the entire query
>result before it gives it to you --- so you're talking about 4 sec to
>get all the rows, not only the first one.  That might be reasonable if
>you're fetching 100k rows via an indexscan...

Right, it's about 100k rows and it is through libpq (pgadmin in this
case, but my app uses libpq from pgtcl).  Is there a way to tell libpq
to not do what it "likes" and do what I need instead?  I didn't see
anything in the docs, but I didn't look very hard.

Re: Very slow queries on 8.1

From: Tom Lane
David Rysdam <drysdam@ll.mit.edu> writes:
> Right, it's about 100k rows and it is through libpq (pgadmin in this
> case, but my app uses libpq from pgtcl).  Is there a way to tell libpq
> to not do what it "likes" and do what I need instead?

The only way ATM is to declare a cursor on the query and then FETCH
whatever amount seems reasonable at a time.
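
For illustration, a minimal libpq sketch of that cursor-plus-FETCH
pattern (this code is not from the original message; the connection
string, cursor name, and batch size of 1000 are assumptions):

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn   *conn;
    PGresult *res;

    conn = PQconnectdb("dbname=test");          /* hypothetical DSN */
    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        return 1;
    }

    /* Cursors only live inside a transaction block. */
    res = PQexec(conn, "BEGIN");
    PQclear(res);
    res = PQexec(conn, "DECLARE sigs_cur CURSOR FOR "
                       "SELECT * FROM nb.sigs WHERE signum > 250000");
    PQclear(res);

    for (;;)
    {
        int i, n;

        /* Pull 1000 rows at a time instead of the whole result set. */
        res = PQexec(conn, "FETCH 1000 FROM sigs_cur");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            PQclear(res);
            break;
        }
        n = PQntuples(res);
        for (i = 0; i < n; i++)
            printf("%s\n", PQgetvalue(res, i, 0));  /* first column only */
        PQclear(res);
        if (n == 0)
            break;                      /* zero rows: cursor exhausted */
    }

    res = PQexec(conn, "CLOSE sigs_cur");
    PQclear(res);
    res = PQexec(conn, "COMMIT");
    PQclear(res);
    PQfinish(conn);
    return 0;
}
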

There is support in the on-the-wire protocol for partial fetches from
ordinary queries (ie, without the notational overhead of creating a
cursor).  libpq doesn't expose that at the moment.  There's a thread
right now in pgsql-interfaces about adding such a feature to libpq ...

            regards, tom lane

Re: Very slow queries on 8.1

From: Bruno Wolff III
On Thu, Nov 17, 2005 at 11:31:27 -0500,
  David Rysdam <drysdam@ll.mit.edu> wrote:
> Right, it's about 100k rows and it is through libpq (pgadmin in this
> case, but my app uses libpq from pgtcl).  Is there a way to tell libpq
> to not do what it "likes" and do what I need instead?  I didn't see
> anything in the docs, but I didn't look very hard.

You could use a cursor. That will bias the plan toward fast-start plans,
which might give you lower throughput if you are normally planning to fetch
all of the rows, but will give you quicker access to the first row.

Re: Very slow queries on 8.1

From: Martijn van Oosterhout
On Thu, Nov 17, 2005 at 11:31:27AM -0500, David Rysdam wrote:
> Right, it's about 100k rows and it is through libpq (pgadmin in this
> case, but my app uses libpq from pgtcl).  Is there a way to tell libpq
> to not do what it "likes" and do what I need instead?  I didn't see
> anything in the docs, but I didn't look very hard.

Use the async interface. There you submit the query and retrieve rows
as they come in. It's a bit trickier to program but it can be done.
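
As a rough sketch of the asynchronous interface (again not from the
original message; the connection string is an assumption): submit the
query with PQsendQuery, then poll with PQconsumeInput/PQisBusy so the
application never blocks.  Note that PQgetResult still hands back each
result as a complete PGresult; the gain is responsiveness while the
data arrives, not row-at-a-time delivery.

#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn   *conn;
    PGresult *res;

    conn = PQconnectdb("dbname=test");          /* hypothetical DSN */
    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    /* Submit the query without blocking. */
    if (!PQsendQuery(conn, "SELECT * FROM nb.sigs WHERE signum > 250000"))
    {
        fprintf(stderr, "%s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* Poll until the result is ready.  Real code would select() on
       PQsocket(conn) and interleave other work in this loop. */
    while (PQisBusy(conn))
    {
        if (!PQconsumeInput(conn))      /* absorb whatever has arrived */
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            break;
        }
    }

    /* Drain all results (NULL means the query is done). */
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_TUPLES_OK)
            printf("%d rows\n", PQntuples(res));
        PQclear(res);
    }

    PQfinish(conn);
    return 0;
}
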

psql doesn't do this, though; it's not clear how it could anyway, given
the way it formats its output.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.


Re: Very slow queries on 8.1

From: David Rysdam
Martijn van Oosterhout wrote:

>Use the async interface. There you submit the query and retrieve rows
>as they come in. It's a bit trickier to program but it can be done.

I'm experimenting with the async interface right now.  Hopefully it will
fit in well.

It's OK if psql/pgadmin don't do it, as long as the app does.

Re: Very slow queries on 8.1

From: David Rysdam
Bruno Wolff III wrote:

>You could use a cursor. That will bias the plan toward fast-start plans,
>which might give you lower throughput if you are normally planning to fetch
>all of the rows, but will give you quicker access to the first row.

That is exactly what I want, but is it possible to use a cursor from pgtcl?

Re: Very slow queries on 8.1

From: Michael Fuhr
On Thu, Nov 17, 2005 at 10:38:50AM -0500, David Rysdam wrote:
> I'm porting an application from Sybase and I've noticed that similar
> application functions take 2 to 3 times longer on postgres than they
> used to on the same machine running under Sybase.  I've tried changing
> various "performance tuning" parameters, such as shared_buffers,
>effective_cache_size, etc., but there's little or no effect.

What's your OS and hardware environment and what are your non-default
PostgreSQL settings?

> Right now, I'm working on a test case that involves a table with ~360k
> rows called "nb.sigs".  My sample query is:
>
> select * from nb.sigs where signum > 250000
>
> With no index, explain says this query costs 11341.  After CREATE INDEX
> on the signum field, along with an ANALYZE for nb.sigs, the query costs
> 3456 and takes around 4 seconds to return the first row.

Please post the EXPLAIN ANALYZE output for the query -- that'll
show us the query plan, the accuracy of the planner's row count
estimate, and how long the query takes to execute on the server.
It might also be useful to see the table definition and the output
of the following query:

SELECT null_frac, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'nb' AND tablename = 'sigs' AND attname = 'signum';

What client interface are you using?  If the query returns a lot
of rows then you might benefit from using a cursor to fetch the
result set in chunks; otherwise the client library is probably
waiting for the entire result set to arrive before returning any
rows to you.  If the result set is large then you can experience
performance problems due to a shortage of real memory.

How volatile is the data and how common are queries based on signum?
You might benefit from clustering on the signum index.

> (If necessary, I can write an entire script that creates and populates a
> table and then give my performance on that sample for someone else to
> check against.)

If it's a short script that populates the table with canned data
then go ahead and post it.

--
Michael Fuhr

Re: Very slow queries on 8.1

From: David Rysdam
Michael Fuhr wrote:

>What's your OS and hardware environment and what are your non-default
>PostgreSQL settings?

Fedora Core 2, dual 2.8 GHz CPUs, 2 GB RAM.

shared_buffers = 10000
effective_cache_size = 100000

>Please post the EXPLAIN ANALYZE output for the query -- that'll
>show us the query plan, the accuracy of the planner's row count
>estimate, and how long the query takes to execute on the server.
>It might also be useful to see the table definition and the output
>of the following query:
>
>SELECT null_frac, n_distinct, correlation
>FROM pg_stats
>WHERE schemaname = 'nb' AND tablename = 'sigs' AND attname = 'signum';

The first things would be hard to supply, since they live on a computer
that has no access to the Internet or to the machine I'm writing this
on.  As for the pg_stats query, it returns:

 null_frac | n_distinct | correlation
-----------+------------+-------------
         0 |         -1 |           1

>What client interface are you using?  If the query returns a lot
>of rows then you might benefit from using a cursor to fetch the
>result set in chunks; otherwise the client library is probably
>waiting for the entire result set to arrive before returning any
>rows to you.
That does seem to be the problem.  I've never worked with cursors, so
I'll have to see if I can fit our DB module into that mold.