Re: Performance Bottleneck - Mailing list pgsql-performance

From Martin Foster
Subject Re: Performance Bottleneck
Date
Msg-id tlUQc.32079$cd2.29211@clgrps12
Whole thread Raw
In response to Re: Performance Bottleneck  (Mike Benoit <ipso@snappymail.ca>)
Responses Re: Performance Bottleneck  (Mike Benoit <ipso@snappymail.ca>)
List pgsql-performance
Mike Benoit wrote:

> On Wed, 2004-08-04 at 17:25 +0200, Gaetano Mendola wrote:
>
>
>>>The queries themselves are simple, normally drawing information from one
>>>table with few conditions or in the most complex cases using joins on
>>>two table or sub queries.   These behave very well and always have, the
>>>problem is that these queries take place in rather large amounts due to
>>>the dumb nature of the scripts themselves.
>>
>>Show us the explain analyze on that queries, how many rows the tables are
>>containing, the table schema could be also usefull.
>>
>
>
> If the queries themselves are optimized as much as they can be, and as
> you say, its just the sheer amount of similar queries hitting the
> database, you could try using prepared queries for ones that are most
> often executed to eliminate some of the overhead.
>
> I've had relatively good success with this in the past, and it doesn't
> take very much code modification.
>

One of the biggest problems is most probably related to the indexes.
Since the performance penalty of logging the information needed to see
which queries are used and which are not is a slight problem, then I
cannot really make use of it for now.

However, I am curious how one would go about preparing query?   Is this
similar to the DBI::Prepare statement with placeholders and simply
changing the values passed on execute?  Or is this something database
level such as a view et cetera?

SELECT
  Post.PostIDNumber,
  Post.$format,
  Post.PuppeteerLogin,
  Post.PuppetName,
  Post.PostCmd,
  Post.PostClass
FROM Post
WHERE Post.PostIDNumber > ?::INT
   AND (Post.PostTo='all' OR Post.PostTo=?)
   AND (NOT EXISTS (SELECT PuppetIgnore.PuppetLogin
    FROM PuppetIgnore
    WHERE PuppetIgnore.PuppetIgnore='global'
      AND PuppetIgnore.PuppeteerLogin=?
      AND PuppetIgnore.PuppetLogin=Post.PuppeteerLogin)
    OR Post.PuppeteerLogin IS NULL)
   AND (NOT EXISTS (SELECT PuppetIgnore.PuppetName
    FROM PuppetIgnore
    WHERE PuppetIgnore.PuppetIgnore='single'
      AND PuppetIgnore.PuppeteerLogin=?
      AND PuppetIgnore.PuppetName=Post.PuppetName)
    OR Post.PuppetName IS NULL)
ORDER BY Post.PostIDNumber LIMIT 100

The range is determined from the previous run or through a query listed
below.   It was determined that using INT was far faster then limiting
by timestamp.

SELECT MIN(PostIDNumber)
FROM Post
WHERE RealmName=?
   AND PostClass IN ('general','play')
   AND PostTo='all'

The above simply provides a starting point, nothing more.   Once posts
are pulled the script will throw in the last pulled number as to start
from a fresh point.

Under MySQL time was an stored as an INT which may have helped it handle
timestamps more efficiently.    It also made use of three or more
queries, where two were done to generate an IN statement for the query
actually running at the time.

    Martin Foster
    Creator/Designer Ethereal Realms
    martin@ethereal-realms.org

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: The black art of postgresql.conf tweaking
Next
From: Gaetano Mendola
Date:
Subject: Re: Performance Bottleneck