Re: Is There Any Way .... - Mailing list pgsql-performance

From: Alan Stange
Subject: Re: Is There Any Way ....
Msg-id: 435D8151.8000200@rentec.com
In response to: Re: Is There Any Way ....  (Alex Turner <armtuk@gmail.com>)
List: pgsql-performance
Alex Turner wrote:
> This is possible with Oracle utilizing the keep pool
>
> alter table t_name storage ( buffer_pool keep);
>
> If Postgres were to implement its own caching system, this seems like
> it would be easy to implement (beyond the initial caching effort).
>
> Alex
>
>
> On 10/24/05, Craig A. James <cjames@modgraph-usa.com> wrote:
>
>> "Jim C. Nasby" <jnasby ( at ) pervasive ( dot ) com> wrote:
>>
>>>> Stefan Weiss wrote:
>>>> ... IMO it would be useful to have a way to tell
>>>> PG that some tables were needed frequently, and should be cached if
>>>> possible. This would allow application developers to consider joins with
>>>> these tables as "cheap", even when querying on columns that are
>>>> not indexed.
>>>>
>>> Why do you think you'll know better than the database how frequently
>>> something is used? At best, your guess will be correct and PostgreSQL
>>> (or the kernel) will keep the table in memory. Or, your guess is wrong
>>> and you end up wasting memory that could have been used for something
>>> else.
>>>
>>> It would probably be better if you describe why you want to force this
>>> table (or tables) into memory, so we can point you at more appropriate
>>> solutions.
>>>
>> Or perhaps we could explain why we NEED to force these tables into
>> memory, so we can point you at a more appropriate implementation. ;-)
>>
>> Ok, wittiness aside, here's a concrete example.  I have an application
>> with one critical index that MUST remain in memory at all times.  The
>> index's tablespace is about 2 GB.  As long as it's in memory, performance
>> is excellent - a user's query takes a fraction of a second.  But if it
>> gets swapped out, the user's query might take up to five minutes as the
>> index is re-read from disk.
>>
>> Now here's the rub.  The only performance I care about is response to
>> queries from the web application.  Everything else is low priority.  But
>> there is other activity going on.  Suppose, for example, that I'm
>> updating tables, performing queries, doing administration, etc., etc.,
>> for a period of an hour, during which no customer visits the site.  Then
>> another customer comes along and performs a query.
>>
>> At this point, no heuristic in the world could have guessed that I DON'T
>> CARE ABOUT PERFORMANCE for anything except my web application.  The
>> performance of all the other stuff, the administration, the updates,
>> etc., is utterly irrelevant compared to the performance of the
>> customer's query.
>>
>> What actually happens is that the other activities have swapped out the
>> critical index, and my customer waits, and waits, and waits... and goes
>> away after a minute or two.  To solve this, we've been forced to purchase
>> two computers, and mirror the database on both.  All administration and
>> modification happens on the "offline" database, and the web application
>> only uses the "online" database.  At some point, we swap the two servers,
>> sync the two databases, and carry on.  It's a very unsatisfactory
>> solution.
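
The keep-pool suggestion quoted at the top only helps once the pool itself
has been sized.  A hedged sketch, assuming Oracle 9i-style parameters; the
index name "critical_idx" is hypothetical, and the 2G figure simply echoes
the 2 GB tablespace described above:

    -- Reserve a dedicated KEEP buffer pool, then assign the hot index to
    -- it, so that other activity competes only for the default pool.
    ALTER SYSTEM SET db_keep_cache_size = 2G;
    ALTER INDEX critical_idx STORAGE (buffer_pool KEEP);
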
We have a similar problem with vacuum being the equivalent of
"continuously flush all system caches for a long time".  Our database is
about 200 GB in size and vacuums take hours and hours.  The performance
is still acceptable, but only because we've hidden the latency in our
application.

I've occasionally thought it would be good to have a backend that is
running a vacuum or analyze call priocntl() before doing any real work,
to lower its priority.  We'll be switching to the 8.1 release ASAP just
because the direct IO capabilities appear to be a win on our development
system.
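
The cost-based vacuum delay settings that appeared in 8.0 aim at the same
goal from inside the database rather than through the OS scheduler.  A
minimal sketch; the settings are real, but "big_table" and the values are
illustrative rather than tuned:

    -- Throttle this session's vacuum so it does its I/O in small budgeted
    -- bursts with sleeps in between, evicting less of the cache per second.
    SET vacuum_cost_delay = 20;   -- ms to sleep each time the limit is hit
    SET vacuum_cost_limit = 200;  -- page-cost budget accumulated between sleeps
    VACUUM ANALYZE big_table;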

-- Alan

