Re: Caching of Queries - Mailing list pgsql-performance

From Aaron Werman
Subject Re: Caching of Queries
Date
Msg-id BAY18-DAV126nx8dyDd000801d8@hotmail.com
Whole thread Raw
In response to Caching of Queries  (Scott Kirkwood <scottakirkwood@gmail.com>)
Responses Re: Caching of Queries
List pgsql-performance
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Iain" <iain@mst.co.jp>
Cc: "Jim C. Nasby" <decibel@decibel.org>; <pgsql-performance@postgresql.org>
Sent: Monday, September 27, 2004 11:17 PM
Subject: Re: [PERFORM] Caching of Queries


> "Iain" <iain@mst.co.jp> writes:
> > I can only tell you (roughly) how it works wth Oracle,
>
> Which unfortunately has little to do with how it works with Postgres.
> This "latches" stuff is irrelevant to us.

Latches are the Oracle term for semaphores. Both Oracle and pg use
semaphores and spin locks to serialize activity in critical sections. I
believe that the point that blocking/queuing reduces scalability is valid.

>
> In practice, any repetitive planning in PG is going to be consulting
> catalog rows that it draws from the backend's local catalog caches.
> After the first read of a given catalog row, the backend won't need
> to re-read it unless the associated table has a schema update.  (There
> are some other cases, like a VACUUM FULL of the catalog the rows came
> from, but in practice catalog cache entries don't change often in most
> scenarios.)  We need place only one lock per table referenced in order
> to interlock against schema updates; not one per catalog row used.
>
> The upshot of all this is that any sort of shared plan cache is going to
> create substantially more contention than exists now --- and that's not
> even counting the costs of managing the cache, ie deciding when to throw
> away entries.

I imagine a design where a shared plan cache would consist of the plans,
indexed by a statement hash and again by dependant objects.  A statement to
be planned would be hashed and matched to the cache. DDL would need to
synchronously destroy all dependant plans. If each plan maintains a validity
flag, changing the cache wouldn't have to block so I don't see where there
would be contention.

>
> A backend-local plan cache would avoid the contention issues, but would
> of course not allow amortizing planning costs across multiple backends.
>
> I'm personally dubious that sharing planning costs is a big deal.
> Simple queries generally don't take that long to plan.  Complicated
> queries do, but I think the reusability odds go down with increasing
> query complexity.
>

I think both the parse and planning are major tasks if the transaction rate
is high. Simple queries can easily take much longer to plan than execute, so
this is a scalability concern. Caching complicated queries is valuable -
apps seem to have lots of similar queries because they are intimately
related to the data model.

> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>

pgsql-performance by date:

Previous
From: "Iain"
Date:
Subject: Re: Caching of Queries
Next
From: Tom Lane
Date:
Subject: Re: Caching of Queries