
Tuning single row operations

From
"Simon Riggs"
Date:
For 8.3 my goal is to improve the performance of single row operations,
such as:

  INSERT INTO foo ... VALUES (...)
  UPDATE foo SET ... WHERE <unique index> = <values>
  DELETE FROM foo WHERE <unique index> = <values>
 

OLTP Assumptions
- all statements are prepared first, then executed with bound
parameters. 
- need to cover cases where these are executed from functions in a PL,
as well as the case where they are executed via a protocol v3+ message
from a latest-edition driver.
- we are likely to be using RETURNING clauses
- we care about both single connections issuing a stream of these
requests, as well as performance with many concurrent sessions
- we have Referential Integrity constraints defined
- we care about both with/without Slony replication
- we care about log-shipping/warm standby also
- we want to support all other features also: Most Advanced OSDB etc
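The prepared-statement pattern assumed above can be sketched in SQL
(table and column names here are illustrative only):

```sql
-- Prepare once per session; the stored plan is reused on each execution.
PREPARE ins_foo (int, text) AS
    INSERT INTO foo (id, val) VALUES ($1, $2) RETURNING id;

PREPARE upd_foo (text, int) AS
    UPDATE foo SET val = $1 WHERE id = $2;   -- id has a unique index

-- Each OLTP request is then a single EXECUTE with bound parameters.
EXECUTE ins_foo(1, 'a');
EXECUTE upd_foo('b', 1);
```

A v3-protocol driver does the equivalent with Parse/Bind/Execute messages
rather than SQL-level PREPARE/EXECUTE.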

I would like to discuss what opportunities exist to improve these
operations and to prioritise them for work during 8.3 and beyond.

Currently, I'm aware of these possibilities, some fairly vague
- set up index scan at plan time, not in executor
- stop the index scan as soon as a single row has been returned
- reduce WAL for updates when SET clause doesn't mention all cols
- avoid RI checks for update of a column not mentioned in SET
- separate prepared plan from plan state, to avoid memcpy
- avoid double access of buffer for UPDATE/DELETE by producing new fast
path through executor, streamlined for unique accesses
- turn off WAL for (some?) indexes and rebuild them following a crash
- HOT updates: don't do index inserts for unchanged indexed cols
- avoid explicit locking of indexes (at cost of concurrent index ops)
- improve RI check perf by caching small, static tables in each backend
- apply index filter conditions on index scan to avoid heap lookup
- others... feel free to add your own etc
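Several of the items above (HOT updates, skipping RI checks) hinge on
which columns the SET clause actually touches. A sketch, using a
hypothetical schema:

```sql
CREATE TABLE customer (id int PRIMARY KEY, name text);
CREATE TABLE orders (
    id      int PRIMARY KEY,
    cust_id int REFERENCES customer (id),
    note    text
);

-- Touches neither the FK column nor any indexed column: a candidate
-- for skipping both the RI check and the index inserts.
UPDATE orders SET note = 'expedite' WHERE id = 42;

-- Touches the FK column: the RI check against customer is required.
UPDATE orders SET cust_id = 7 WHERE id = 42;
```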

Clearly, some of these need further work. The question is which ones
have sufficient promise to be worth taking further and what would the
priority order for that work be? I assume that a full feasibility
investigation is needed for each item and that there is *no* presumption
that something prioritised higher means it is pre-approved for
inclusion. I'll document the responses as an additional section of the
public TODO, some of which may be removed later if they prove
infeasible. Those would possibly be labelled "OLTP Performance
Investigations: items thought to be worth investigating".

I'd like to initiate some open discussion on how, given the above goal,
to improve performance of PostgreSQL. If you don't have any ideas or
opinions now, you're welcome to reply to this thread in the future to
introduce new possibilities.

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: Tuning single row operations

From
Richard Huxton
Date:
Simon Riggs wrote:
> For 8.3 my goal is to improve the performance of single row operations,

Great. That's something that's useful across the board.

> Currently, I'm aware of these possibilities, some fairly vague
...
> - avoid RI checks for update of a column not mentioned in SET

Linked at least logically - conditional triggers ("where 
old.status<>new.status" or similar) could save on IF statements early in 
trigger functions.
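What such a conditional trigger might look like (the WHEN clause shown
here is hypothetical — PostgreSQL has no such syntax at present; the
names are illustrative too):

```sql
-- Hypothetical: fire the trigger only when status actually changes,
-- instead of testing IF OLD.status <> NEW.status inside the function.
CREATE TRIGGER order_status_audit
    BEFORE UPDATE ON orders
    FOR EACH ROW
    WHEN (OLD.status IS DISTINCT FROM NEW.status)
    EXECUTE PROCEDURE audit_status_change();
```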

...
> - improve RI check perf by caching small, static tables in each backend
> - apply index filter conditions on index scan to avoid heap lookup

For fkey checks against a basically static table could you get away with 
just checking the index and not the table?

--   Richard Huxton  Archonet Ltd


Re: Tuning single row operations

From
Matthew O'Connor
Date:
Richard Huxton wrote:
> Simon Riggs wrote:
>> - improve RI check perf by caching small, static tables in each backend
>> - apply index filter conditions on index scan to avoid heap lookup
> 
> For fkey checks against a basically static table could you get away with 
> just checking the index and not the table?

I'm not sure that would fly; there is always the possibility it could 
change. I think the ENUMs patch will solve this performance problem.
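For reference, the shape the ENUMs patch proposes might look like this
(schema illustrative): the tiny static lookup table disappears, and with
it the FK and its RI check.

```sql
-- Replace a small, static lookup table with an enum type.
CREATE TYPE order_status AS ENUM ('new', 'shipped', 'cancelled');

CREATE TABLE orders (
    id     int PRIMARY KEY,
    status order_status NOT NULL   -- no FK, so no RI check on DML
);
```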


Re: Tuning single row operations

From
"Simon Riggs"
Date:
On Thu, 2006-12-21 at 09:36 -0500, Matthew O'Connor wrote:
> Richard Huxton wrote:
> > Simon Riggs wrote:
> >> - improve RI check perf by caching small, static tables in each backend
> >> - apply index filter conditions on index scan to avoid heap lookup
> > 
> > For fkey checks against a basically static table could you get away with 
> > just checking the index and not the table?
> 
> I'm not sure that would fly, there is always the possibility it could 
> change, I think the ENUMs patch will solve this performance problem.

Not using SQL Standard syntax it won't. 

I'd be happier if it worked with DOMAINs and happier still if we can get
it to optimise just bare datatypes. My objective is to tune a database
without needing to reload any of the tables and to ensure that RI is
effective in both directions (from referencing and referenced tables).

Perhaps there's a way to make that happen...
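A sketch of the SQL-standard DOMAIN shape alluded to above (names
illustrative): it constrains the value set without an FK and without a
nonstandard type.

```sql
-- SQL-standard alternative: a domain with a CHECK constraint.
CREATE DOMAIN order_status AS text
    CHECK (VALUE IN ('new', 'shipped', 'cancelled'));

CREATE TABLE orders (
    id     int PRIMARY KEY,
    status order_status NOT NULL
);
```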

--  Simon Riggs              EnterpriseDB   http://www.enterprisedb.com




Re: Tuning single row operations

From
Jim Nasby
Date:
On Dec 21, 2006, at 9:56 AM, Simon Riggs wrote:
> On Thu, 2006-12-21 at 09:36 -0500, Matthew O'Connor wrote:
>> Richard Huxton wrote:
>>> Simon Riggs wrote:
>>>> - improve RI check perf by caching small, static tables in each  
>>>> backend
>>>> - apply index filter conditions on index scan to avoid heap lookup
>>>
>>> For fkey checks against a basically static table could you get  
>>> away with
>>> just checking the index and not the table?
>>
>> I'm not sure that would fly, there is always the possibility it could
>> change, I think the ENUMs patch will solve this performance problem.
>
> Not using SQL Standard syntax it won't.
>
> I'd be happier if it worked with DOMAINs and happier still if we  
> can get
> it to optimise just bare datatypes. My objective is to tune a database
> without needing to reload any of the tables and to ensure that RI is
> effective in both directions (from referencing and referenced tables).
>
> Perhaps there's a way to make that happen...

A thought I've had is that if we allowed users to control the  
thresholds for compressing and toasting a field, we could essentially  
produce the effects of small, static 'lookup' tables by just using a  
text field in the base table and forcing it to always toast (you'd  
need a check constraint, too). If we detected that situation, we  
might be able to optimize for it...
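The closest existing knob is the per-column storage mode; the toast
threshold itself is a compile-time constant, which is presumably why
user control over it is framed as a proposal here. A sketch (names
illustrative):

```sql
-- Today: request out-of-line storage for the column. This does not
-- force toasting of small values, hence the proposal above.
ALTER TABLE base ALTER COLUMN status SET STORAGE EXTERNAL;

-- The check constraint standing in for the lookup table:
ALTER TABLE base ADD CONSTRAINT status_valid
    CHECK (status IN ('new', 'shipped', 'cancelled'));
```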

As for the other items, it would be useful to have some idea what  
kind of performance gains are to be had... maybe it would be  
worthwhile to put together quick-hack patches just for performance  
testing.
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)