Thread: Insertions slower than Updates?
Hi all,
In performance testing we’re doing we are currently running two scenarios:
- Starting from an empty db, therefore all operations are INSERTs.
- Starting from an existing db – thereby UPDATing all records.
I should also mention that the tables we are dealing with are heavily indexed.
I would expect that the first scenario would give much better results than the second one as:
- INSERT should be cheaper than UPDATE due to only dealing with one record instead of two.
- We also have SELECT queries before the operation and in the first configuration, the SELECTs will be dealing with much less data for most of the run.
To our surprise, we see that the second scenario gives better results with an average processing time of an event at around %70 of the time run in the first scenario.
Anyone have any ideas on why the empty db is giving worse results??
Many Thanks,
Ofer
Ofer Israeli <oferi@checkpoint.com> wrote: > INSERT should be cheaper than UPDATE due to only dealing with one > record instead of two. ... unless the UPDATE is a HOT update, in which case the indexes don't need to be touched. > Anyone have any ideas on why the empty db is giving worse > results?? Besides the HOT updates being fast, there is the issue of having space already allocated and ready for the database to use, rather than needing to make calls to the OS to create and extend files as space is needed. -Kevin
Kevin Grittner wrote: > Ofer Israeli <oferi@checkpoint.com> wrote: >> Anyone have any ideas on why the empty db is giving worse results?? > > Besides the HOT updates being fast, there is the issue of having > space already allocated and ready for the database to use, rather > than needing to make calls to the OS to create and extend files as > space is needed. > I thought about this direction as well, but on UPDATES, some of them will need to ask the OS for more space anyhow at leastat the beginning of the run, additional pages will be needed. Do you expect that the OS level allocations are so expensiveas to show an ~%40 increase of processing time in average? Thanks, Ofer
Ofer Israeli <oferi@checkpoint.com> wrote: > Kevin Grittner wrote: >> Ofer Israeli <oferi@checkpoint.com> wrote: >>> Anyone have any ideas on why the empty db is giving worse >>> results?? >> >> Besides the HOT updates being fast, there is the issue of having >> space already allocated and ready for the database to use, rather >> than needing to make calls to the OS to create and extend files >> as space is needed. > > I thought about this direction as well, but on UPDATES, some of > them will need to ask the OS for more space anyhow at least at the > beginning of the run, additional pages will be needed. Do you > expect that the OS level allocations are so expensive as to show > an ~%40 increase of processing time in average? Gut feel, 40% does seem high for just that; but HOT updates could easily account for that, especially since you said that the tables are "heavily indexed". That is, as long as there are enough updates which don't modify indexed columns. -Kevin
Ofer Israeli <oferi@checkpoint.com> wrote:Gut feel, 40% does seem high for just that; but HOT updates could
> Kevin Grittner wrote:
>> Ofer Israeli <oferi@checkpoint.com> wrote:
>>> Anyone have any ideas on why the empty db is giving worse
>>> results??
>>
>> Besides the HOT updates being fast, there is the issue of having
>> space already allocated and ready for the database to use, rather
>> than needing to make calls to the OS to create and extend files
>> as space is needed.
>
> I thought about this direction as well, but on UPDATES, some of
> them will need to ask the OS for more space anyhow at least at the
> beginning of the run, additional pages will be needed. Do you
> expect that the OS level allocations are so expensive as to show
> an ~%40 increase of processing time in average?
easily account for that, especially since you said that the tables
are "heavily indexed". That is, as long as there are enough updates
which don't modify indexed columns.
-Kevin
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
Steve Horn
http://www.stevehorn.cc
steve@stevehorn.cc
http://twitter.com/stevehorn
740-503-2300
Kevin Grittner wrote: > Ofer Israeli <oferi@checkpoint.com> wrote: >> Kevin Grittner wrote: >>> Ofer Israeli <oferi@checkpoint.com> wrote: >>>> Anyone have any ideas on why the empty db is giving worse results?? >>> >>> Besides the HOT updates being fast, there is the issue of having >>> space already allocated and ready for the database to use, rather >>> than needing to make calls to the OS to create and extend files >>> as space is needed. >> >> I thought about this direction as well, but on UPDATES, some of them >> will need to ask the OS for more space anyhow at least at the >> beginning of the run, additional pages will be needed. Do you expect >> that the OS level allocations are so expensive as to show an ~%40 >> increase of processing time in average? > > Gut feel, 40% does seem high for just that; but HOT updates could > easily account for that, especially since you said that the tables > are "heavily indexed". That is, as long as there are enough updates > which don't modify indexed columns. Most, if not all of our UPDATEs, involve updating an indexed column, so HOT updates are actually not performed at all :(
Ofer Israeli <oferi@checkpoint.com> wrote:
> In performance testing we’re doing we are currently running two scenarios:
> I should also mention that the tables we are dealing with are heavily indexed.
> I would expect that the first scenario would give much better results than the second one as:
> To our surprise, we see that the second scenario gives better results with an average processing time of an event at around %70 of the time run in the first scenario.
> Anyone have any ideas on why the empty db is giving worse results??
A little googleing led me to this thought, will be happy to hear you're input on this. If the database is initially empty, the analyzer will probably decide to query the tables by full table scan as opposed to index searching. Now supposedly, during our test run, the analyzer does not run frequently enough and so at some point we are using the wrong method for SELECTs.
The version we are using is 8.3.7.
My questions are:
1. Does the above seem reasonable to you?
2. How often does the analyzer run? To my understanding it will always run with autovacuum, is that right? Is it triggered at other times as well?
3. Does the autoanalyzer work only partially on the db like autovacuum going to sleep after a certain amount of work was done or does it work until finished? If it is partial work, maybe it does not reach our relevant tables. What dictates the order in which it will work?
Many thanks,
Ofer