Re: Sequential vs. random values - number of pages in B-tree - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Sequential vs. random values - number of pages in B-tree
Date
Msg-id CA+bJJbyJd5SvE8Od=ud0iE2iqttK1JcTdctyiEDPwszMQseJMw@mail.gmail.com
In response to Re: Sequential vs. random values - number of pages in B-tree  (pinker <pinker@onet.eu>)
Responses Re: Sequential vs. random values - number of pages in B-tree  (Rob Sargent <robjsargent@gmail.com>)
List pgsql-general
Hi pinker:

On Tue, Aug 23, 2016 at 2:26 PM, pinker <pinker@onet.eu> wrote:
> I am just surprised by the order of magnitude of the difference though. 2
> and 27 minutes, that's a huge difference... I did another, simplified test,
> to make sure there are no duplicates and the only difference between both
> sets is the order:
...
> INSERT INTO t_sequential SELECT * FROM source_sequential;
> 102258,949 ms
> INSERT INTO t_random SELECT * FROM source_random;
> 1657575,699 ms

If I read correctly, you are getting 100 s / 10M keys = 10 us/key for
sequential insertion, and about 165 us/key for random.
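The per-key figures follow directly from the timings quoted above; a quick sanity check (timings copied from the test, 10M rows assumed as in the thread):

```python
# Timings reported in the quoted test (milliseconds), over 10M rows.
rows = 10_000_000
seq_ms = 102_258.949
rand_ms = 1_657_575.699

# ms -> us, divided by the number of rows inserted.
us_per_key_seq = seq_ms * 1000 / rows
us_per_key_rand = rand_ms * 1000 / rows

print(f"sequential: {us_per_key_seq:.1f} us/key")   # ~10.2 us/key
print(f"random:     {us_per_key_rand:.1f} us/key")  # ~165.8 us/key
print(f"slowdown:   {us_per_key_rand / us_per_key_seq:.0f}x")  # ~16x
```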

I'm not surprised at all. I've seen even greater differences on an
in-memory tree; sorted insertion can easily be optimized to be very
fast. As an example, sequential insertion can avoid moving data while
filling the pages and, with a little care, it can also avoid some of
that movement when splitting. I'm not current with the PostgreSQL
internals, but it would not surprise me if they have big optimizations
for this case, especially since index-ordered insertion is quite
common in things like bulk loads or timestamped log lines.
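A toy illustration of the data-movement point (not PostgreSQL code; a plain sorted Python list stands in for the sorted entries of an index page, and `N` is an arbitrary size chosen so the demo runs in seconds):

```python
import bisect
import random
import time

N = 50_000

# Sequential: each new key is the largest so far, so it lands at the
# end of the sorted structure and nothing has to be shifted.
seq_keys = list(range(N))

# Random: keys arrive shuffled, so on average half of the existing
# entries must be moved over on every single insertion.
rand_keys = seq_keys[:]
random.shuffle(rand_keys)

def build(keys):
    """Insert all keys into a sorted list, returning elapsed seconds."""
    entries = []  # stand-in for a page's sorted entry array
    t0 = time.perf_counter()
    for k in keys:
        bisect.insort(entries, k)
    return time.perf_counter() - t0

t_seq = build(seq_keys)
t_rand = build(rand_keys)
print(f"sequential: {t_seq:.3f}s   random: {t_rand:.3f}s")
```

Even without any page splits in the picture, the random build is markedly slower purely because of element shifting, which is the same flavor of cost the index pays.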

Francisco Olarte.

