RE: GSOC 2018 Project - A New Sorting Routine - Mailing list pgsql-hackers

From Kefan Yang
Subject RE: GSOC 2018 Project - A New Sorting Routine
Date
Msg-id 5b5654cc.1c69fb81.5ea44.d698@mx.google.com
In response to Re: GSOC 2018 Project - A New Sorting Routine  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: GSOC 2018 Project - A New Sorting Routine  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers

Hi Tomas!

I did a few tests on my own Linux machine, but the problem is that my resources on AWS (CPU, RAM, and even disk space) are very limited. I considered setting up a virtual machine on my own PC, but the performance is even worse.

My original patch has two main optimizations: (1) switch to heap sort when the depth limit is exceeded, and (2) check whether the array is presorted only once, at the beginning. Now I want to test these optimizations separately. On an AWS EC2 instance, regressions in the CREATE INDEX cases seem to be less significant if we use only (1), but I can only test up to 100000 records and 512MB of memory using your scripts.
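For readers following the thread, here is a minimal sketch of the two optimizations on a plain int array. It is not the code from the patch: the function names (intro_sort, intro_loop, heap_sort, is_presorted), the depth limit of 2 * floor(log2(n)), and the insertion-sort cutoff of 16 are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only; all names and constants are hypothetical. */
static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Restore the max-heap property for the subtree rooted at 'root'. */
static void sift_down(int *a, size_t root, size_t n)
{
    for (;;)
    {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && a[child] < a[child + 1]) child++;
        if (a[root] >= a[child]) return;
        swap_int(&a[root], &a[child]);
        root = child;
    }
}

static void heap_sort(int *a, size_t n)
{
    if (n < 2) return;
    for (size_t i = n / 2; i-- > 0;)          /* build the heap */
        sift_down(a, i, n);
    for (size_t end = n - 1; end > 0; end--)  /* move max to the end */
    {
        swap_int(&a[0], &a[end]);
        sift_down(a, 0, end);
    }
}

static void insertion_sort(int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
    {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) { a[j] = a[j - 1]; j--; }
        a[j] = key;
    }
}

/* Quicksort loop; optimization (1): fall back to heap sort at depth 0. */
static void intro_loop(int *a, size_t n, int depth)
{
    while (n > 16)                            /* assumed small-array cutoff */
    {
        if (depth == 0) { heap_sort(a, n); return; }
        depth--;
        int pivot = a[(n - 1) / 2];
        ptrdiff_t i = -1, j = (ptrdiff_t) n;
        for (;;)                              /* Hoare partition */
        {
            do i++; while (a[i] < pivot);
            do j--; while (a[j] > pivot);
            if (i >= j) break;
            swap_int(&a[i], &a[j]);
        }
        intro_loop(a, (size_t) j + 1, depth); /* recurse on the left part */
        a += j + 1;                           /* iterate on the right part */
        n -= (size_t) j + 1;
    }
    insertion_sort(a, n);
}

static int is_presorted(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i]) return 0;
    return 1;
}

/* Optimization (2): the presorted check runs once, before any partitioning. */
static void intro_sort(int *a, size_t n)
{
    if (n < 2 || is_presorted(a, n)) return;
    int depth = 0;
    for (size_t m = n; m > 1; m >>= 1) depth += 2;  /* 2 * floor(log2(n)) */
    intro_loop(a, n, depth);
}
```

Calling intro_loop directly with a depth of 0 exercises only the heap sort fallback, which is one way to test (1) without (2).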

 

So would you mind re-running the tests using the two patches I provided in the attachment? That would be very helpful.

Regards,
Kefan

From: Tomas Vondra
Sent: July 18, 2018 2:26 PM
To: Kefan Yang
Cc: Andrey Borodin; Peter Geoghegan; PostgreSQL Hackers
Subject: Re: GSOC 2018 Project - A New Sorting Routine

I don't have any script for that - load the files into a spreadsheet,
create pivot tables and you're done.

regards

On 07/18/2018 11:13 PM, Kefan Yang wrote:
> Hey Tomas!
>
> I am trying to reproduce the results on my machine. Could you please
> share the script to generate .ods files?
>
> Regards,
> Kefan
>
> *From: *Tomas Vondra <mailto:tomas.vondra@2ndquadrant.com>
> *Sent: *July 18, 2018 2:05 AM
> *To: *Andrey Borodin <mailto:x4mmm@yandex-team.ru>
> *Cc: *Peter Geoghegan <mailto:pg@bowt.ie>; Kefan Yang
> <mailto:starordust@gmail.com>; PostgreSQL Hackers
> <mailto:pgsql-hackers@lists.postgresql.org>
> *Subject: *Re: GSOC 2018 Project - A New Sorting Routine
>
> On 07/18/2018 07:06 AM, Andrey Borodin wrote:
>> Hi, Tomas!
>>
>>> On 15 July 2018, at 1:20, Tomas Vondra <tomas.vondra@2ndquadrant.com
>>> <mailto:tomas.vondra@2ndquadrant.com>> wrote:
>>>
>>> So I doubt it's this, but I've tweaked the scripts to also set this GUC
>>> and restarted the tests on both machines. Let's see what that does.
>>
>> Do you observe any different results?
>>
>
> It did change the CREATE INDEX results, depending on the scale. The full
> data is available at [1] and [2], attached is a spreadsheet summary from
> the Xeon box.
>
> For the largest scale (1M rows) the regressions for CREATE INDEX queries
> mostly disappeared. For 10k rows it still affects CREATE INDEX with a
> text column, and the 100k case behaves just like before (so significant
> regressions for CREATE INDEX).
>
> I don't have time to investigate this further at the moment, but I'm
> still of the opinion that there's little to gain by replacing our
> current sort algorithm with this.
>
> [1] https://bitbucket.org/tvondra/sort-intro-sort-xeon/src/master/
> [2] https://bitbucket.org/tvondra/sort-intro-sort-i5/src/master/
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment
