Randomisation for ensuring nlogn complexity in quicksort - Mailing list pgsql-hackers

From	Atri Sharma
Subject	Randomisation for ensuring nlogn complexity in quicksort
Date	June 30, 2013 12:30:29
Msg-id	896779CC-C2BD-420D-8BB8-F45B0DAAF2BF@gmail.com
List	pgsql-hackers

Hi all,

I have been reading the recent discussion and doing some research, and I think we should really go with the idea
of randomising the input data (if it is not completely presorted), to ensure that we do not get quadratic complexity.
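
A minimal sketch of the idea, in Python for brevity rather than the C used in the backend. The function names are hypothetical and nothing here is taken from the existing sort code; it only shows "shuffle unless already sorted":

```python
import random


def is_sorted(items):
    """Return True if items are already in non-decreasing order."""
    return all(a <= b for a, b in zip(items, items[1:]))


def shuffle_unless_sorted(items, rng=None):
    """Return a copy of items, shuffled unless the input is already sorted.

    Hypothetical helper: rng is any object with a shuffle() method,
    e.g. random.Random(seed) for reproducible runs.
    """
    rng = rng or random.Random()
    result = list(items)
    if not is_sorted(result):
        rng.shuffle(result)
    return result
```

The presorted check costs one linear pass, and the shuffle another, so both are cheap next to the sort itself.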

One easy way to do that could be to take a sample of the data set and take a pivot out of it. A still better way could
be to take multiple samples spread across the data set, select a value from each of them, and then take a
combined pivot (the median, maybe).
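
Roughly, the multi-sample pivot could look like the sketch below. This is Python for illustration only; `sampled_pivot`, `quicksort` and the choice of three samples are assumptions made for the example, not existing code:

```python
import random
import statistics


def sampled_pivot(items, lo, hi, num_samples=3, rng=random):
    """Pick a pivot for items[lo..hi] (inclusive).

    The range is split into num_samples equal-width strata, one value is
    drawn at random from each, and the (low) median of those values is used.
    """
    n = hi - lo + 1
    k = min(num_samples, n)
    candidates = []
    for i in range(k):
        start = lo + (i * n) // k
        end = lo + ((i + 1) * n) // k - 1
        candidates.append(items[rng.randint(start, end)])
    # median_low always returns one of the sampled elements.
    return statistics.median_low(candidates)


def quicksort(items, rng=random):
    """Sort a list in place with three-way partitioning around a sampled pivot."""
    stack = [(0, len(items) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = sampled_pivot(items, lo, hi, rng=rng)
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        # Elements equal to the pivot are already in their final position.
        stack.append((lo, lt - 1))
        stack.append((gt + 1, hi))
    return items
```

Because the sample positions are random, no fixed input ordering can reliably trigger the bad case; the cost is a few extra comparisons per partition step.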

Anyway, I really think that if we do not go with the above ideas, then we should somehow factor in the degree of
randomness of the input data when making the decision between quicksort and external merge sort for a set of rows.
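
One possible way to put a number on the "degree of randomness" is the fraction of adjacent pairs that are already in order. The sketch below is in Python for illustration; `ordered_fraction`, `choose_sort` and the 0.9 threshold are hypothetical, and this is not how the existing code picks a sort method:

```python
def ordered_fraction(items):
    """Fraction of adjacent pairs in non-decreasing order.

    1.0 means fully sorted, 0.0 means strictly descending, and values
    near 0.5 suggest no strong ordering.
    """
    if len(items) < 2:
        return 1.0
    in_order = sum(1 for a, b in zip(items, items[1:]) if a <= b)
    return in_order / (len(items) - 1)


def choose_sort(items, threshold=0.9):
    """Hypothetical policy: avoid quicksort on nearly sorted or nearly
    reverse-sorted input, where a naive pivot choice degrades."""
    frac = ordered_fraction(items)
    if frac >= threshold or frac <= 1 - threshold:
        return "mergesort"
    return "quicksort"
```

In practice the measure could be computed on a sample of the rows rather than on the whole input, to keep the check cheap.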

This shouldn't be too complex, and should give us expected n log n complexity even for wild data sets, without affecting
the existing normal data sets that are present in everyday transactions. I even believe that those data sets will also
benefit from the above optimisation.

Thoughts/Comments?

Regards,
Atri

