Re: A worst case for qsort - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: A worst case for qsort
Msg-id 20140808132954.GL16422@tamriel.snowman.net
In response to Re: A worst case for qsort  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
* Robert Haas (robertmhaas@gmail.com) wrote:
> On Thu, Aug 7, 2014 at 5:52 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > On Thu, Aug 7, 2014 at 2:34 PM, Rod Taylor <rod.taylor@gmail.com> wrote:
> >> This one is frequently sorted as batch operations against the files are
> >> performed in alphabetical order to reduce conflict issues that a random
> >> ordering may cause between jobs.
> >
> > Sure. There are cases out there. But, again, I have a hard time
> > imagining why you'd expect those to be pre-sorted in practice, ...
>
> Well, I'm not sure why you're having a hard time imagining it.
> Presorted input is a common case in general; that's why we have a
> check for it.  That check adds overhead in the non-pre-sorted case to
> improve the pre-sorted case, and nobody's ever argued for removing it
> that I can recall.

Agreed.  Presorted input is not all that uncommon in practice.
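
For anyone following along, the trade-off Robert describes can be sketched
roughly as below. This is only an illustration and not the actual PostgreSQL
qsort code: the names is_presorted, sort_with_presorted_check, and cmp_int are
hypothetical, and the standard library qsort stands in for the real sort. The
point is that one linear pass of comparisons lets already-ordered input skip
the sort entirely, while unordered input pays for the comparisons made before
the first out-of-order pair is found.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical comparator for ints, used only for illustration. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * One linear pass over the array: returns true if no element compares
 * lower than its predecessor, i.e. the input is already in order.
 */
static bool is_presorted(const void *base, size_t n, size_t size,
                         int (*cmp)(const void *, const void *))
{
    const char *p = base;

    for (size_t i = 1; i < n; i++)
    {
        if (cmp(p + (i - 1) * size, p + i * size) > 0)
            return false;       /* first out-of-order pair ends the scan */
    }
    return true;
}

/*
 * Sorts the array, skipping the sort when the input is already ordered.
 * Returns true if the presorted fast path was taken.
 */
static bool sort_with_presorted_check(void *base, size_t n, size_t size,
                                      int (*cmp)(const void *, const void *))
{
    assert(base != NULL || n == 0);

    if (is_presorted(base, n, size, cmp))
        return true;            /* nothing to do */

    qsort(base, n, size, cmp);  /* stand-in for the real sort */
    return false;
}
```

How often the fast path is actually taken on real workloads is exactly the
kind of thing I don't think we have numbers for.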

That said, this is perhaps another good case where it'd be
extremely handy to have anonymized statistics information from real
systems which would allow us to more easily go *look* at this
specific question.

There are organizations out there who run many thousands of PG instances
who have expressed interest in supporting exactly that kind of
statistics gathering (indeed, I'm pretty sure Peter is familiar with
one... ;) - what we need is an architecture, design, and implementation
to make it happen.

I'm guessing you (Robert and Peter, especially) have already been
thinking about what we could do to make the above wish/dream a reality.
Perhaps if we could get the initial requirements down, someone would be
able to find time to work on making it happen; it would be really great
to see progress on this front for 9.5.  Or, if existing products
implement such metrics collection already, perhaps some numbers could
be shared with the community to help address this (and other)
questions.

Thanks,
    Stephen
