Re: WIP Patch: Use sortedness of CSV foreign tables for query planning - Mailing list pgsql-hackers

From Tom Lane
Subject Re: WIP Patch: Use sortedness of CSV foreign tables for query planning
Msg-id 9700.1344263586@sss.pgh.pa.us
In response to Re: WIP Patch: Use sortedness of CSV foreign tables for query planning  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: WIP Patch: Use sortedness of CSV foreign tables for query planning
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Aug 5, 2012 at 10:41 PM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> I think file_fdw is useful for managing log files such as PG CSV logs.  Since
>> often, such files are sorted by timestamp, I think the patch can improve the
>> performance of log analysis, though I have to admit my demonstration was not
>> realistic.

> Hmm, I guess I could buy that as a plausible use case.

In the particular case of PG log files, I'd bet good money against them
being *exactly* sorted by timestamp.  Clock skew between backends, or
varying amounts of time to construct and send messages, will result in
small inconsistencies.  This would generally not matter, until the
planner relied on the claim of sortedness for something like a mergejoin
... and then it would matter a lot.
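
As an illustration of the mergejoin concern above, here is a minimal Python sketch. It is not PostgreSQL's executor code; the function names and the sample rows are made up for the example. A merge join that trusts its inputs to be sorted silently drops a match when one input is only slightly out of order, whereas a join that makes no ordering assumption finds every match.

```python
def merge_join(left, right):
    """Join two lists of (key, payload) pairs.

    Assumes BOTH inputs are sorted by key; it never checks that claim.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1            # left key is behind: advance left
        elif lk > rk:
            j += 1            # right key is behind: advance right
        else:
            # Emit every right row sharing this key for the current left row.
            k = j
            while k < len(right) and right[k][0] == lk:
                out.append((lk, left[i][1], right[k][1]))
                k += 1
            i += 1
    return out


def nested_loop_join(left, right):
    """Reference join that makes no assumption about input order."""
    return [(lk, lv, rv) for lk, lv in left for rk, rv in right if lk == rk]


# Hypothetical log rows keyed by timestamp; rows 3 and 2 arrived out of order.
log = [(1, "a"), (3, "b"), (2, "c"), (4, "d")]
events = [(1, "x"), (2, "y"), (3, "z"), (4, "w")]

merged = merge_join(log, events)
reference = nested_loop_join(log, events)

print(len(merged), len(reference))
print((2, "c", "y") in merged, (2, "c", "y") in reference)
```

The merge join raises no error here. It simply returns fewer rows than the reference join, which is why a wrong sortedness claim is worse than a slow plan.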

In general I'm quite suspicious of the idea of believing that externally
supplied data is sorted in exactly the way that PG thinks it should
sort.  If we implement this you can bet that people will screw up, for
instance by using the wrong locale/collation to sort text data.
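
To make the collation point concrete, here is a small Python sketch under stated assumptions: plain code-point ordering stands in for a "C"-style collation, and `str.casefold` stands in for a linguistic collation. Real PostgreSQL collations come from the operating system or ICU and behave differently in detail; the sample strings are invented.

```python
def is_sorted(seq, key=lambda s: s):
    """Return True if seq is in non-decreasing order under the given key."""
    return all(key(a) <= key(b) for a, b in zip(seq, seq[1:]))


names = ["apple", "Zebra", "banana", "Cherry"]

# Code-point order: uppercase letters sort before lowercase ones.
byte_order = sorted(names)

# Case-insensitive order, used here only as a stand-in for a linguistic collation.
folded_order = sorted(names, key=str.casefold)

print(byte_order)
print(folded_order)

# A file sorted under one ordering is not sorted under the other.
print(is_sorted(byte_order), is_sorted(byte_order, key=str.casefold))
```

A file written out in the first order would be declared "sorted" by whoever produced it, yet it is not sorted under the second ordering, so a planner that took the claim at face value could be misled.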
        regards, tom lane

