Re: COPY and Volatile default expressions - Mailing list pgsql-hackers

From David Fetter
Subject Re: COPY and Volatile default expressions
Date
Msg-id 20130415152427.GH5337@fetter.org
Whole thread Raw
In response to COPY and Volatile default expressions  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: COPY and Volatile default expressions
List pgsql-hackers
On Mon, Apr 15, 2013 at 03:00:34PM +0100, Simon Riggs wrote:
> COPY cannot be optimised correctly if we have before triggers or
> volatile default expressions.
> 
> The multi-insert code detects those cases and falls back to the
> single row mechanism in those cases.
> 
> There a common class of volatile functions that wouldn't cause
> problems: any volatile function that doesn't touch the table being
> loaded and still works correctly when called with alternately
> ordered data.

"Doesn't touch already-existing rows?"  Makes a lot of sense :)

> I claim this is a common class, since sequence next_val functions and
> uuid generators meet that criteria and most common forms of auditing
> trigger, as well as any other form of data-reformatting trigger. Since
> this is a common case, it seems worth optimising.

Do you have numbers on this, or ways to gather same?  In other words,
how do we know what resources (time, CPU cycles, disk seeks, etc.) are
being consumed here?

> What I'd like to do is to invent a new form of labelling that allows
> us to understand that COPY can still be optimised. I'm thinking to add
> a new function label, something like one of
> * IDEMPOTENT
> * ORDER INDEPENDENT
> * BATCHABLE
> * NON SELF REFERENCING
> * GO FASTER DAMMIT
> etc
> 
> I'm sure many people will have a much more exact description and a
> better name than I do.
> 
> This becomes more important when we think about parallelising SQL,
> since essentially the same problem exists with parallel SQL calling
> volatile functions. Oracle handles that by having a pragma to allow a
> function to be declared as parallel safe.

What happens when you misinform Oracle about this?  Does it attempt to
check?  More importantly, what *should* happen?

> I was also thinking that the best way to do this would be to invent a
> new flexible function labelling scheme, so use something like hstore
> to store a list of function attributes. Something that would mean we
> don't have to invent new keywords every time we have a new function
> label.
> 
> Suggestions please.

JSON's in core.  How about using that?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



pgsql-hackers by date:

Previous
From: hubert depesz lubaczewski
Date:
Subject: Why are JSON extraction functions STABLE and not IMMUTABLE?
Next
From: David Fetter
Date:
Subject: Re: WIP: index support for regexp search