Re: a performence question - Mailing list pgsql-general

From Rafal Pietrak
Subject Re: a performence question
Date
Msg-id 1220791609.14522.93.camel@zorro.isa-geek.com
Whole thread Raw
In response to Re: a performence question  ("Filip Rembiałkowski" <plk.zuber@gmail.com>)
List pgsql-general
Thenx,

The preparation of such simulation would take me much longer :(. Thenx!

But out of the results, I was wandering if this is an exotic RDMB use,
or may be it would be worth pondering at this point, of improvements to
the Postgres RULE system.

I'm not really up to any technical discussion regarding those
improvements, I can only try to express my ideas "at the syntax
level" (hoping that those might be more readable then "not so plain
English"), with the following wish-list:

wish nr.1:
The RULE version (of the "routing table" case) could possibly get
performance improvement if only one could make a single query for ALL
branches of the RULE-list, *before* evaluation of the WHERE clausure of
the RULE-list. An SQL extension would be necessary.

Syntactically, this could be expressed by a phrase: "CREATE RULE ... AS
BEFORE INSERT TO .... DO SELECT sel INTO NEW.sel FROM
current_route" (or: "... DO INSERT INTO NEW.sel FROM (SELECT sel FROM
current_route)" or: "...DO UPDATE *NEW* SET sel=cur.sel FROM
current_route cur") - thus defining a RULE which could initialize some
of the NEW.* fields with values retrieved from the database. This would
allow for omission of the "SELECT 1 FORM..." and have that replaced by
simple NEW.sel = NNN clausure .... thus bringing the "routing_table
case" performance of the RULE system in line with the case of "routing
INSERT based solely on inserted entity".

In other words, I wish there was a stage of "readying" data (in fact,
readying data *for* the WHERE clausure), before RULEs evaluate.

And: Would such "readying" be usefull for UPDATES or DELETES, too? I
think: possibly.

wish nr.2:
Should there be a way to pass "schema element" reference (meaning: a
name of the effected table) into plpsql function,  writing (and
maintaining the code) of such TRIGGERs would be immensely simpler.

And thenx again for the help in evaluating the routing performance. This
helped me a lot!

-R


On Fri, 2008-09-05 at 17:04 +0200, Filip Rembiałkowski wrote:
> 2008/9/4 Rafal Pietrak <rafal@zorro.isa-geek.com>:
> > Hi,
> >
> > Maybe someone on this list actually have already tried this:
> >
> > I'm planning to make a partitioned database. From Postgres documentation
> > I can see, that there are basically two methods to route INSERTS into
> > partitioned table:
> >        one. is a TRIGGER
> >        other. is a RULE
> >
> > My Table will have over 1000 partitions. Some not so big, but
> > significant number of them will be of multimillion rows. Partitioning
> > will be done using a single column, on equality.... meaning:
> >
> > CREATE TABLE mainlog (sel int, tm timestamp, info text,...);
> > CREATE TABLE mainlog_p1 (CHECK (sel=1)) INHERITS (mainlog);
> > CREATE TABLE mainlog_p2 (CHECK (sel=2)) INHERITS (mainlog);
> > ...etc.
> >
> > If I route INSERT with a TRIGGER, the function would look like:
> > CREATE .... TRIGGER...AS $$ DECLARE x RECORD; BEGIN
> > SELECT id INTO x FROM current_route; NEW.sel := x.id;
> > IF NEW.sel = 1 THEN INSERT INTO mainlog_p1 VALUES (NEW.*);
> > ELSE IF NEW.sel = 2 THEN INSERT INTO mainlog_p2 VALUES (NEW.*);
> > ....
> > END IF;
> > RETURN NULL;
> > $$;
> >
> > If I route INSETS with a RULE, I'd have something like 1000 rules hooked
> > up to MAINLOG, all looking like:
> > CREATE RULE .... ON INSERT ... WHERE EXISTS(SELECT 1 FROM current_route
> > WHERE id = 1) DO INSTEAD INSERT INTO mainlog_p1 VALUES SELECT
> > x.id,new.tm... FROM (SELECT id FROM current_route) x;
> > ... and similar RULES for cases "WHERE id = 2", etc.
> >
> > My question is, where should I expect better performance on those
> > INSERTS).
> >
> > I would prefer a set of RULES (as I wouldn't like to rewrite TRIGGER
> > function every time I add a partition ... a thousand lines function),
> > but since they all must make a select query on CURRENT_ROUTE table, may
> > be that will not be particularly effective? The TRIGGER function does a
> > single query - may be it'll be faster? I was planning to generate some
> > dummy data and run a simulation, but may be someone already has that
> > experience? Or maybe the TRIGGER should look differently? Or the set of
> > RULES?
> >
>
> I had a bit spare time so I tested this
>
> see http://filip.rembialkowski.net/postgres-partitioning-performance-rules-vs-triggers/
>
> seems that in your scenario trigger will be better.
>
> but If I had to do this, and if performance was very important, I
> would move "partition selection" logic out of the INSERT phase. the
> application can know this before the actual insert. unless you want to
> shift selections very often...
>
>
>
>
>
> --
> Filip Rembiałkowski
>

pgsql-general by date:

Previous
From: Devrim GÜNDÜZ
Date:
Subject: New shapshot RPMs (Sep 7 2008) are ready for testing
Next
From: Andrew Dunstan
Date:
Subject: Re: [HACKERS] New shapshot RPMs (Sep 7 2008) are ready for testing