Thread: creating functions with variable argument lists

creating functions with variable argument lists

From
Marc Evans
Date:
Hi -

I am trying to make use of table partitions. In doing so I would like to
use a rule to call a functioning which inserts the data into the proper
partition. To do so, I believe that I need to find a way to opaquely pass
NEW from the rule to a function which then passes it to INSERT. (Well, I
could spell out all of the columns in the table as arguments to the
function, but that is not as maintainable, e.g. every time the table
columns change, so to the function and rule change.)

I am not finding any way to do this in the proceedural languages. That
said, I would happily believe that I am just missing something, and am
hoping that someone on this list has already figured out an answer.

For consideration, here is an example:

create table foobars (
   id bigserial,
   created_at timestamp not null,
   name text
);

create table foobars_200612 (
   check (created_at >= timestamp '2006-12-01 00:00:00' and created_at < timestamp '2007-01-01 00:00:00')
) inherits (foobars);

create table foobars_200701 (
   check (created_at >= timestamp '2007-01-01 00:00:00' and created_at < timestamp '2007-02-01 00:00:00')
) inherits (foobars);

-- Warning, pseudo code follows (e.g. NEW):
create or replace function foo_insert(NEW) returns void as $$
   begin
     execute 'insert into foobars_' ||
       (select extract(year from $1) || extract(month from $1)) ||
       ' values (' || NEW || ')';
   end;
$$ language plpgsql;

create rule foobars_insert as on insert to foobars
   do instead select foo_insert(NEW);


The key to my success for the above is to find a way for NEW to be used
something like the pseudo code shown. Suggestions?

- Marc

Re: creating functions with variable argument lists

From
Tom Lane
Date:
Marc Evans <Marc@SoftwareHackery.Com> writes:
> I am trying to make use of table partitions. In doing so I would like to
> use a rule to call a functioning which inserts the data into the proper
> partition.

Basically, you're guaranteeing yourself large amounts of pain by
insisting on using a rule for this.  I'd suggest using a trigger
instead.  A "BEFORE INSERT" trigger on the parent table can redirect
the data to the appropriate place and then return NULL to prevent the
insertion into the parent.

Given your example, I'd do something like

create or replace function foobars_insert() returns trigger
language plpgsql as $$
begin
  if new.created_at >= timestamp '2006-12-01 00:00:00' and
     new.created_at < timestamp '2007-01-01 00:00:00' then
    insert into foobars_200612 values(new.*);
  elsif new.created_at >= timestamp '2007-01-01 00:00:00' and
        new.created_at < timestamp '2007-02-01 00:00:00' then
    insert into foobars_200701 values(new.*);
  elsif ...
  else
    raise exception 'No partition for timestamp %', new.created_at;
  end if;
  return null;
end$$;

create trigger foobars_insert before insert on foobars
for each row execute procedure foobars_insert();

Obviously you have to adjust the function definition every time
you add or remove a partition, but you'll have a script for that
anyway, no?

BTW, I think using "new.*" this way only works as of 8.2.

            regards, tom lane

Re: creating functions with variable argument lists

From
Marc Evans
Date:
On Fri, 8 Dec 2006, Tom Lane wrote:

> Marc Evans <Marc@SoftwareHackery.Com> writes:
>> I am trying to make use of table partitions. In doing so I would like to
>> use a rule to call a functioning which inserts the data into the proper
>> partition.
>
> Basically, you're guaranteeing yourself large amounts of pain by
> insisting on using a rule for this.  I'd suggest using a trigger
> instead.  A "BEFORE INSERT" trigger on the parent table can redirect
> the data to the appropriate place and then return NULL to prevent the
> insertion into the parent.
>
> Given your example, I'd do something like
>
> create or replace function foobars_insert() returns trigger
> language plpgsql as $$
> begin
>  if new.created_at >= timestamp '2006-12-01 00:00:00' and
>     new.created_at < timestamp '2007-01-01 00:00:00' then
>    insert into foobars_200612 values(new.*);
>  elsif new.created_at >= timestamp '2007-01-01 00:00:00' and
>        new.created_at < timestamp '2007-02-01 00:00:00' then
>    insert into foobars_200701 values(new.*);
>  elsif ...
>  else
>    raise exception 'No partition for timestamp %', new.created_at;
>  end if;
>  return null;
> end$$;
>
> create trigger foobars_insert before insert on foobars
> for each row execute procedure foobars_insert();
>
> Obviously you have to adjust the function definition every time
> you add or remove a partition, but you'll have a script for that
> anyway, no?
>
> BTW, I think using "new.*" this way only works as of 8.2.

In reply to myself, mainly so that when people search archives they may
benefit from the end result, I am including here the solution I finally
cam up with to provide lazy support for partitioned data insertion into
tables. The code shown here will automatically create a partition based on
the year/month, inserting the data into the partition. The column from the
insert is defined by the trigger call, defaulting to the year/month found
by gmtime (this could result in a small race problem when relying on the
default).

If you have suggestions about ways to improve this, or thoughts like OMG,
that code will be painfully high in overhead, please let me know how you
would approach the problem (short of hard-coding all partitions).

- Marc

CREATE OR REPLACE FUNCTION yearmonth_to_timestamp(INTEGER,INTEGER) RETURNS
TIMESTAMP AS $$
   BEGIN RETURN ($1 || '-' || lpad($2,2,0) || '-01 00:00:00')::timestamp;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION create_date_partition(TIMESTAMP,TEXT,TEXT)
RETURNS VOID as $$
   DECLARE
     date        ALIAS FOR $1;
     table_name  ALIAS FOR $2;
     date_column ALIAS FOR $3;
     year INTEGER := EXTRACT(year FROM date);
     month INTEGER := EXTRACT(month FROM date);
     ts TIMESTAMP := yearmonth_to_timestamp(year,month);
     te TIMESTAMP;
   BEGIN
     IF month < 12 THEN
       te := yearmonth_to_timestamp(year,month+1);
     ELSE
       te := yearmonth_to_timestamp(year+1,1);
     END IF;
     EXECUTE 'CREATE TABLE ' || table_name || '_' || year || lpad(month,2,0) || '(
       CHECK (' || quote_ident(date_column) || ' >= ' || quote_literal(ts) || ' and ' ||
                   quote_ident(date_column) || ' < ' || quote_literal(te) || ')
     ) INHERITS (' || table_name || ')';
   END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION insert_date_partition() RETURNS TRIGGER AS $$
   my %values = %{$_TD->{new}};
   my $table_name = $_TD->{table_name};
   my $column_name = $_TD->{args}[0];
   my $date = $values{$column_name};
   my ($year,$month,$day);
   if (!defined($date)) {
     ($year,$month,$day) = (gmtime)[5,4,3];
     $year += 1900;
     $month += 1;
     $date = sprintf "%d %02d %02d",$year,$month,$day;
   } else {
     ($year,$month) = (split(/(\d\d\d\d)[-\/ ](\d\d?)[-\/ ]/,$date))[1,2];
   }
   my $partition = $table_name . '_' . sprintf "%04d%02d",$year,$month;
   my $columns = join(',',keys(%values));
   my $data = join(',',map { $_ =~ s/'/''/g; defined($_) ? "'$_'" : 'NULL';
} values(%values));
   $insert = "INSERT INTO $partition ($columns) VALUES ($data)";
   eval { spi_exec_query($insert); };
   if ($@) {
     my $create = "SELECT create_date_partition('$date'::timestamp,'$table_name'::text,'$column_name'::text)";
     spi_exec_query($create);
     spi_exec_query($insert);
   }
   return SKIP;
$$ LANGUAGE plperlu;
-- plperlu is used instead of plperl above because of the eval.

-- An example trigger would be:
CREATE TRIGGER test_table_partition_trigger
   BEFORE INSERT ON test_table
   FOR EACH ROW EXECUTE PROCEDURE insert_date_partition('created_at');