Re: coalesce in plpgsql, and other style questions - Mailing list pgsql-novice
From | Jeff Davis |
---|---|
Subject | Re: coalesce in plpgsql, and other style questions |
Date | |
Msg-id | 1339543839.12295.102.camel@sussancws0025 Whole thread Raw |
In response to | coalesce in plpgsql, and other style questions (Ross Boylan <ross@biostat.ucsf.edu>) |
Responses |
Re: coalesce in plpgsql, and other style questions
|
List | pgsql-novice |
On Tue, 2012-06-12 at 10:46 -0700, Ross Boylan wrote: > I just wrote my first pl/pgsql function, and would appreciate any > comments people have on it. I'll be writing a bunch of similar > functions, with semantics "give me the id of the object if exists, > otherwise create it and give me the id." > > My solution seems excessively procedural to me. I think what you're trying to do is procedural, and that's not necessarily bad. You are trying to get an existing ID if available, or assign a new ID if not; and I assume you are doing so to save the space associated with storing the full names (essentially like dictionary encoding). > I thought I could get > the right semantics with something like > select coalesce((select id from mytable where name='foo'), > (insert into mytable ('name') values('foo') returning id)) > but I could not get that to work in plgsql. In pl/pgsql, you can't just use SELECT, you have to use something like SELECT INTO the_id ... or use PERFORM. Also, you have a potential race condition there, because someone might insert "foo" into mytable after the select (first arg to coalesce) and before the insert (second arg to coalesce). > Also, I wonder if calling a column 'name' is asking for trouble. I recommend against it. > The actual function is a little more complicated because the table has > the possibility of canonical entries with other entries pointing to > them, I use 'did' for database id to avoid confusion with various other > message ids the system uses; it's for tracking emails. > > Here's the code. As I said, I'd love comments on any aspect of it. > > /* gom = get or make functions retrieve did if available, otherwise create record > and return its did. */ > > create or replace function gom_hostids(IN hostname text, IN canonicalname text = NULL, > OUT hostid bigint, OUT canonicalid bigint) language plpgsql as $$ Sometimes it's good to declare your variables with names like in_hostname or out_canonicalid, to prevent confusion reading similarly-named variables coming from the tables. > DECLARE > BEGIN > select did, canonical into hostid, canonicalid from host > where name = hostname; > if FOUND then > return; > end if; > if canonicalname is not NULL then > select did into canonicalid from host where name = canonicalname; > if not FOUND then > insert into host (name) values(canonicalname) returning did into canonicalid; > end if; > end if; > if hostname != canonical then Is canonical a proper variable here? It's not in the argument list, and it's not DECLAREd. Did you mean canonicalname? > insert into host (name, canonical) values(hostname, canonicalid) > returning did into hostid; > else > hostid := canonicalid; I would recommend keeping more explicit track of NULL values and what should happen to them. It looks like a NULL canonicalname would fall into this else branch (I could be mistaken), but it's a little hard to follow exactly what you want to happen. A few comments would help. > return; > END > $$; Again, you have a potential race between checking for the canonicalname in host, and inserting it into host when it doesn't exist. If another transaction runs in between those two, you will get duplicates in the host table. > Table definition: > create table host ( > did bigint primary key default (nextval('rb_id_seq')), ---database id, like oid > name text, --IP address legal, usually enclosed in []. More commonly an internet domain. > -- domain is a sql keyword and so I used host. > canonical bigint references host (did) --if not null, preferred name for host > --may refer to self > ) inherits (RBObject); > > I'm using Posgresql 8.4. I recommend a unique constraint on host.name. That will catch the race conditions I mentioned above, and turn them into errors. You can catch those errors in plpgsql, allowing you to do what you want in that case. Regards, Jeff Davis
pgsql-novice by date: