Thread: Is This A Set Based Solution?
Hi- Below is a small test case that illustrates what I'm attempting which is to provide a comma separated list of numbers to a procedure which subsequently uses this list in a join with another table. My questions are is this a set based solution and is this the best approach in terms of using the data types and methods afforded by PostgreSQL? I'm mostly inquiring about the double FOR loop which just doesn't feel right to me and I'd also like to feel that I'm generally on the right track before converting the other 400 procedures from SQL Server 2000 to PostgreSQL. CREATE TYPE fn_return_int4 AS (N int); CREATE TABLE test_table ( id SMALLINT not null, tname varchar(50) not null); INSERT INTO test_table SELECT 1, 'Adams' UNION SELECT 2, 'Baker' UNION SELECT 3, 'Chrysler' UNION SELECT 4, 'Douglas' UNION SELECT 5, 'Everyman'; CREATE OR REPLACE FUNCTION fn_Split_List ( pList TEXT) RETURNS SETOF fn_return_int4 AS $fn_Split_List$ DECLARE v_row fn_return_int4%rowtype; v_list alias for $1; v_delim text := ','; v_arr text[]; BEGIN v_arr := string_to_array(v_list, v_delim); FOR i IN array_lower(v_arr, 1)..array_upper(v_arr, 1) LOOP FOR v_row IN SELECT v_arr[i] LOOP RETURN NEXT v_row; END LOOP; END LOOP; RETURN; END; $fn_Split_List$ LANGUAGE plpgsql; SELECT * FROM fn_Split_List('5,1,3') SL INNER JOIN test_table T ON SL.N=T.ID; I did discover that I was able to define the function with a native type but then the usage looked a little odd: SELECT * FROM fn_Split_List('5,1,3') SL INNER JOIN test_table T ON SL=T.ID; Stefan Berglund www.horseshowtime.com Online Show Entry - Instant Internet Horse Show Schedules and Results stefan@horseshowtime.com tel 714.968.9112 fax 714.968.5940
I don't know if you could change your schema. but I'd consider your problem as a overlapping arrays task and use contrib/intarray for that. Oleg On Fri, 9 Mar 2007, Stefan Berglund wrote: > Hi- > > Below is a small test case that illustrates what I'm attempting which is > to provide a comma separated list of numbers to a procedure which > subsequently uses this list in a join with another table. > > My questions are is this a set based solution and is this the best > approach in terms of using the data types and methods afforded by > PostgreSQL? I'm mostly inquiring about the double FOR loop which just > doesn't feel right to me and I'd also like to feel that I'm generally on > the right track before converting the other 400 procedures from SQL > Server 2000 to PostgreSQL. > > CREATE TYPE fn_return_int4 AS (N int); > > CREATE TABLE test_table ( > id SMALLINT not null, > tname varchar(50) not null); > > INSERT INTO test_table > SELECT 1, 'Adams' > UNION SELECT 2, 'Baker' > UNION SELECT 3, 'Chrysler' > UNION SELECT 4, 'Douglas' > UNION SELECT 5, 'Everyman'; > > CREATE OR REPLACE FUNCTION fn_Split_List ( > pList TEXT) RETURNS SETOF fn_return_int4 AS $fn_Split_List$ > > DECLARE > v_row fn_return_int4%rowtype; > v_list alias for $1; > v_delim text := ','; > v_arr text[]; > > BEGIN > v_arr := string_to_array(v_list, v_delim); > FOR i IN array_lower(v_arr, 1)..array_upper(v_arr, 1) LOOP > FOR v_row IN SELECT v_arr[i] LOOP > RETURN NEXT v_row; > END LOOP; > END LOOP; > RETURN; > END; > $fn_Split_List$ LANGUAGE plpgsql; > > SELECT * > FROM > fn_Split_List('5,1,3') SL INNER JOIN > test_table T ON SL.N=T.ID; > > I did discover that I was able to define the function with a native type > but then the usage looked a little odd: > > SELECT * > FROM > fn_Split_List('5,1,3') SL INNER JOIN > test_table T ON SL=T.ID; > > Stefan Berglund > www.horseshowtime.com > Online Show Entry - Instant Internet Horse Show Schedules and Results > stefan@horseshowtime.com > tel 714.968.9112 fax 714.968.5940 > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Stefan Berglund <stefan_berglund@msn.com> writes: > Below is a small test case that illustrates what I'm attempting which is > to provide a comma separated list of numbers to a procedure which > subsequently uses this list in a join with another table. > My questions are is this a set based solution and is this the best > approach in terms of using the data types and methods afforded by > PostgreSQL? I'm mostly inquiring about the double FOR loop which just > doesn't feel right to me ... It looks pretty ugly to me too, but you haven't explained your problem clearly enough for anyone to be able to recommend a better solution path. Why do you feel you need to do this? What is the context? regards, tom lane
On Sat, 10 Mar 2007 00:37:08 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote: in <7887.1173505028@sss.pgh.pa.us> >Stefan Berglund <stefan_berglund@msn.com> writes: >> Below is a small test case that illustrates what I'm attempting which is >> to provide a comma separated list of numbers to a procedure which >> subsequently uses this list in a join with another table. > >> My questions are is this a set based solution and is this the best >> approach in terms of using the data types and methods afforded by >> PostgreSQL? I'm mostly inquiring about the double FOR loop which just >> doesn't feel right to me ... > >It looks pretty ugly to me too, but you haven't explained your problem >clearly enough for anyone to be able to recommend a better solution path. >Why do you feel you need to do this? What is the context? I've been lurking for several thousand posts and I'm flattered that you've responded but I'm also a little flustered that I failed to communicate so I'll try again. In SQL Server I was able to pass a string of IDs such as '1,5,3' to a procedure that would create a set of tuples where each tuple was one of those ids: Row 1 : 1 Row 2: 5 Row 3: 3 I could then use this table in a join with another table. What I want to do is to create a function that takes a comma separated string of numbers and produces a table (where each row is one of those numbers) that can be joined to other tables as in the example first provided. fn_Split_List is supposed to take a list of numbers and return a table of rows of those numbers. I hope this better explains what I'm trying to do but somehow from your reaction I get the feeling that I'm missing something really basic? --- This posting is provided "AS IS" with no warranties and no guarantees either express or implied. Stefan Berglund
On Sat, 10 Mar 2007 08:26:32 +0300 (MSK), oleg@sai.msu.su (Oleg Bartunov) wrote: in <Pine.LNX.4.64.0703100824300.400@sn.sai.msu.ru> >I don't know if you could change your schema. but I'd consider your >problem as a overlapping arrays task and use contrib/intarray for that. > >Oleg I can very definitely change my schema at this point. I'm refactoring an application from SQL Server to PostgreSQL and I'm doing a lot of exploring trying to find the best fits. I'll see what I can glean from you've indicated but that sounds like what I'm looking for. The string converts easily to an array but then what's the best way to get from an array to a table? --- This posting is provided "AS IS" with no warranties and no guarantees either express or implied. Stefan Berglund
Stefan Berglund <sorry.no.koolaid@for.me> writes: > On Sat, 10 Mar 2007 00:37:08 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote: >> It looks pretty ugly to me too, but you haven't explained your problem >> clearly enough for anyone to be able to recommend a better solution path. >> Why do you feel you need to do this? What is the context? > What I want to do is to create a function that takes a comma separated > string of numbers and produces a table (where each row is one of those > numbers) that can be joined to other tables as in the example first > provided. That was what you said before. The question is why you need to do that. It strikes me that having such a requirement is a symptom of poor data representation choices. Perhaps an array would be better, or maybe you ought to refactor your table layout altogether. But, as I said, you haven't provided any info that would let someone give advice at that level. regards, tom lane
On Mon, 12 Mar 2007 10:41:21 -0400, tgl@sss.pgh.pa.us (Tom Lane) wrote: in <15437.1173710481@sss.pgh.pa.us> >Stefan Berglund <sorry.no.koolaid@for.me> writes: >> On Sat, 10 Mar 2007 00:37:08 -0500, tgl@sss.pgh.pa.us (Tom Lane) wrote: >>> It looks pretty ugly to me too, but you haven't explained your problem >>> clearly enough for anyone to be able to recommend a better solution path. >>> Why do you feel you need to do this? What is the context? > >> What I want to do is to create a function that takes a comma separated >> string of numbers and produces a table (where each row is one of those >> numbers) that can be joined to other tables as in the example first >> provided. > >That was what you said before. The question is why you need to do that. >It strikes me that having such a requirement is a symptom of poor data >representation choices. Perhaps an array would be better, or maybe you >ought to refactor your table layout altogether. But, as I said, you >haven't provided any info that would let someone give advice at that >level. Perhaps it is a case of poor data representation choices and that is exactly why I posted originally - because I wasn't sure if that was the best way of doing what I want to do: I have an app where the user makes multiple selections from a list. I can either construct a huge WHERE clause such as SELECT blah blah FROM foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could alternatively pass the string of IDs ('53016,27,292,512') to a table returning function which TABLE is then JOINed with the table I wish to query instead of using the unwieldy WHERE clause. The latter strikes me as a far more scalable method since it eliminates having to use dynamic SQL to construct the ridiculously long WHERE clause which will no doubt ultimately bump up against parser length restrictions or some such. I didn't find any examples that showed JOINing an array with a table. How do other developers solve this basic problem and why does my approach seem so foreign? SELECT blah blah FROM fn_Split_List('53016,27,292,512') SL INNER JOIN foo T ON SL.N=T.ID; or SELECT blah blah FROM foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) --- Stefan Berglund
Stefan Berglund wrote: > foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could > alternatively pass the string of IDs ('53016,27,292,512') to a table > returning function which TABLE is then JOINed with the table I wish to Stefan, The user selections will be in some sort of list. Could you not use WHERE ID IN (the list)? Regards, George
On Mon, Mar 12, 2007 at 11:15:01 -0700, Stefan Berglund <sorry.no.koolaid@for.me> wrote: > > I have an app where the user makes multiple selections from a list. I > can either construct a huge WHERE clause such as SELECT blah blah FROM > foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could > alternatively pass the string of IDs ('53016,27,292,512') to a table > returning function which TABLE is then JOINed with the table I wish to > query instead of using the unwieldy WHERE clause. The latter strikes me > as a far more scalable method since it eliminates having to use dynamic > SQL to construct the ridiculously long WHERE clause which will no doubt > ultimately bump up against parser length restrictions or some such. How big is huge? If the list of IDs is in the 1000s or higher, then it may be better to load the data into a temp table and ANALYSE it before running your query. Otherwise, for smaller lists the IN suggestion should work well in recent versions.
Bruno Wolff III schrieb: > On Mon, Mar 12, 2007 at 11:15:01 -0700, > Stefan Berglund <sorry.no.koolaid@for.me> wrote: >> I have an app where the user makes multiple selections from a list. I >> can either construct a huge WHERE clause such as SELECT blah blah FROM >> foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could >> alternatively pass the string of IDs ('53016,27,292,512') to a table >> returning function which TABLE is then JOINed with the table I wish to >> query instead of using the unwieldy WHERE clause. The latter strikes me >> as a far more scalable method since it eliminates having to use dynamic >> SQL to construct the ridiculously long WHERE clause which will no doubt >> ultimately bump up against parser length restrictions or some such. > > How big is huge? > If the list of IDs is in the 1000s or higher, then it may be better to > load the data into a temp table and ANALYSE it before running your query. > Otherwise, for smaller lists the IN suggestion should work well in recent > versions. Show me a user which really clicks on 1000 or more checkboxes on a webpage or similar ;) I'd think around 20 values is plenty. Regards Tino
Tino Wildenhain <tino@wildenhain.de> writes: > Show me a user which really clicks on 1000 or more checkboxes on a > webpage or similar ;) > I'd think around 20 values is plenty. On the other hand, show me a page with 1000 or more checkboxes to be clicked at once and I'd show a developer / designer that needs a new career... :-) -- Jorge Godoy <jgodoy@gmail.com>
On Thu, 15 Mar 2007 09:47:27 -0500, gweaver@shaw.ca (George Weaver) wrote: in <008001c76710$da487db0$6400a8c0@Dell4500> > >Stefan Berglund wrote: > >> foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could >> alternatively pass the string of IDs ('53016,27,292,512') to a table >> returning function which TABLE is then JOINed with the table I wish to > >The user selections will be in some sort of list. Could you not use WHERE >ID IN (the list)? Coming from SQL Server where that is not allowed, it didn't occur to me that PostgreSQL would allow a substitutable parameter in the IN clause. However, it seems that it can't be done in this fashion without using dynamic SQL unless I'm missing something. I tried this: create or replace function foo(plist TEXT) RETURNS SETOF Show_Entries as $$ SELECT * FROM Show_Entries WHERE Show_ID = 1250 AND Show_Number IN ($1); $$ LANGUAGE sql; When I use select * from foo('101,110,115,120'); I get no results. When I use select * from foo(101,110,115,120); I get the correct results. At any rate, I'm happy with what I've come up with and so far performance is excellent: CREATE TABLE test_table ( id int not null, tname varchar(50) not null); INSERT INTO test_table SELECT 1, 'Adams' UNION SELECT 2, 'Baker' UNION SELECT 3, 'Chrysler' UNION SELECT 4, 'Douglas' UNION SELECT 5, 'Everyman'; CREATE OR REPLACE FUNCTION foo ( pList TEXT) RETURNS SETOF INTEGER AS $foo$ DECLARE v_arr text[]; BEGIN v_arr := string_to_array($1, ','); FOR i IN array_lower(v_arr, 1)..array_upper(v_arr, 1) LOOP RETURN NEXT v_arr[i]::int; END LOOP; RETURN; END; $foo$ LANGUAGE plpgsql; SELECT * FROM foo('5,1,3') SL INNER JOIN test_table T ON SL=T.ID; SELECT * FROM foo('52001,17,22,42,47') ORDER BY foo; --- Stefan Berglund
On Thu, 15 Mar 2007 15:46:33 -0500, bruno@wolff.to (Bruno Wolff III) wrote: in <20070315204633.GA2156@wolff.to> >On Mon, Mar 12, 2007 at 11:15:01 -0700, > Stefan Berglund <sorry.no.koolaid@for.me> wrote: >> >> I have an app where the user makes multiple selections from a list. I >> can either construct a huge WHERE clause such as SELECT blah blah FROM >> foo WHERE (ID = 53016 OR ID = 27 OR ID = 292 OR ID = 512) or I could >> alternatively pass the string of IDs ('53016,27,292,512') to a table >> returning function which TABLE is then JOINed with the table I wish to >> query instead of using the unwieldy WHERE clause. The latter strikes me >> as a far more scalable method since it eliminates having to use dynamic >> SQL to construct the ridiculously long WHERE clause which will no doubt >> ultimately bump up against parser length restrictions or some such. > >How big is huge? >If the list of IDs is in the 1000s or higher, then it may be better to >load the data into a temp table and ANALYSE it before running your query. >Otherwise, for smaller lists the IN suggestion should work well in recent >versions. Sorry, huge was an exaggeration. I doubt it would ever approach 1000 - more like a couple hundred. I'll look at it a little closer. --- Stefan Berglund
On Fri, 16 Mar 2007 09:38:52 -0300, jgodoy@gmail.com (Jorge Godoy) wrote: in <877ithjrc3.fsf@gmail.com> >Tino Wildenhain <tino@wildenhain.de> writes: > >> Show me a user which really clicks on 1000 or more checkboxes on a >> webpage or similar ;) >> I'd think around 20 values is plenty. > >On the other hand, show me a page with 1000 or more checkboxes to be clicked >at once and I'd show a developer / designer that needs a new career... :-) Just to allay your fears, a fairly typical scenario might have the user presented with a list of from twenty to fifty names of horse trainers depending on the size of the show. Since each trainer can have anywhere from one to thirty or forty horses in their barn you can do the math to see that the list of IDs passed from the app to the database can be anywhere from a single ID up to possibly thousands of IDs. --- Stefan Berglund
On Sat, 10 Mar 2007 08:26:32 +0300 (MSK), oleg@sai.msu.su (Oleg Bartunov) wrote: in <Pine.LNX.4.64.0703100824300.400@sn.sai.msu.ru> >I don't know if you could change your schema. but I'd consider your >problem as a overlapping arrays task and use contrib/intarray for that. That's a nice piece of work, Oleg, and extremely quick. I played with it and pored over the docs but it just seems to keep coming back to the fact that all of the array type manipulations are column based as opposed to row based. In fact, this from section 8.10.5 of the docs confirms it: Tip Arrays are not sets; searching for specific array elements may be a sign of database misdesign. Consider using a separate table with a row for each item that would be an array element. This will be easier to search, and is likely to scale up better to large numbers of elements. I was able to clean up the function I originally posted removing the extraneous LOOP and I'm more than happy with the performance. What's funny is that the function as it now stands is what I initially obtained by googling, but I mistakenly added the extra loop. :-) What I finally came up with is here: <r7viv2hr16uvmsb4tti398rr8oce4e1nvm@4ax.com> --- Stefan Berglund
On Thu, Mar 15, 2007 at 10:26:48AM -0700, Stefan Berglund wrote: > that PostgreSQL would allow a substitutable parameter in the IN clause. > However, it seems that it can't be done in this fashion without using > dynamic SQL unless I'm missing something. The substitutable list has to be an array, not a text value. So if the parameter is specified as ARRAY OF INTEGER, you can call it like: SELECT foo(ARRAY[1,2,3,45]); SELECT foo('{1,2,3,4,56}'); Hope this helps, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate.
Attachment
Stefan Berglund <sorry.no.koolaid@for.me> writes: > I tried this: > create or replace function foo(plist TEXT) > RETURNS SETOF Show_Entries as $$ > SELECT * > FROM Show_Entries > WHERE Show_ID = 1250 AND Show_Number IN ($1); > $$ LANGUAGE sql; > When I use select * from foo('101,110,115,120'); I get no results. When > I use select * from foo(101,110,115,120); I get the correct results. Just for the record, the reason that didn't work is that Postgres saw it as a comparison to a single scalar IN-list item. What you had was effectively WHERE Show_ID = 1250 AND Show_Number::text IN ('101,110,115,120'); which of course will fail to find any rows. In recent releases (8.2 for sure, don't remember if 8.1 can do this efficiently) you could instead do create or replace function foo(plist int[]) RETURNS SETOF Show_Entries as $$ SELECT * FROM Show_Entries WHERE Show_ID = 1250 AND Show_Number IN ($1); $$ LANGUAGE sql; select * from foo(array[101,110,115,120]); regards, tom lane