Thread: On the performance of views
I have an application that I'm porting from MSSQL to PostgreSQL. Part of this application consists of hundreds of stored procedures that I need to convert to Postgres functions ... or views? At first I was going to just convert all MSSQL procedures to Postgres functions. But now that I'm looking at it, a lot of them may be candidates for views. A lot of them take on the format of: SELECT a.cola, b.colb, c.colc FROM a JOIN b JOIN c WHERE a.prikey=$1 (this is slightly oversimplified, but as a generalization of hundreds of functions, it's pretty accurate) Now, I know this questions is pretty generalized, and I intend to test before actually commiting to a particular course of action, but I'm very early in the conversion and I'm curious as to whether people with more experience than I think that views will provide better performance than functions containing SQL statements like the above. The client is _very_ interested in performance (just like anyone who needs to market things these days ... apparently, if you admit that something might take a few seconds in your advertising, you're sunk) Any opinions are welcome. Also, if this is a relatively unknown thing, I'd be curious to know that as well, because then it would be worthwhile for me to record and publish my experience. -- Bill Moran Potential Technologies http://www.potentialtech.com
Bill Moran wrote: > I have an application that I'm porting from MSSQL to PostgreSQL. Part > of this > application consists of hundreds of stored procedures that I need to > convert > to Postgres functions ... or views? > > At first I was going to just convert all MSSQL procedures to Postgres > functions. > But now that I'm looking at it, a lot of them may be candidates for > views. A > lot of them take on the format of: > > SELECT a.cola, b.colb, c.colc > FROM a JOIN b JOIN c > WHERE a.prikey=$1 Make sure that you typecase correctly. It makes a differnce of order of magnitude when you say 'where intpkey=<somevalue>::int' rather than 'where intpkey=<somevalue>'. It is called typecasting and highly recommened in postgresql for correctly choosing indexes. I remember another post on some list, which said pl/pgsql seems to be very strongly typed language compared to MSSQL counterpart. So watch for that as well. > > (this is slightly oversimplified, but as a generalization of hundreds of > functions, it's pretty accurate) > > Now, I know this questions is pretty generalized, and I intend to test > before > actually commiting to a particular course of action, but I'm very early > in the > conversion and I'm curious as to whether people with more experience than I > think that views will provide better performance than functions containing > SQL statements like the above. The client is _very_ interested in To my understanding, views are expanded at runtime and considered while preparing plan for the complete (and possibly bigger) query(Consider a view joined with something else). That is not as easy/possible if at all, when it is function. For postgresql query planner, the function is a black box(rightly so, I would say). So using views opens possibility of changing query plans if required. Most of the times that should be faster than using them as functions. Of course, the standard disclaimer, YMMV. Try yourself. Correct me if I am wrong. HTH Shridhar
Bill Moran <wmoran@potentialtech.com> writes: > At first I was going to just convert all MSSQL procedures to Postgres functions. > But now that I'm looking at it, a lot of them may be candidates for views. A > lot of them take on the format of: > SELECT a.cola, b.colb, c.colc > FROM a JOIN b JOIN c > WHERE a.prikey=$1 You'd probably be better off using views, if making that significant a notational change is feasible for you. Functions that return multiple columns are notationally messy in Postgres. A view-based solution would be more flexible and likely have better performance. regards, tom lane
Bill, > > SELECT a.cola, b.colb, c.colc > > FROM a JOIN b JOIN c > > WHERE a.prikey=$1 If your views are simple, PostgreSQL will be able to "push down" any filter criteria into the view itself. For example, CREATE view_a AS SELECT a.cola, b.colb, c.colc FROM a JOIN b JOIN c; SELECT * FROM view_a WHERE a.prikey = 2334432; will execute just like: SELECT a.cola, b.colb, c.colc FROM a JOIN b JOIN c WHERE a.prikey = 2334432; However, this does not work for really complex views, which have to be materialized or executed as a sub-loop. The "Procedures faster than views" thing is a SQL Server peculiarity which is a result of MS's buggering up views since they bought the code from Sybase. > To my understanding, views are expanded at runtime and considered while > preparing plan for the complete (and possibly bigger) query(Consider a view > joined with something else). That is not as easy/possible if at all, when it is > function. For postgresql query planner, the function is a black box(rightly so, > I would say). Well, as of 7.4 SQL functions are inlined. And simple PL/pgSQL functions will be "prepared". So it's possible that either could execute as fast as a view. Also, if your client is really concerned about no-holds-barred speed, you should investigate prepared queries. -- -Josh Berkus Aglio Database Solutions San Francisco
Tom Lane wrote: > Bill Moran <wmoran@potentialtech.com> writes: > >>At first I was going to just convert all MSSQL procedures to Postgres functions. >>But now that I'm looking at it, a lot of them may be candidates for views. A >>lot of them take on the format of: > >>SELECT a.cola, b.colb, c.colc >>FROM a JOIN b JOIN c >>WHERE a.prikey=$1 > > You'd probably be better off using views, if making that significant a > notational change is feasible for you. Functions that return multiple > columns are notationally messy in Postgres. A view-based solution would > be more flexible and likely have better performance. Well, I don't see a huge difference in how the application will be built. Basically, PQexec calls will have a string like "SELECT * FROM view_name WHERE prikey=%i" instead of "SELECT * FROM function_name(%i)" ... which really doesn't make life much more difficult (unless there's something I'm missing?) Thanks for the input, Tom. I'll definately try out views where possible to see if it improves things. -- Bill Moran Potential Technologies http://www.potentialtech.com
Shridhar Daithankar wrote: > Bill Moran wrote: > >> I have an application that I'm porting from MSSQL to PostgreSQL. Part >> of this >> application consists of hundreds of stored procedures that I need to >> convert >> to Postgres functions ... or views? >> >> At first I was going to just convert all MSSQL procedures to Postgres >> functions. >> But now that I'm looking at it, a lot of them may be candidates for >> views. A >> lot of them take on the format of: >> >> SELECT a.cola, b.colb, c.colc >> FROM a JOIN b JOIN c >> WHERE a.prikey=$1 > > Make sure that you typecase correctly. It makes a differnce of order of > magnitude when you say 'where intpkey=<somevalue>::int' rather than > 'where intpkey=<somevalue>'. > > It is called typecasting and highly recommened in postgresql for > correctly choosing indexes. > > I remember another post on some list, which said pl/pgsql seems to be > very strongly typed language compared to MSSQL counterpart. So watch for > that as well. Oh yeah. This particular difference is causing a lot of headaches, as I have to track down the type of each field to create a new type for each function. MSSQL seems to be _very_ loosely typed, in that it will return anything you want and determine the return type at run time. Some functions they prototyped in MSSQL even return different types, based on certian parameters, I'm not sure how I'll do this in Postgres, but I'll have to figure something out. >> (this is slightly oversimplified, but as a generalization of hundreds of >> functions, it's pretty accurate) >> >> Now, I know this questions is pretty generalized, and I intend to test >> before >> actually commiting to a particular course of action, but I'm very >> early in the >> conversion and I'm curious as to whether people with more experience >> than I >> think that views will provide better performance than functions >> containing >> SQL statements like the above. The client is _very_ interested in > > To my understanding, views are expanded at runtime and considered while > preparing plan for the complete (and possibly bigger) query(Consider a > view joined with something else). That is not as easy/possible if at > all, when it is function. For postgresql query planner, the function is > a black box(rightly so, I would say). > > So using views opens possibility of changing query plans if required. > Most of the times that should be faster than using them as functions. > > Of course, the standard disclaimer, YMMV. Try yourself. Thanks. I think I'm going to track the performance of many of these functions and write up what I discover. -- Bill Moran Potential Technologies http://www.potentialtech.com
Bill, > Some functions they prototyped in MSSQL even return different types, based > on certian parameters, I'm not sure how I'll do this in Postgres, but I'll > have to figure something out. We support that as of 7.4.1 to an extent; check out "Polymorphic Functions". -- -Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus wrote: > Bill, > > >>Some functions they prototyped in MSSQL even return different types, based >>on certian parameters, I'm not sure how I'll do this in Postgres, but I'll >>have to figure something out. > > > We support that as of 7.4.1 to an extent; check out "Polymorphic Functions". To my understanding, polymorphism means more than one function with same name but different signature(Sorry C++ days!!). That still can not return rwos of two types in one call. At any moment, rowset returned by a function call would be homogenous. Is MSSQL allows to mix rows of two types in single function invocation, I am sure that would be a hell lot of porting trouble.. Just a thought.. Shridhar
Shridhar Daithankar wrote: > Josh Berkus wrote: > >> Bill, >> >>> Some functions they prototyped in MSSQL even return different types, >>> based >>> on certian parameters, I'm not sure how I'll do this in Postgres, but >>> I'll >>> have to figure something out. >> >> We support that as of 7.4.1 to an extent; check out "Polymorphic >> Functions". > > To my understanding, polymorphism means more than one function with same > name but different signature(Sorry C++ days!!). > > That still can not return rwos of two types in one call. At any moment, > rowset returned by a function call would be homogenous. > > Is MSSQL allows to mix rows of two types in single function invocation, > I am sure that would be a hell lot of porting trouble.. These are two seperate problems. 1) Returning a homogenious set of rows, but the composition of those rows will not be known until run time, as a different set of logic will be done depending on the values of some parameters. 2) Returning what MSSQL calls "combined recordsets", which are many rows, but the rows are not homogenious. As I see it, #1 can be solved by polymorphism in Postgres functions. #2 has to be solved at the application level. My solution so far has been to create multiple Postgres functions, call each one in turn, join the results in C, and return them as a structure via SOAP to the client. May not be the easiest way to get it working, but it's working so far. (although I'm always open to suggestions if someone knows of a better way) -- Bill Moran Potential Technologies http://www.potentialtech.com
Shridhar, Bill, > > Is MSSQL allows to mix rows of two types in single function invocation, > > I am sure that would be a hell lot of porting trouble.. There's also the question of whether or not PG would every want to do this. Frankly, as a once-upon-a-time SQL Server application developer, I found the ability to return multiple rowsets from a single SQL Server procedure pretty useless, and a source of endless debugging if we tried to implement it. > 1) Returning a homogenious set of rows, but the composition of those rows > will not be known until run time, as a different set of logic will be > done depending on the values of some parameters. This can be done with Set Returning Functions. The issue is that the call to the function requires special syntax, and the program calling the function must know what columns are going to be returned at the time of the call. Hmmm, is that clear or confusing? > #2 has to be solved at the application level. My solution so far has > been to create multiple Postgres functions, call each one in turn, join > the results in C, and return them as a structure via SOAP to the client. > May not be the easiest way to get it working, but it's working so far. > (although I'm always open to suggestions if someone knows of a better > way) See my comment above. I frankly don't understand what the use of a non-homogenous recordset is. Can you explain? -- Josh Berkus Aglio Database Solutions San Francisco
Josh Berkus wrote: > Shridhar, Bill, > >>>Is MSSQL allows to mix rows of two types in single function invocation, >>>I am sure that would be a hell lot of porting trouble.. > > There's also the question of whether or not PG would every want to do this. > Frankly, as a once-upon-a-time SQL Server application developer, I found the > ability to return multiple rowsets from a single SQL Server procedure pretty > useless, and a source of endless debugging if we tried to implement it. Well, I would have agreed with the uselessness, until this project. The "source of endless debugging" frightens me! >>1) Returning a homogenious set of rows, but the composition of those rows >> will not be known until run time, as a different set of logic will be >> done depending on the values of some parameters. > > This can be done with Set Returning Functions. The issue is that the call to > the function requires special syntax, and the program calling the function > must know what columns are going to be returned at the time of the call. > Hmmm, is that clear or confusing? Clear as mud. In my case, my application simply doesn't care what row of what kind are returned. See, I'm writing the server end, and all said and done, it's really just glue (frighteningly thick glue, but glue nonetheless) Basically, all I do is call each query in turn until I've collected all the results, then marshall the results in to a SOAP XML response (using gsoap, if anyone's curious) and give them back to the client application. It's the client app's job to figure out what to do with them, not mine. I never would have written it this way on my own, but the client app is already written, so as I migrate it to the client-server model, the programmers who wrote the client app are specifying what they expect me to provide them. The only advantage I see is that combining a number of result sets into a single response reduces the number of round trips between the client and server. If Postgres supported combined recordsets, it would simplify my C code somewhat, and possibly speed up things a bit by making less calls between the soap server and Postgrees ... overall, I don't see a huge advantage to it. >>#2 has to be solved at the application level. My solution so far has >>been to create multiple Postgres functions, call each one in turn, join >>the results in C, and return them as a structure via SOAP to the client. >>May not be the easiest way to get it working, but it's working so far. >>(although I'm always open to suggestions if someone knows of a better >>way) > > See my comment above. I frankly don't understand what the use of a > non-homogenous recordset is. Can you explain? I hope what I already mentioned explains enough. If I understand the application enough (and it's amazing how little I understand about it, considering I'm writing the server end!) what they're doing with these combined recordsets is driving their forms. When a form instantiates, it makes a single soap call that causes me to return one of these non-homogenious recordsets. One row may have data on how to display the form, while another has data on what buttons are available, and another has the actual data for the header of the form, while the remaing rows might have data to fill in the lower (grid) portion of the form. If I had designed this, I would just have done the same thing with a homogenious recordset that had values set to null where they weren't apropriate. This would have bloated the data being transfered, but otherwise would have worked in the same way. Now that I'm aware of MSSQL's combined recordset capability, I'm not sure if I would do it differently or not (were I developing on a system that had that capability) I probably won't have a concept of whether or not I think this is a good idea until this project is further along. -- Bill Moran Potential Technologies http://www.potentialtech.com
Bill Moran wrote: > Basically, all I do is call each query in turn until I've collected all the > results, then marshall the results in to a SOAP XML response (using gsoap, > if anyone's curious) and give them back to the client application. It's > the client app's job to figure out what to do with them, not mine. I > never would have written it this way on my own, but the client app is > already written, so as I migrate it to the client-server model, the > programmers who wrote the client app are specifying what they expect me > to provide them. In C++/OO way, that would have been a array/list of base object pointers with virtual methods for - Obtaining result set - Making a soap/XML out of it - Deliver it - And pre/post processing And delegate the unique query/data handling to specific subclasses. Now that you are doing that in C, you can as well use function pointers to mimic inheritance and virtual functions. The only difference being a. No typechecking. A function pointer in C, won't crib if you pass it wrong number/wrong types of argument. C++ compiler will catch that. b. Maintainance is nightmare unless you thoroughly document it. In C++, glancing over source code might suffice. And no, in-source comment rarely suffice when you use C for such applications. I maintain such an application for day-job and I know what help/menace it can be at times..:-( But either way, asking for/using/maintaining non-homogeneous recordset is a classic three finger salute. Load the gun, point to head and shoot. Just a thought.. Shridhar