Thread: pgstat SRF?
While looking over the statistics-for-functions patch (http://archives.postgresql.org/pgsql-patches/2008-03/msg00300.php), I came back to a thought I've had before - why do we keep one function per column for pgstat functions, instead of using a set returning function? Is there some actual reason for this, or is it just legacy from a time when it was (much) harder to write SRFs?

If there's no actual reason, I think it would be a good idea to base at least newly added views on SRFs instead.

//Magnus
Magnus Hagander <magnus@hagander.net> writes:
> While looking over the statistics-for-functions patch (http://archives.postgresql.org/pgsql-patches/2008-03/msg00300.php), I came back to a thought I've had before - why do we keep one function per column for pgstat functions, instead of using a set returning function? Is there some actual reason for this, or is it just legacy from a time when it was (much) harder to write SRFs?

I think it's so that you can build your own stats views instead of being compelled to select the data someone thought was good for you.

regards, tom lane
Tom Lane wrote:
> I think it's so that you can build your own stats views instead of being compelled to select the data someone thought was good for you.

You can still do that if it's an SRF. You could even make the SRF take an optional argument, so that it either returns a single value (filtered the same way the individual functions are) or, when the argument is NULL, returns the whole table.

It would make the overhead a lot lower in the most common case ("SELECT * FROM pg_stat_<somethingorother>"), while only adding a little in the other cases, I think.

I'm not sure that overhead is big enough to care about in the first place, but if your VIEWs are longish it could be...

//Magnus
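The optional-argument idea in the message above can be modeled outside PostgreSQL. What follows is a minimal Python sketch, not code from the patch: a generator stands in for the set-returning function, and the names `BackendRow`, `SNAPSHOT`, and `stat_get_activity`, as well as the column names and sample values, are all assumptions made up for illustration.

```python
from typing import Iterator, NamedTuple, Optional


class BackendRow(NamedTuple):
    """One row of a hypothetical activity table (illustrative columns only)."""
    pid: int
    usename: str
    current_query: str


# Stand-in for the statistics snapshot; the values are invented.
SNAPSHOT = [
    BackendRow(4101, "alice", "SELECT 1"),
    BackendRow(4102, "bob", "<IDLE>"),
    BackendRow(4103, "alice", "VACUUM"),
]


def stat_get_activity(pid: Optional[int] = None) -> Iterator[BackendRow]:
    """Yield every row when pid is None, otherwise only the matching row."""
    for row in SNAPSHOT:
        if pid is None or row.pid == pid:
            yield row


# A "custom view" still picks only the columns it wants from the full rows.
custom_view = [(row.pid, row.usename) for row in stat_get_activity()]

# A single-backend lookup goes through the same function with the filter set.
single = list(stat_get_activity(4102))
```

In this model the caller keeps the freedom Tom describes, since a custom view is just a projection over the rows the function yields, while the common "give me everything" case is a single call.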
On Apr 21, 2008, at 8:34 AM, Magnus Hagander wrote:
> While looking over the statistics-for-functions patch (http://archives.postgresql.org/pgsql-patches/2008-03/msg00300.php), I came back to a thought I've had before - why do we keep one function per column for pgstat functions, instead of using a set returning function? Is there some actual reason for this, or is it just legacy from a time when it was (much) harder to write SRFs?
>
> If there's no actual reason, I think it would be a good idea to base at least newly added views on SRFs instead.

+1. I could probably use this for pgstats at some point.

--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Magnus Hagander wrote:
> You can still do that if it's an SRF. You could even make the SRF take an optional argument to either return a single value (filtered the same way the individual functions are) or when this one is set to NULL, return the whole table.
>
> It would make the overhead a lot lower in the most common case ("SELECT * FROM pg_stat_<somethingorother>"), while only adding a little in the other cases, I think.
>
> Though I'm not sure that overhead is big enough to care about in the first place, but if your VIEWs are longish it could be...

Actually, looking at this once more, the interface to the functions sucked more than I thought. They don't actually accept procpid as a parameter, just an index into the current array in pgstats. Basically, they're not supposed to be used in any way other than accessing all the rows at once :-)

Attached is a version of the functions required for pg_stat_activity, implemented as an SRF instead of as separate functions. A quick benchmark (grabbing the VIEW 10,000 times on a system with about 500 active backends) shows it's about 20% faster than the function-per-value approach, although querying the view is already very quick as it is today. (And most of the overhead there is most likely comes from reading the stats file.)

However, it also implements the lookup-by-PID functionality that IMHO makes a lot more sense than lookup-by-backend-array-index. This is obviously a lot faster than querying the VIEW for all rows, which might be a big win for monitoring apps that look for info about a single backend.

I'm unsure whether we want to go ahead and convert all functions, but I think we can make a good argument for basing *new* stats views (like the ones about functions that are in the queue) on SRFs instead. It also has the nice side effect of fewer rows in the system tables ;)

Comments?

//Magnus
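To make the difference between lookup-by-array-index and lookup-by-PID concrete, here is a small Python model that only counts function calls. It is a sketch under stated assumptions, not the attached patch: `BACKENDS`, `get_column_by_index`, `find_by_index_scan`, and `get_activity_by_pid` are hypothetical names, and the call counts say nothing about real timings (the 20% figure above comes from the benchmark described in the message, not from this model).

```python
from typing import Dict, List, Optional

# Hypothetical backend slots; a slot index says nothing about the PID in it.
BACKENDS: List[Dict[str, object]] = [
    {"pid": 4101, "usename": "alice", "current_query": "SELECT 1"},
    {"pid": 4102, "usename": "bob", "current_query": "<IDLE>"},
    {"pid": 4103, "usename": "carol", "current_query": "VACUUM"},
]
COLUMNS = ("pid", "usename", "current_query")

# Call counters, so the two access styles can be compared.
calls = {"per_column": 0, "by_pid": 0}


def get_column_by_index(index: int, column: str) -> object:
    """Function-per-column style: one call per (slot, column) pair."""
    calls["per_column"] += 1
    return BACKENDS[index][column]


def find_by_index_scan(pid: int) -> Optional[Dict[str, object]]:
    """The caller only knows slot indexes, so it probes slots until it finds pid."""
    for index in range(len(BACKENDS)):
        if get_column_by_index(index, "pid") == pid:
            return {col: get_column_by_index(index, col) for col in COLUMNS}
    return None


def get_activity_by_pid(pid: int) -> Optional[Dict[str, object]]:
    """Row-returning style: one call returns the whole row for a given PID."""
    calls["by_pid"] += 1
    for backend in BACKENDS:
        if backend["pid"] == pid:
            return dict(backend)
    return None


row_scan = find_by_index_scan(4103)
row_direct = get_activity_by_pid(4103)
```

Both paths return the same row, but the index-based path needs one probe per slot plus one call per column, which is the pattern a monitoring app interested in a single backend would otherwise be stuck with.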
Magnus Hagander wrote:
> Actually, looking at this once more, the interface to the functions sucked more than I thought. They're not actually accepting procpid as parameters, but just an index into the current array in pgstats. Basically, they're not supposed to be used in any other way than accessing all the rows at once :-)
>
> Attached is a version of the functions required for pg_stat_activity implemented as a SRF instead of different functions. A quick benchmark (grabbing the VIEW 10,000 times on a system with about 500 active backends) shows it's about 20% faster than the function-per-value approach, but the runtime per view is still very quick as it is today. (And most of what overhead there is most likely comes from reading the stats file)
>
> However, it also implements the lookup-by-PID functionality that IMHO makes a lot more sense than lookup-by-backend-array-index. This is obviously a lot more performant than querying the VIEW for all rows - something that might be a big win for monitoring apps that look for info about a single backend.
>
> Unsure if we want to go ahead and convert all functions, but I think we can make a good argument for making *new* stats views (like the ones about functions that are in the queue) based on SRFs instead. It also has the nice side-effect of fewer rows in the system tables ;)
>
> Comments?

Unless there are any objections, I'll complete this patch along with the proper documentation, and apply it for now. I will look at converting some further pgstats stuff into SRFs as well, assuming they show similar results.

//Magnus