Thread: Accessing schema data in information schema
I'm updating the information schema for SQL:2003. I'm having some difficulties with the "sequences" view. It should look approximately like this (uninteresting stuff omitted): CREATE VIEW sequences AS SELECT CAST(current_database() AS sql_identifier) AS sequence_catalog, CAST(nc.nspname ASsql_identifier) AS sequence_schema, CAST(c.relname AS sql_identifier) AS sequence_name, CAST(null AS cardinal_number)AS maximum_value, -- FIXME CAST(null AS cardinal_number) AS minimum_value, -- FIXME CAST(nullAS cardinal_number) AS increment, -- FIXME CAST(null AS character_data) AS cycle_option -- FIXME FROMpg_namespace nc, pg_class c WHERE c.relnamespace = nc.oid AND c.relkind = 's'; How does one get at the missing fields. The only way I know is selecting from the sequence, but how does one work this into this query? Somehow it seems that these things should be stored in a real system catalog. Ideas (short of using PERFORM in PL/pgSQL)? -- Peter Eisentraut http://developer.postgresql.org/~petere/
Peter Eisentraut <peter_e@gmx.net> writes: > How does one get at the missing fields. The only way I know is > selecting from the sequence, but how does one work this into this > query? Somehow it seems that these things should be stored in a real > system catalog. Yeah. I've occasionally toyed with the idea that sequences should be rows in a single catalog instead of independent tables as they are now. This would make for a much smaller disk footprint (with consequent I/O savings) and would solve problems like the one you have. Unfortunately the backward-compatibility issues seem a bit daunting :-(. It's probably not completely impossible, but how do we preserve the existing behavior that you can "SELECT * FROM seqname" and get the parameters? Ideally I'd likeSELECT * FROM seqname; -- gets params of one sequenceSELECT * FROM pg_sequence; -- gets paramsof all sequences One possible kluge is to make all the sequences be child tables of a pg_sequence catalog that exists only to be their inheritance parent. This seems pretty ugly from a performance point of view though. Selecting from pg_sequence would be really expensive if you have a lot of sequences, and there wouldn't be any opportunity for reducing the disk footprint. (Thinks a bit...) Maybe it would work for pg_sequence to be a real catalog with a row per sequence, and we also create a view named after the sequence that simply selects from pg_sequence with an appropriate WHERE condition. Plan C would be to say that we don't need to preserve "SELECT * FROM seqname", but I'll bet there would be some hollering. regards, tom lane
Ühel kenal päeval, K, 2006-03-22 kell 16:11, kirjutas Tom Lane: > Peter Eisentraut <peter_e@gmx.net> writes: > > How does one get at the missing fields. The only way I know is > > selecting from the sequence, but how does one work this into this > > query? Somehow it seems that these things should be stored in a real > > system catalog. > > Yeah. I've occasionally toyed with the idea that sequences should be > rows in a single catalog instead of independent tables as they are now. > This would make for a much smaller disk footprint (with consequent I/O > savings) and would solve problems like the one you have. Would it not make page locking problems much worse with all get_next()'s competeing to update the same page? At least unless you reserve one page for each sequence. ------------- Hannu
Hannu Krosing <hannu@skype.net> writes: > Ühel kenal päeval, K, 2006-03-22 kell 16:11, kirjutas Tom Lane: >> Yeah. I've occasionally toyed with the idea that sequences should be >> rows in a single catalog instead of independent tables as they are now. >> This would make for a much smaller disk footprint (with consequent I/O >> savings) and would solve problems like the one you have. > Would it not make page locking problems much worse with all get_next()'s > competeing to update the same page? Well, there'd be at most about 80 sequences per page (ballpark estimate remembering that we'd still want to store a sequence name) and the reduction in demand for shared buffers might outweigh the increased contention for any one buffer. I haven't seen any examples where get_next is the key source of contention anyhow. A last point is that in simple cases where the contention is all on one sequence, you're going to have that problem anyway. > At least unless you reserve one page for each sequence. Which is exactly what I don't want. But we could imagine padding the tuples to achieve any particular tuples/page ratio we want, if 80 proves to be uncomfortably many. regards, tom lane
Ühel kenal päeval, K, 2006-03-22 kell 17:29, kirjutas Tom Lane: > Hannu Krosing <hannu@skype.net> writes: > > Ühel kenal päeval, K, 2006-03-22 kell 16:11, kirjutas Tom Lane: > >> Yeah. I've occasionally toyed with the idea that sequences should be > >> rows in a single catalog instead of independent tables as they are now. > >> This would make for a much smaller disk footprint (with consequent I/O > >> savings) and would solve problems like the one you have. > > > Would it not make page locking problems much worse with all get_next()'s > > competeing to update the same page? > > Well, there'd be at most about 80 sequences per page (ballpark estimate > remembering that we'd still want to store a sequence name) and the > reduction in demand for shared buffers might outweigh the increased > contention for any one buffer. I haven't seen any examples where get_next > is the key source of contention anyhow. Probably true. I can't think of one right now either. And we have caching to solve these cases. > A last point is that in simple > cases where the contention is all on one sequence, you're going to have > that problem anyway. > > > At least unless you reserve one page for each sequence. > > Which is exactly what I don't want. But we could imagine padding the > tuples to achieve any particular tuples/page ratio we want, if 80 proves > to be uncomfortably many. I guess we can't easily start locking some subarea of a page, say 256 byte subpage, or just the tuple. OTOH it may be possible as we don't need to lock page header for sequences as the tuple is updated in place and will not change in size. OTOOH, I'm afraid we still need to WAL the whole page, so the savings will be marginal. ------------ Hannu
Hannu Krosing wrote: > I guess we can't easily start locking some subarea of a page, say 256 > byte subpage, or just the tuple. > OTOH it may be possible as we don't need to lock page header for > sequences as the tuple is updated in place and will not change in size. Huh, we _can_ lock individual tuples, using LockTuple() (or rather, heap_lock_tuple). Since the tuple is modified in place, there's no need to lock the whole page. > OTOOH, I'm afraid we still need to WAL the whole page, so the savings > will be marginal. Huh, why? We can just keep the current WAL logging for sequences, or something very similar, can't we? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera <alvherre@commandprompt.com> writes: > Hannu Krosing wrote: >> I guess we can't easily start locking some subarea of a page, say 256 >> byte subpage, or just the tuple. > Huh, we _can_ lock individual tuples, using LockTuple() (or rather, > heap_lock_tuple). Since the tuple is modified in place, there's no need > to lock the whole page. But heap_lock_tuple is pretty expensive and subject to deadlocks. I think getting the buffer content lock on the page will still be the right thing. >> OTOOH, I'm afraid we still need to WAL the whole page, so the savings >> will be marginal. > Huh, why? We can just keep the current WAL logging for sequences, or > something very similar, can't we? In the case of the first touch of a sequence page after checkpoint, we'd need to WAL the whole page image to defend against page breaks during write. After that though the WAL entries would be *smaller* than they are now, since there'd be no need to log the entire content of the changed tuple; we'd know we only need to log the counter advance. It's hard to say whether this'd be a win, loss, or wash without testing. It'd probably depend on how many nextval's per checkpoint you want to assume. regards, tom lane
On Wednesday 22 March 2006 13:11, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > How does one get at the missing fields. The only way I know is > > selecting from the sequence, but how does one work this into this > > query? Somehow it seems that these things should be stored in a real > > system catalog. > > Yeah. I've occasionally toyed with the idea that sequences should be > rows in a single catalog instead of independent tables as they are now. > This would make for a much smaller disk footprint (with consequent I/O > savings) and would solve problems like the one you have. Unfortunately > the backward-compatibility issues seem a bit daunting :-(. It's > probably not completely impossible, but how do we preserve the existing > behavior that you can "SELECT * FROM seqname" and get the parameters? > > Ideally I'd like > SELECT * FROM seqname; -- gets params of one sequence > SELECT * FROM pg_sequence; -- gets params of all sequences > > One possible kluge is to make all the sequences be child tables of a > pg_sequence catalog that exists only to be their inheritance parent. > This seems pretty ugly from a performance point of view though. > Selecting from pg_sequence would be really expensive if you have a lot > of sequences, and there wouldn't be any opportunity for reducing the > disk footprint. > > (Thinks a bit...) Maybe it would work for pg_sequence to be a real > catalog with a row per sequence, and we also create a view named after > the sequence that simply selects from pg_sequence with an appropriate > WHERE condition. I'd think that would be a workable solution, with documentation notes that this will be deprecated in favor of information_schema in an upcoming release ? > > Plan C would be to say that we don't need to preserve "SELECT * FROM > seqname", but I'll bet there would be some hollering.? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 6: explain analyze is your friend -- Darcy Buskermolen Wavefire Technologies Corp. http://www.wavefire.com ph: 250.717.0200 fx: 250.763.1759
Darcy Buskermolen <darcy@wavefire.com> writes: > On Wednesday 22 March 2006 13:11, Tom Lane wrote: >> (Thinks a bit...) Maybe it would work for pg_sequence to be a real >> catalog with a row per sequence, and we also create a view named after >> the sequence that simply selects from pg_sequence with an appropriate >> WHERE condition. > I'd think that would be a workable solution, with documentation notes that > this will be deprecated in favor of information_schema in an upcoming > release ? Yeah, we could consider the views a transitional thing, and get rid of them after a release or two. Tell people to change over to either look in the pg_sequence catalog, or use the information_schema view. Does that view expose everything that there is, though, or will we have proprietary extensions that are not in SQL2003? regards, tom lane
Tom Lane said: > Darcy Buskermolen <darcy@wavefire.com> writes: >> On Wednesday 22 March 2006 13:11, Tom Lane wrote: >>> (Thinks a bit...) Maybe it would work for pg_sequence to be a real >>> catalog with a row per sequence, and we also create a view named >>> after the sequence that simply selects from pg_sequence with an >>> appropriate WHERE condition. > >> I'd think that would be a workable solution, with documentation notes >> that this will be deprecated in favor of information_schema in an >> upcoming release ? > > Yeah, we could consider the views a transitional thing, and get rid of > them after a release or two. Tell people to change over to either look > in the pg_sequence catalog, or use the information_schema view. Does > that view expose everything that there is, though, or will we have > proprietary extensions that are not in SQL2003? > What happens to sequence ACLs? cheers andrew
"Andrew Dunstan" <andrew@dunslane.net> writes: > What happens to sequence ACLs? Hm, good point. We could put 'em in pg_sequence, except that most of the operations on pg_sequence rows will be nontransactional, and that doesn't seem to square nicely with transactional updates on ACLs. Maybe we need two catalogs just to separate the transactional and nontransactional data for a sequence? Ugh. regards, tom lane
> Hm, good point. We could put 'em in pg_sequence, except that most of > the operations on pg_sequence rows will be nontransactional, and that > doesn't seem to square nicely with transactional updates on ACLs. > Maybe we need two catalogs just to separate the transactional and > nontransactional data for a sequence? Ugh. Is it possible to have an SRF that can peek into the lastval data and present it, and make no changes to our catalogs at all? Or can't we use in the schema view something like: CREATE VIEW sequences AS SELECT CAST(current_database() AS sql_identifier) AS sequence_catalog, CAST(nc.nspnameAS sql_identifier) AS sequence_schema, CAST(c.relname AS sql_identifier) AS sequence_name, (SELECT seq_info('sequence_name', 'max')) AS maximum_value, (SELECT seq_info('sequence_name', 'min')) AS minimum_value, (SELECT seq_info('sequence_name', 'inc')) AS increment, (SELECT seq_info('sequence_name', 'cycle'))AS cycle_option FROM pg_namespace nc, pg_class c WHERE c.relnamespace = nc.oid AND c.relkind = 's'; Chris
Ühel kenal päeval, K, 2006-03-22 kell 21:50, kirjutas Andrew Dunstan: > Tom Lane said: > > Darcy Buskermolen <darcy@wavefire.com> writes: > >> On Wednesday 22 March 2006 13:11, Tom Lane wrote: > >>> (Thinks a bit...) Maybe it would work for pg_sequence to be a real > >>> catalog with a row per sequence, and we also create a view named > >>> after the sequence that simply selects from pg_sequence with an > >>> appropriate WHERE condition. > > > >> I'd think that would be a workable solution, with documentation notes > >> that this will be deprecated in favor of information_schema in an > >> upcoming release ? > > > > Yeah, we could consider the views a transitional thing, and get rid of > > them after a release or two. Tell people to change over to either look > > in the pg_sequence catalog, or use the information_schema view. Does > > that view expose everything that there is, though, or will we have > > proprietary extensions that are not in SQL2003? > > > > What happens to sequence ACLs? perhaps we can keep pg_class part of seqs and just make the pg_class.relfilenode to point to row oid in pg_sequence table ? ------------- Hannu
On Thu, Mar 23, 2006 at 12:10:54AM +0200, Hannu Krosing wrote: > ??hel kenal p??eval, K, 2006-03-22 kell 16:11, kirjutas Tom Lane: > > Peter Eisentraut <peter_e@gmx.net> writes: > > > How does one get at the missing fields. The only way I know is > > > selecting from the sequence, but how does one work this into this > > > query? Somehow it seems that these things should be stored in a real > > > system catalog. > > > > Yeah. I've occasionally toyed with the idea that sequences should be > > rows in a single catalog instead of independent tables as they are now. > > This would make for a much smaller disk footprint (with consequent I/O > > savings) and would solve problems like the one you have. > > Would it not make page locking problems much worse with all get_next()'s > competeing to update the same page? What about bumping up the default cache setting a bit? Even going to a fairly conservative value, like 10 or 25 would probably make a huge difference. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> Plan C would be to say that we don't need to preserve "SELECT * FROM > seqname", but I'll bet there would be some hollering. I'd like to hear this hollering first, before we create tons of views :-) Imho it is not a problem to remove it, I am for Plan C. (Those with need for the select can still create their view by hand. A release note would be sufficient imho.) Of course if we still need one row in pg_class for the ACL's, that row might as well be a view. Andreas
"Zeugswetter Andreas DCP SD" <ZeugswetterA@spardat.at> writes: > Of course if we still need one row in pg_class for the ACL's, that row > might as well be a view. Yeah, view or view-like thingie. Given the thought that we need both transactional and nontransactional state for a sequence, I'm kind of inclined to leave the transactional data in pg_class. We could still imagine putting the nontransactional state into a new pg_sequence catalog indexed by, say, the pg_class OID of the sequences. OTOH I'm not sure how much that buys for Peter's problem --- it might be better for him just to invent some functions that can grab the required data given the sequence OID. regards, tom lane
Tom Lane wrote: >Given the thought that we need both >transactional and nontransactional state for a sequence, I'm kind of >inclined to leave the transactional data in pg_class. We could still >imagine putting the nontransactional state into a new pg_sequence >catalog indexed by, say, the pg_class OID of the sequences. OTOH I'm >not sure how much that buys for Peter's problem --- it might be better >for him just to invent some functions that can grab the required data >given the sequence OID. > > > Yes, this seems a lot of lifting for a fairly small need. If there aren't other advantages, a simple function or two seems a better way to go, and then there are no legacy problems. cheers andrew