Thread: Extension table data
I've read the documentation for extensions and how their data is not normally dumped by pg_dump and how to configure the table so it should dump its data http://www.postgresql.org/docs/9.1/static/extend-extensions.html However, after setting this option for the tables, the data is not being dumped when I do a pg_dump of either individual tables or the schema I've installed the extension to. The database I'm working with right now is far too big to do a full pg_dump or pg_dumpall, so I haven't been able to test that. I would assume that doing just the table or schema should work, though? Extension I'm working on: https://github.com/omniti-labs/pg_jobmon Anyone else having this issue or am I doing something wrong? -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. 443.325.1357 x251
Keith Fiske <keith@omniti.com> writes: > I've read the documentation for extensions and how their data is not > normally dumped by pg_dump and how to configure the table so it should > dump its data > http://www.postgresql.org/docs/9.1/static/extend-extensions.html > However, after setting this option for the tables, the data is not > being dumped when I do a pg_dump of either individual tables or the > schema I've installed the extension to. IIRC, the design intention is that such a table's data would be dumped when (and only when) the extension is dumped. That is, I'd expect to see a "CREATE EXTENSION foo;" and then data for the extension's tables. The partial-dump scenarios you mention wouldn't dump extensions, hence not extension table data either. Whether this design is a good one is still under debate, but I think pg_dump is operating as designed here ... regards, tom lane
With the current design, I understand what you're saying now. Just doing some more testing, I was able to do a pg_dump -Fc -s for the entire database and looking through the resulting object list with pg_restore -l I'm actually seeing the extension table data included in the dump file. Doing a restore on the schema I put the extension in, I see the COPY commands to restore the data. I think this is a serious shortcoming, and a confusing state of affairs. The only way to get this data out seems to be to do a full database dump and the schema-only option to pg_dump is outputting data. Looking at the docs, I think the extension authors may have only had configuration data in mind for extension tables. I don't see any reason why we shouldn't be able to put any sort of table in our extensions, some having actual data, not just config. That's actually what I'm doing with my pg_jobmon extension, which could potentially have millions of rows over time. Not having the tables included in the extension definition would just make setting the whole thing up more difficult and error prone. If extensions really are going to be self contained like this as far as dumps, perhaps another option to pg_dump is needed, and have the schema-only or data-only options be honored in that case as well. -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. 443.325.1357 x251 On Fri, Jun 8, 2012 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Keith Fiske <keith@omniti.com> writes: >> I've read the documentation for extensions and how their data is not >> normally dumped by pg_dump and how to configure the table so it should >> dump its data >> http://www.postgresql.org/docs/9.1/static/extend-extensions.html >> However, after setting this option for the tables, the data is not >> being dumped when I do a pg_dump of either individual tables or the >> schema I've installed the extension to. > > IIRC, the design intention is that such a table's data would be dumped > when (and only when) the extension is dumped. That is, I'd expect to > see a "CREATE EXTENSION foo;" and then data for the extension's tables. > The partial-dump scenarios you mention wouldn't dump extensions, hence > not extension table data either. > > Whether this design is a good one is still under debate, but I think > pg_dump is operating as designed here ... > > regards, tom lane
Just found something else rather disturbing. If you try to exclude the schema that the extension tables are in, their data is still output. Explicitly naming other schemas doesn't seem to dump the extension data. So the only way to avoid getting the extension data in a schema-only dump is to explicitly name all schemas but the one your extension is in, which I think is another bug you had actually fixed for 9.1.3 where extension data was always being dumped. -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. 443.325.1357 x251 On Sat, Jun 9, 2012 at 2:56 AM, Keith Fiske <keith@omniti.com> wrote: > With the current design, I understand what you're saying now. Just > doing some more testing, I was able to do a pg_dump -Fc -s for the > entire database and looking through the resulting object list with > pg_restore -l I'm actually seeing the extension table data included in > the dump file. Doing a restore on the schema I put the extension in, I > see the COPY commands to restore the data. I think this is a serious > shortcoming, and a confusing state of affairs. The only way to get > this data out seems to be to do a full database dump and the > schema-only option to pg_dump is outputting data. > > Looking at the docs, I think the extension authors may have only had > configuration data in mind for extension tables. I don't see any > reason why we shouldn't be able to put any sort of table in our > extensions, some having actual data, not just config. That's actually > what I'm doing with my pg_jobmon extension, which could potentially > have millions of rows over time. Not having the tables included in the > extension definition would just make setting the whole thing up more > difficult and error prone. > > If extensions really are going to be self contained like this as far > as dumps, perhaps another option to pg_dump is needed, and have the > schema-only or data-only options be honored in that case as well. > > -- > Keith Fiske > Database Administrator > OmniTI Computer Consulting, Inc. > 443.325.1357 x251 > > > On Fri, Jun 8, 2012 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Keith Fiske <keith@omniti.com> writes: >>> I've read the documentation for extensions and how their data is not >>> normally dumped by pg_dump and how to configure the table so it should >>> dump its data >>> http://www.postgresql.org/docs/9.1/static/extend-extensions.html >>> However, after setting this option for the tables, the data is not >>> being dumped when I do a pg_dump of either individual tables or the >>> schema I've installed the extension to. >> >> IIRC, the design intention is that such a table's data would be dumped >> when (and only when) the extension is dumped. That is, I'd expect to >> see a "CREATE EXTENSION foo;" and then data for the extension's tables. >> The partial-dump scenarios you mention wouldn't dump extensions, hence >> not extension table data either. >> >> Whether this design is a good one is still under debate, but I think >> pg_dump is operating as designed here ... >> >> regards, tom lane
On 2012-06-09 08:56, Keith Fiske wrote: > Looking at the docs, I think the extension authors may have only had > configuration data in mind for extension tables. I don't see any > reason why we shouldn't be able to put any sort of table in our > extensions, some having actual data, not just config. True. We've made several extensions that consist only of table data, such as contents of medical codesystems, where the actual terminology comes from another extension that has functions and empty codesystem tables. I remember there were some issues making it. IIRC the table data would not be dropped on extension drop. regards, Yeb -- Yeb Havinga http://www.mgrid.net/ Mastering Medical Data
Keith Fiske <keith@omniti.com> writes: > Just found something else rather disturbing. If you try to exclude the > schema that the extension tables are in, their data is still output. This is a common misconception: extensions do not live within schemas. (An extension might own a schema, not the other way around.) So a dump with a -n switch is never going to select an extension. By and large, if the current behavior bothers you, ISTM it probably means you are using these tables in a way other than what the concept of an extension configuration table was meant for: namely, to hold configuration data that would be referenced by the functions in that extension, but would not normally be considered part of the user's data. There has been some talk of trying to cater for a more general notion of tables created by extensions, but we do not have a design or even a clear idea of a set of requirements for that. Perhaps it would be good if you explained what is your use-case --- why are you concerned about being able to manage these tables as if they were regular data? regards, tom lane
On Sat, Jun 9, 2012 at 5:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Keith Fiske <keith@omniti.com> writes: >> Just found something else rather disturbing. If you try to exclude the >> schema that the extension tables are in, their data is still output. > > This is a common misconception: extensions do not live within schemas. > (An extension might own a schema, not the other way around.) So a > dump with a -n switch is never going to select an extension. > > By and large, if the current behavior bothers you, ISTM it probably > means you are using these tables in a way other than what the concept of > an extension configuration table was meant for: namely, to hold > configuration data that would be referenced by the functions in that > extension, but would not normally be considered part of the user's data. > There has been some talk of trying to cater for a more general notion of > tables created by extensions, but we do not have a design or even a > clear idea of a set of requirements for that. Perhaps it would be good > if you explained what is your use-case --- why are you concerned about > being able to manage these tables as if they were regular data? > Here's a use case I have been thinking a lot about lately. I am not sure that extensions is the right vehicle for it, but it may be a good starting point. LedgerSMB is moving towards a more modular structure and LedgerSMB 1.5 will probably require PostgreSQL 9.1 or higher (owing to the writable CTE's). I have been looking at how the extension system can be used to simplify our maintenance and it helps quite a bit. However, one key aspect that would be really nice would be managing schema changes along with changing versions of an extension. For example, one might have a fixed asset module, and that module might have a series of stored procedures. We would have some tables that store configuration data regarding the module. For example we might have a table that stores info on stored procedures that track depreciation methods. The use of tables for configuration data in the current approach fits this nicely. However we might have other tables which store data, things like which fixed assets get depreciated in which ways, and the like. These may need to have columns added from time to time, or have other alter table operations performed. It would be nice to be able to manage these schema changes and upgrading the extension in the same framework. I am wondering if some sort of "database modules" framework might be helpful with modules possibly having extensions, but also having data tables. Best Wishes, Chris Travers
Along with Chris, the need for having tables defined in the extension is about keeping control of the table structure so the extension functions are guaranteed to work as intended and to make upgrading versions easier. As an example, the fact that I can specifically name constraints makes it easier to write an extension upgrade script if that constraint needs to change because I know exactly what I called it. Trying to support extensions and requiring that we write long, explicit instructions for creating and maintaining the associated tables is just asking for trouble. Especially when we can have the control we need to avoid these issues. Honestly, the big issue I have right now is that it is dumping data with the schema-only option and only dumping that data if you do a complete database dump. That is making it very difficult to manage extensions even using them as they're documented now. I think its time to recognize the extension system is more widely usable than it was originally intended. And that's a good thing! We just need to try to find ways to make the existing tools work in a more predictable manner now. -- Keith Fiske Database Administrator OmniTI Computer Consulting, Inc. 443.325.1357 x251 On Sat, Jun 9, 2012 at 9:56 AM, Chris Travers <chris.travers@gmail.com> wrote: > On Sat, Jun 9, 2012 at 5:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Keith Fiske <keith@omniti.com> writes: >>> Just found something else rather disturbing. If you try to exclude the >>> schema that the extension tables are in, their data is still output. >> >> This is a common misconception: extensions do not live within schemas. >> (An extension might own a schema, not the other way around.) So a >> dump with a -n switch is never going to select an extension. >> >> By and large, if the current behavior bothers you, ISTM it probably >> means you are using these tables in a way other than what the concept of >> an extension configuration table was meant for: namely, to hold >> configuration data that would be referenced by the functions in that >> extension, but would not normally be considered part of the user's data. >> There has been some talk of trying to cater for a more general notion of >> tables created by extensions, but we do not have a design or even a >> clear idea of a set of requirements for that. Perhaps it would be good >> if you explained what is your use-case --- why are you concerned about >> being able to manage these tables as if they were regular data? >> > Here's a use case I have been thinking a lot about lately. I am not > sure that extensions is the right vehicle for it, but it may be a good > starting point. > > LedgerSMB is moving towards a more modular structure and LedgerSMB 1.5 > will probably require PostgreSQL 9.1 or higher (owing to the writable > CTE's). I have been looking at how the extension system can be used > to simplify our maintenance and it helps quite a bit. However, one > key aspect that would be really nice would be managing schema changes > along with changing versions of an extension. For example, one might > have a fixed asset module, and that module might have a series of > stored procedures. We would have some tables that store configuration > data regarding the module. For example we might have a table that > stores info on stored procedures that track depreciation methods. The > use of tables for configuration data in the current approach fits this > nicely. > > However we might have other tables which store data, things like which > fixed assets get depreciated in which ways, and the like. These may > need to have columns added from time to time, or have other alter > table operations performed. It would be nice to be able to manage > these schema changes and upgrading the extension in the same > framework. > > I am wondering if some sort of "database modules" framework might be > helpful with modules possibly having extensions, but also having data > tables. > > Best Wishes, > Chris Travers