Thread: Feature request: temporary schemas
Hi, I'd like to propose the ability to create temporary schemas. Unlike temporary tables, this feature would enable developers to create a temporary schema once and execute CREATE TABLE statements without the TEMPORARY parameter. This would facilitate running unit tests, where developers would like to run the same creation script for unit tests and production code but do not wish to parameterize each CREATE TABLE statement (both environments are expected to execute identical scripts). It further enables the use of temporary functions, something which is not possible today (apparently you can hack this too, but there isn't an "official" way of doing so). See http://dba.stackexchange.com/q/76494/4719 for a related discussion. Should I move this discussion to a different mailing list or is this the correct location? Thanks, Gili -- View this message in context: http://postgresql.1045698.n5.nabble.com/Feature-request-temporary-schemas-tp5819001.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On September 14, 2014 5:01:54 PM EDT, cowwoc <cowwoc@bbs.darktech.org> wrote: >Hi, > >I'd like to propose the ability to create temporary schemas. > >This would facilitate running unit tests, where developers would like >to run >the same creation script for unit tests and production code but do not What if you ran a temporary database instead? You can use some combination of PGHOST, PGPORT, PGUSER and PGDATABASE to "pg_ctlstart" an empty DB and then point your unit tests at it. Then you aren't in danger of wiping out your productionDB either.
Hi Nick,
I don't think this would help for three reasons:
On 14/09/2014 5:52 PM, Nick Guenther [via PostgreSQL] wrote:
I don't think this would help for three reasons:
- I'm already planning to run unit tests against a separate (but identical) database than production, so there's no danger of wiping out the production database.
- I need to create a new temporary schema per test, and run 4-10 of tests per second. I'm guessing this wouldn't perform well using "pg_ctl".
- And finally, I'd like to configure all this through JDBC (pure Java). "pg_ctl" would require me to interact with native code.
On 14/09/2014 5:52 PM, Nick Guenther [via PostgreSQL] wrote:
On September 14, 2014 5:01:54 PM EDT, cowwoc <[hidden email]> wrote:
>Hi,
>
>I'd like to propose the ability to create temporary schemas.
>
>This would facilitate running unit tests, where developers would like
>to run
>the same creation script for unit tests and production code but do not
What if you ran a temporary database instead? You can use some combination of PGHOST, PGPORT, PGUSER and PGDATABASE to "pg_ctl start" an empty DB and then point your unit tests at it. Then you aren't in danger of wiping out your production DB either.
--
Sent via pgsql-general mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-generalIf you reply to this email, your message will be added to the discussion below:http://postgresql.1045698.n5.nabble.com/Feature-request-temporary-schemas-tp5819001p5819003.html
View this message in context: Re: Feature request: temporary schemas
Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On September 14, 2014 6:01:15 PM EDT, cowwoc <cowwoc@bbs.darktech.org> wrote: > >On 14/09/2014 5:52 PM, Nick Guenther [via PostgreSQL] wrote: >> >> >> On September 14, 2014 5:01:54 PM EDT, cowwoc <[hidden email] >> </user/SendEmail.jtp?type=node&node=5819003&i=0>> wrote: >> >Hi, >> > >> >I'd like to propose the ability to create temporary schemas. >> > >> >This would facilitate running unit tests, where developers would >like >> >to run >> >the same creation script for unit tests and production code but do >not >> >> What if you ran a temporary database instead? You can use some >> combination of PGHOST, PGPORT, PGUSER and PGDATABASE to "pg_ctl >start" >> an empty DB and then point your unit tests at it. Then you aren't in > >> danger of wiping out your production DB either. >> >> >Hi Nick, > >I don't think this would help for three reasons: > > 1. I'm already planning to run unit tests against a separate (but > identical) database than production, so there's no danger of wiping > out the production database. > 2. I need to create a new temporary schema per test, and run 4-10 of > tests per second. I'm guessing this wouldn't perform well using > "pg_ctl". > 3. And finally, I'd like to configure all this through JDBC (pure > Java). "pg_ctl" would require me to interact with native code. Well, forget pg_ctl then. Can you use DROP DATABASE instead?
On 09/14/2014 02:01 PM, cowwoc wrote: > Hi, > > I'd like to propose the ability to create temporary schemas. > > Unlike temporary tables, this feature would enable developers to create a > temporary schema once and execute CREATE TABLE statements without the > TEMPORARY parameter. > > This would facilitate running unit tests, where developers would like to run > the same creation script for unit tests and production code but do not wish > to parameterize each CREATE TABLE statement (both environments are expected > to execute identical scripts). It further enables the use of temporary > functions, something which is not possible today (apparently you can hack > this too, but there isn't an "official" way of doing so). > > See http://dba.stackexchange.com/q/76494/4719 for a related discussion. > > Should I move this discussion to a different mailing list or is this the > correct location? So from the above link and the discussion here so far I gather you want: 1) A CREATE TEMPORARY SCHEMA that behaves like CREATE TEMPORARY TABLE, where it lasts only for a session and masks any existing schema and contained objects with the same name for the duration. 2) You want to run this on a test database as so you can have 'masked' tests over the permanent schema. Presumably because the tests can be destructive and you do not want change the underlying structure and/or data in your test database. 3) This needs to happen on the order of 4-10 times a second and it needs to be tied to a session, as multiple transactions need to be run in each test. As to whether this is the appropriate list, I would say yes at least for sounding out the idea. The list to address for dealing with the folks that actually would implement the idea I would suggest --hackers. In the meantime, the only way I can see doing this is creating and dropping databases. Not sure that would meet your 4-10 times a second requirement though. > > Thanks, > Gili > -- Adrian Klaver adrian.klaver@aklaver.com
Hi Adrian, Replies below. On 14/09/2014 8:34 PM, Adrian Klaver wrote: > On 09/14/2014 02:01 PM, cowwoc wrote: >> See http://dba.stackexchange.com/q/76494/4719 for a related discussion. > So from the above link and the discussion here so far I gather you want: > > 1) A CREATE TEMPORARY SCHEMA that behaves like CREATE TEMPORARY TABLE, > where it lasts only for a session and masks any existing schema and > contained objects with the same name for the duration. Yes. Tests should not be able to see each other's tables, data (and ideally not even each other's schema). > 2) You want to run this on a test database as so you can have 'masked' > tests over the permanent schema. Presumably because the tests can be > destructive and you do not want change the underlying structure and/or > data in your test database. Not exactly. Each test is responsible for populating its own schema (creating tables, inserting data). The main purpose of using temporary schemas is to ensure that each test runs in isolation so that data from other tests cannot influence the outcome of the test. This ensures test execution/results are 100% reproducible. > 3) This needs to happen on the order of 4-10 times a second and it > needs to be tied to a session, as multiple transactions need to be run > in each test. > > As to whether this is the appropriate list, I would say yes at least > for sounding out the idea. The list to address for dealing with the > folks that actually would implement the idea I would suggest --hackers. > > In the meantime, the only way I can see doing this is creating and > dropping databases. Not sure that would meet your 4-10 times a second > requirement though. One of the requirements is that if someone kills the process running the unit tests, it can't leave behind any dangling schemas. I expect all test data to get dropped automatically when the connection is closed unexpectedly, so DROP DATABASE won't do. Gili
On 09/14/2014 08:21 PM, cowwoc wrote: > Hi Adrian, > > Replies below. > > On 14/09/2014 8:34 PM, Adrian Klaver wrote: >> On 09/14/2014 02:01 PM, cowwoc wrote: >>> See http://dba.stackexchange.com/q/76494/4719 for a related discussion. >> So from the above link and the discussion here so far I gather you want: >> >> 1) A CREATE TEMPORARY SCHEMA that behaves like CREATE TEMPORARY TABLE, >> where it lasts only for a session and masks any existing schema and >> contained objects with the same name for the duration. > > Yes. Tests should not be able to see each other's tables, data (and > ideally not even each other's schema). > >> 2) You want to run this on a test database as so you can have 'masked' >> tests over the permanent schema. Presumably because the tests can be >> destructive and you do not want change the underlying structure and/or >> data in your test database. > > Not exactly. Each test is responsible for populating its own schema > (creating tables, inserting data). The main purpose of using temporary > schemas is to ensure that each test runs in isolation so that data from > other tests cannot influence the outcome of the test. This ensures test > execution/results are 100% reproducible. So the tests may or may not have anything to do with the existing test database? > >> 3) This needs to happen on the order of 4-10 times a second and it >> needs to be tied to a session, as multiple transactions need to be run >> in each test. >> >> As to whether this is the appropriate list, I would say yes at least >> for sounding out the idea. The list to address for dealing with the >> folks that actually would implement the idea I would suggest --hackers. >> >> In the meantime, the only way I can see doing this is creating and >> dropping databases. Not sure that would meet your 4-10 times a second >> requirement though. > > One of the requirements is that if someone kills the process running the > unit tests, it can't leave behind any dangling schemas. I expect all > test data to get dropped automatically when the connection is closed > unexpectedly, so DROP DATABASE won't do. I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the beginning of the test would handle that. > > Gili > > -- Adrian Klaver adrian.klaver@aklaver.com
On 15/09/2014 9:39 AM, Adrian Klaver wrote: >> >> Not exactly. Each test is responsible for populating its own schema >> (creating tables, inserting data). The main purpose of using temporary >> schemas is to ensure that each test runs in isolation so that data from >> other tests cannot influence the outcome of the test. This ensures test >> execution/results are 100% reproducible. > > So the tests may or may not have anything to do with the existing test > database? Hi Adrian, I don't understand what you mean by "the existing test database". In my mind, tests have nothing in common with each other. They are meant to execute in complete isolation of each other. It sounds to me like you thought I create a test database once (containing the tables, functions, triggers used by tests) and then running tests against that one at a time. In actuality, each test is expected to create its own tables, functions, triggers and execute concurrently and in complete isolation with other tests. >> One of the requirements is that if someone kills the process running the >> unit tests, it can't leave behind any dangling schemas. I expect all >> test data to get dropped automatically when the connection is closed >> unexpectedly, so DROP DATABASE won't do. > > I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the > beginning of the test would handle that. This would only clean up the next time tests are run. I'm looking for a cleanup at the end of the tests, not the beginning. As well, the fact that I have concurrent test execution means that I don't know how many databases/schemas there are to drop. I guess I could scan the database metadata for all test-related schemas but clearly this isn't as clean/fun as having temporary schemas in the first place. And lastly, remember that we want these tests to run as fast as possible. TEMPORARY/UNLOGGED tables are ideal from that point of view but I can't specify TEMPORARY/UNLOGGED because the unit tests and production code must share the same SQL script. Gili
Interesting enough concept. Please don't forget to test against a realistic data set as well. It does seem to me that thedevs can easily make, fill, clean up their own db. And a central builder (eg Jenkins?) can do the same with, importantlyusing ALL tests. Then again using real data. > On Sep 15, 2014, at 8:08 AM, cowwoc <cowwoc@bbs.darktech.org> wrote: > > On 15/09/2014 9:39 AM, Adrian Klaver wrote: >>> >>> Not exactly. Each test is responsible for populating its own schema >>> (creating tables, inserting data). The main purpose of using temporary >>> schemas is to ensure that each test runs in isolation so that data from >>> other tests cannot influence the outcome of the test. This ensures test >>> execution/results are 100% reproducible. >> >> So the tests may or may not have anything to do with the existing test database? > > Hi Adrian, > > I don't understand what you mean by "the existing test database". In my mind, tests have nothing in common with each other.They are meant to execute in complete isolation of each other. > > It sounds to me like you thought I create a test database once (containing the tables, functions, triggers used by tests)and then running tests against that one at a time. In actuality, each test is expected to create its own tables, functions,triggers and execute concurrently and in complete isolation with other tests. > > >>> One of the requirements is that if someone kills the process running the >>> unit tests, it can't leave behind any dangling schemas. I expect all >>> test data to get dropped automatically when the connection is closed >>> unexpectedly, so DROP DATABASE won't do. >> >> I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the beginning of the test would handle that. > > This would only clean up the next time tests are run. I'm looking for a cleanup at the end of the tests, not the beginning. > As well, the fact that I have concurrent test execution means that I don't know how many databases/schemas there are todrop. I guess I could scan the database metadata for all test-related schemas but clearly this isn't as clean/fun as havingtemporary schemas in the first place. > > And lastly, remember that we want these tests to run as fast as possible. TEMPORARY/UNLOGGED tables are ideal from thatpoint of view but I can't specify TEMPORARY/UNLOGGED because the unit tests and production code must share the same SQLscript. > > Gili > > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
On 09/15/2014 07:08 AM, cowwoc wrote: > On 15/09/2014 9:39 AM, Adrian Klaver wrote: >>> >>> Not exactly. Each test is responsible for populating its own schema >>> (creating tables, inserting data). The main purpose of using temporary >>> schemas is to ensure that each test runs in isolation so that data from >>> other tests cannot influence the outcome of the test. This ensures test >>> execution/results are 100% reproducible. >> >> So the tests may or may not have anything to do with the existing test >> database? > > Hi Adrian, > > I don't understand what you mean by "the existing test database". In my > mind, tests have nothing in common with each other. They are meant to > execute in complete isolation of each other. From your second post: " 1. I'm already planning to run unit tests against a separate (but identical) database than production, so there's no danger of wiping out the production database. 2. I need to create a new temporary schema per test, and run 4-10 of tests per second. I'm guessing this wouldn't perform well using "pg_ctl". " I took that to mean you want to create the temporary schemas over an existing test database. > > It sounds to me like you thought I create a test database once > (containing the tables, functions, triggers used by tests) and then > running tests against that one at a time. In actuality, each test is > expected to create its own tables, functions, triggers and execute > concurrently and in complete isolation with other tests. > > >>> One of the requirements is that if someone kills the process running the >>> unit tests, it can't leave behind any dangling schemas. I expect all >>> test data to get dropped automatically when the connection is closed >>> unexpectedly, so DROP DATABASE won't do. >> >> I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the >> beginning of the test would handle that. > > This would only clean up the next time tests are run. I'm looking for a > cleanup at the end of the tests, not the beginning. > As well, the fact that I have concurrent test execution means that I > don't know how many databases/schemas there are to drop. I guess I could > scan the database metadata for all test-related schemas but clearly this > isn't as clean/fun as having temporary schemas in the first place. I will admit to not being a testing expert, but from what I have done, I know test suites have setup and teardown sections. Would this not work? My previous suggestion was a fall through for the case you mentioned where a process is terminated outside the test. > > And lastly, remember that we want these tests to run as fast as > possible. TEMPORARY/UNLOGGED tables are ideal from that point of view > but I can't specify TEMPORARY/UNLOGGED because the unit tests and > production code must share the same SQL script. I can see that being possible. There will always be some difference though as unit tests are not the same as rolling out production scripts. The test code will need to include the actual test, unless I am missing something obvious. Entirely possible:) > > Gili > > -- Adrian Klaver adrian.klaver@aklaver.com
On 15/09/2014 10:37 AM, Adrian Klaver wrote: > From your second post: > > " 1. I'm already planning to run unit tests against a separate (but > identical) database than production, so there's no danger of wiping > out the production database. > 2. I need to create a new temporary schema per test, and run 4-10 of > tests per second. I'm guessing this wouldn't perform well using > "pg_ctl". > " > > I took that to mean you want to create the temporary schemas over an > existing test database. Sorry, I am having problems with the Postgresql terminology. H2 "database" = Postgresql "schema". To answer your original question: ideally I would want each test to run in its own database but in practice I don't think you can create/drop a database through JDBC, can you? So instead, I'm thinking about manually creating a test database and have unit tests create/drop schemas inside of that. The test database wouldn't contain anything. It would just isolate unit tests from the production database to make sure they don't clobber each other. If I could get each unit test to run inside its own database, that would be even better. >>> I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the >>> beginning of the test would handle that. >> >> This would only clean up the next time tests are run. I'm looking for a >> cleanup at the end of the tests, not the beginning. >> As well, the fact that I have concurrent test execution means that I >> don't know how many databases/schemas there are to drop. I guess I could >> scan the database metadata for all test-related schemas but clearly this >> isn't as clean/fun as having temporary schemas in the first place. > > I will admit to not being a testing expert, but from what I have done, > I know test suites have setup and teardown sections. > > Would this not work? > > My previous suggestion was a fall through for the case you mentioned > where a process is terminated outside the test. Right, dropping databases in the teardown section is certainly doable. It's just that the teardown section never runs if the unit tests are terminated. > >> And lastly, remember that we want these tests to run as fast as >> possible. TEMPORARY/UNLOGGED tables are ideal from that point of view >> but I can't specify TEMPORARY/UNLOGGED because the unit tests and >> production code must share the same SQL script. > > I can see that being possible. There will always be some difference > though as unit tests are not the same as rolling out production > scripts. The test code will need to include the actual test, unless I > am missing something obvious. Entirely possible:) Unit tests proceed as follows: 1. Run SQL script to populate the test schema (identical for all tests) 2. Run test code (different for each test) I want unit tests and production to share #1 but have different #2. Currently I am forced to produce nearly identical #1 for production and unit tests. This feature request would enable me to have identical #1 :) Gili
On 09/15/2014 08:05 AM, cowwoc wrote: > On 15/09/2014 10:37 AM, Adrian Klaver wrote: >> From your second post: >> >> " 1. I'm already planning to run unit tests against a separate (but >> identical) database than production, so there's no danger of wiping >> out the production database. >> 2. I need to create a new temporary schema per test, and run 4-10 of >> tests per second. I'm guessing this wouldn't perform well using >> "pg_ctl". >> " >> >> I took that to mean you want to create the temporary schemas over an >> existing test database. > > Sorry, I am having problems with the Postgresql terminology. H2 > "database" = Postgresql "schema". > > To answer your original question: ideally I would want each test to run > in its own database but in practice I don't think you can create/drop a > database through JDBC, can you? So instead, I'm thinking about manually > creating a test database and have unit tests create/drop schemas inside > of that. The test database wouldn't contain anything. It would just > isolate unit tests from the production database to make sure they don't > clobber each other. Well DROP DATABASE is a SQL command so it could be used in a query: http://www.postgresql.org/docs/9.3/interactive/sql-dropdatabase.html The caveat is you cannot DROP the database you are connected to. You would need to connect one of the system databases, postgres for instance, and then run the DROP DATABASE command there. > > If I could get each unit test to run inside its own database, that would > be even better. That would be possible, though some thought would need to be given per your desire to run concurrent tests. Would need to think carefully about naming issues. >>>> I would think a DROP DATABASE IF EXISTS, CREATE DATABASE at the >>>> beginning of the test would handle that. >>> >>> This would only clean up the next time tests are run. I'm looking for a >>> cleanup at the end of the tests, not the beginning. >>> As well, the fact that I have concurrent test execution means that I >>> don't know how many databases/schemas there are to drop. I guess I could >>> scan the database metadata for all test-related schemas but clearly this >>> isn't as clean/fun as having temporary schemas in the first place. >> >> I will admit to not being a testing expert, but from what I have done, >> I know test suites have setup and teardown sections. >> >> Would this not work? >> >> My previous suggestion was a fall through for the case you mentioned >> where a process is terminated outside the test. > > Right, dropping databases in the teardown section is certainly doable. > It's just that the teardown section never runs if the unit tests are > terminated. Just a food for thought item. This is something I think you would have to deal with any case. I think it is plausible that even in the TEMPORARY SCHEMA case a unit test could terminate and leave a connection(session) open and therefore the schema still active. Subsequent runs would create new temp schemas, but you could have old carcasses lingering. My guess is that might end up impacting your test results. > >> >>> And lastly, remember that we want these tests to run as fast as >>> possible. TEMPORARY/UNLOGGED tables are ideal from that point of view >>> but I can't specify TEMPORARY/UNLOGGED because the unit tests and >>> production code must share the same SQL script. >> >> I can see that being possible. There will always be some difference >> though as unit tests are not the same as rolling out production >> scripts. The test code will need to include the actual test, unless I >> am missing something obvious. Entirely possible:) > > Unit tests proceed as follows: > > 1. Run SQL script to populate the test schema (identical for all tests) > 2. Run test code (different for each test) > > I want unit tests and production to share #1 but have different #2. > Currently I am forced to produce nearly identical #1 for production and > unit tests. This feature request would enable me to have identical #1 :) I see the logic. The thing is this is a time issue. What you want would be considered a feature change. Feature changes only happen in new major versions. Currently work has started on the next major version, 9.5. So you would need to convince developers to make the change in the near future in order to get it out with 9.5. Even then you are probably looking at around a year before it hits production. In the meantime you will need a Plan B. Doing the DROP/CREATE DATABASE adds just a little difference to the script in #1. After that everything else would be the same. > > Gili -- Adrian Klaver adrian.klaver@aklaver.com
On 14 September 2014 22:01, cowwoc <cowwoc@bbs.darktech.org> wrote:
session's temporary schema. If you set the current schema with "SET
search_path TO pg_temp", all tables, views, functions and so on will be
created in a temporary schema.
The only downside is that you'd need to reference any function calls with
pg_temp explicitly, as pg_temp is "never searched for function or operator
names" (see
http://www.postgresql.org/docs/current/static/runtime-config-client.html).
Hi,I think you can use "pg_temp" for this - it's an alias to the current
I'd like to propose the ability to create temporary schemas.
Unlike temporary tables, this feature would enable developers to create a
temporary schema once and execute CREATE TABLE statements without the
TEMPORARY parameter.
session's temporary schema. If you set the current schema with "SET
search_path TO pg_temp", all tables, views, functions and so on will be
created in a temporary schema.
The only downside is that you'd need to reference any function calls with
pg_temp explicitly, as pg_temp is "never searched for function or operator
names" (see
http://www.postgresql.org/docs/current/static/runtime-config-client.html).
I have just replied to your Stack Exchange answer too.
Hi guys, I wanted to update you on this topic. Pete provided an excellent answer at http://dba.stackexchange.com/a/76661/4719 that almost worked (we couldn't find a solution for functions) but it turns out that my original problem definition was incorrect. I assumed that my unit tests only use one connection but upon implementing the aforementioned solution I discovered that while the client-side is single threaded, the server-side actually spawns off one database connection per request, where each test fires multiple requests. It's somewhat obvious in retrospect, but I hadn't thought of it :) I further discovered that the database migration tool I am using (Flyway) supports placeholders (http://flywaydb.org/documentation/faq.html#placeholders) so I plan on using that to create UNLOGGED tables for tests and normal tables in production. Thank you for your patience and your thoughtful feedback! Gili -- View this message in context: http://postgresql.1045698.n5.nabble.com/Feature-request-temporary-schemas-tp5819001p5819359.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 14 September 2014 22:01, cowwoc <cowwoc@bbs.darktech.org> wrote:
Hi,
I'd like to propose the ability to create temporary schemas.
Unlike temporary tables, this feature would enable developers to create a
temporary schema once and execute CREATE TABLE statements without the
TEMPORARY parameter.
I think you can use "pg_temp" for this - it's an alias to the current session's temporary schema. If you set the current schema with "SET search_path TO pg_temp", all tables, views, functions and so on will be created in a temporary schema.
The only downside is that you'd need to reference any function calls with pg_temp explicitly, as pg_temp is "never searched for function or operator names" (see http://www.postgresql.org/docs/current/static/runtime-config-client.html).
I have just replied to your Stack Exchange answer too.