Thread: Configure Postgres From SQL
Hi,

Is it possible to configure postgres from SQL?

I am interested in turning off fsync for a set of queries (that take
ages to run) and then turning fsync back on again afterwards.

Cheers,
Tom
On 12 July 2010 14:29, Tom Wilcox <hungrytom@gmail.com> wrote:
> Hi,
>
> Is it possible to configure postgres from SQL?
>
> I am interested in turning off fsync for a set of queries (that take ages to
> run) and then turn fsync back on again afterwards.
>
> Cheers,
> Tom

You can only change that option in postgresql.conf, and I don't see how it
could apply to just a single query anyway. I'd focus instead on optimising
your queries: rewriting them, adding indexes (partial/functional where
appropriate), keeping things VACUUMed, or using prepared statements.

Regards

Thom
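For example, a partial index covers only the rows a query actually touches,
and a functional (expression) index lets the planner use an index for
expression matches. A quick sketch with made-up table and column names:

-- partial index: only rows still marked 'open' are indexed
CREATE INDEX idx_orders_open ON orders (customer_id) WHERE status = 'open';

-- functional index: supports e.g. WHERE lower(postcode) = 'ab1 2cd'
CREATE INDEX idx_addresses_lower_pc ON addresses (lower(postcode));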
> Hi,
>
> Is it possible to configure postgres from SQL?
>
> I am interested in turning off fsync for a set of queries (that take
> ages to run) and then turn fsync back on again afterwards.

There are things that can be changed at runtime using SQL - in that case you
may just type "SET enable_seqscan = off" etc. But you can't change fsync, and
it does not make sense to change that setting for individual queries anyway.

As Thom Brown already pointed out, this is not a good way to tune your
queries. If you don't need to keep consistency (which is the purpose of
fsync), then you may change it directly in postgresql.conf. And if you do
need consistency, you must not change it at all.

Tomas
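To illustrate the difference (the error text below is what the server
reports when a server-level parameter is set from a session):

test=# SET enable_seqscan = off;   -- session-level parameter: allowed
SET
test=# SET fsync = off;            -- server-level parameter: rejected
ERROR:  parameter "fsync" cannot be changed now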
On 12 July 2010 14:50, Tom Wilcox <hungrytom@gmail.com> wrote:
> Hi Thom,
>
> I am performing update statements that are applied to a single table that is
> about 96GB in size. These updates are grouped together in a single
> transaction. This transaction runs until the machine runs out of disk space.
>
> What I am trying to achieve is for postgresql to complete this updating
> transaction without running out of disk space. I assume that this is
> happening because, for a rollback to be possible, postgres must at least
> keep track of the previous values/changes whilst the transaction is not
> complete and committed. I figured this would be the most likely cause for
> us to run out of disk space, and therefore I would like to reconfigure
> postgresql not to hold onto previous copies somehow.
>
> Any suggestions?
>
> Cheers,
> Tom

Hi Tom,

Is it not possible to do these updates in batches, or does it have to be
atomic?

(A small note about replying: please use "reply to all", and on this
mailing list responses should go below.)

Regards

Thom
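Batching could look roughly like this (a sketch only, with a hypothetical
table "big_table" keyed by an integer "id"; each batch commits on its own,
and the VACUUM in between lets dead row versions be reused instead of
bloating the table):

UPDATE big_table SET col = expensive_expr(raw) WHERE id >= 0 AND id < 1000000;
VACUUM big_table;
UPDATE big_table SET col = expensive_expr(raw) WHERE id >= 1000000 AND id < 2000000;
VACUUM big_table;
-- ...and so on across the remaining id ranges

Note that VACUUM cannot run inside a transaction block (or inside a
function), so this has to be driven from a script or psql session rather
than from one big wrapper function.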
On Monday 12 July 2010 6:29:14 am Tom Wilcox wrote:
> Hi,
>
> Is it possible to configure postgres from SQL?

Yes, to a degree. See here:
http://www.postgresql.org/docs/8.4/interactive/functions-admin.html

> I am interested in turning off fsync for a set of queries (that take
> ages to run) and then turn fsync back on again afterwards.

This is one of the options not covered by the above. See here:
http://www.postgresql.org/docs/8.4/interactive/runtime-config-wal.html

Example:

test=# SELECT current_setting('fsync');
 current_setting
-----------------
 on
(1 row)

test=# SELECT set_config('fsync','off',false);
ERROR:  parameter "fsync" cannot be changed now

--
Adrian Klaver
adrian.klaver@gmail.com
Hi Thom,

Yeah, they can be divided up, but my main issue is that I would like these
functions wrapped up so that the client (who has little to no experience
using PostgreSQL) can just run a SQL function that will execute all of these
updates and prepare many tables and functions for a product. (Essentially,
SELECT install_the_program() to set up the DB and build the tables.)

However, I keep running into problems because the queries are very time
consuming (several days on fast computers with lots of memory) and
individual queries seem to require different configuration parameters...

I have a feeling it is all going to boil down to writing a (python) script
to build the DB from the CLI in Linux. But they really want all the
functionality encapsulated in the PostgreSQL server, including this
building process.

Cheers,
Tom

On 12/07/2010 14:57, Thom Brown wrote:
> On 12 July 2010 14:50, Tom Wilcox<hungrytom@gmail.com> wrote:
>
>> Hi Thom,
>>
>> I am performing update statements that are applied to a single table that is
>> about 96GB in size. These updates are grouped together in a single
>> transaction. This transaction runs until the machine runs out of disk space.
>>
>> What I am trying to achieve is for postgresql to complete this updating
>> transaction without running out of disk space. I assume that this is
>> happening because, for a rollback to be possible, postgres must at least
>> keep track of the previous values/changes whilst the transaction is not
>> complete and committed. I figured this would be the most likely cause for
>> us to run out of disk space, and therefore I would like to reconfigure
>> postgresql not to hold onto previous copies somehow.
>>
>> Any suggestions?
>>
>> Cheers,
>> Tom
>
> Hi Tom,
>
> Is it not possible to do these updates in batches, or does it have to be
> atomic?
>
> (A small note about replying: please use "reply to all", and on this
> mailing list responses should go below.)
>
> Regards
>
> Thom
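The session-settable parameters can at least be handled inside such a
wrapper function. A minimal sketch (the function name and body here are
placeholders; only parameters with session context can be set this way,
and, as noted above, VACUUM cannot run inside a function):

CREATE OR REPLACE FUNCTION install_the_program() RETURNS void AS $$
BEGIN
    -- per-transaction override of a session-settable parameter
    PERFORM set_config('synchronous_commit', 'off', true);
    -- ... create tables, run the long updates, build helper functions ...
END;
$$ LANGUAGE plpgsql;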
On 12/07/2010 17:02, Tom Wilcox wrote:
> Hi Thom,
>
> Yeah, they can be divided up, but my main issue is that I would like
> these functions wrapped up so that the client (who has little to no
> experience using PostgreSQL) can just run a SQL function that will
> execute all of these updates and prepare many tables and functions for a
> product. (Essentially, SELECT install_the_program() to set up the DB
> and build the tables.)
>
> However, I keep running into problems because the queries are very time
> consuming (several days on fast computers with lots of memory) and
> individual queries seem to require different configuration parameters...
>
> I have a feeling it is all going to boil down to writing a (python)
> script to build the DB from the CLI in Linux. But they really want all
> the functionality encapsulated in the PostgreSQL server, including this
> building process.

Well, you can still use the adminpack contrib module to write the config
file from a PostgreSQL connection. But you won't be able to restart
PostgreSQL.

--
Guillaume
http://www.postgresql.fr
http://dalibo.com
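Roughly, that could look like the following (an untested sketch: it assumes
adminpack is installed, that postgresql.conf lives in the data directory,
and that appending a duplicate line this way is acceptable; fsync is set
from the config file, so a reload rather than a restart should pick it up):

-- append a line to postgresql.conf (third argument true = append)
SELECT pg_file_write('postgresql.conf', E'\nfsync = off\n', true);

-- signal the server to re-read its configuration files
SELECT pg_reload_conf();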
On Monday 12 July 2010 15:29:14 Tom Wilcox wrote:
> Hi,
>
> Is it possible to configure postgres from SQL?
>
> I am interested in turning off fsync for a set of queries (that take
> ages to run) and then turn fsync back on again afterwards.

Disabling fsync is almost never a good idea. What you can change (and that
makes quite a bit of sense in some situations) is the "synchronous_commit"
setting.

What kind of queries are those? Many small transactions?

Andres
On Mon, 2010-07-12 at 14:57 +0100, Thom Brown wrote:
> On 12 July 2010 14:50, Tom Wilcox <hungrytom@gmail.com> wrote:
> > Hi Thom,
> >
> > I am performing update statements that are applied to a single table that is
> > about 96GB in size. These updates are grouped together in a single
> > transaction. This transaction runs until the machine runs out of disk space.

As you update this table, every update that is not a HOT update leaves a
dead tuple behind, and the table gets bloated. That is most likely why you
are running out of disk space. Turning off fsync will not help you with
this.

What will help is getting the database to use HOT updates instead, or
batching the updates and letting the table get vacuumed often enough that
the dead tuples can be marked for re-use. HOT updates would be very
beneficial, even if batch updating. They happen when there is no index on
the updated column and there is enough free space in the physical page to
keep the new tuple version on the same page; you can adjust the fillfactor
to try and favour this.

You can check whether you are getting HOT updates by looking at the
hot-update counters in pg_stat_user_tables.

--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
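Concretely, that could be (a sketch; "big_table" is made up, and note a new
fillfactor only affects pages written after the change, so existing data
needs a rewrite such as CLUSTER before it benefits):

ALTER TABLE big_table SET (fillfactor = 70);

-- later, compare total updates against HOT updates:
SELECT relname, n_tup_upd, n_tup_hot_upd
FROM pg_stat_user_tables
WHERE relname = 'big_table';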
On Mon, Jul 12, 2010 at 7:57 AM, Thom Brown <thombrown@gmail.com> wrote:
> On 12 July 2010 14:50, Tom Wilcox <hungrytom@gmail.com> wrote:
>> Hi Thom,
>>
>> I am performing update statements that are applied to a single table that is
>> about 96GB in size. These updates are grouped together in a single
>> transaction. This transaction runs until the machine runs out of disk space.
>>
>> What I am trying to achieve is for postgresql to complete this updating
>> transaction without running out of disk space. I assume that this is
>> happening because, for a rollback to be possible, postgres must at least
>> keep track of the previous values/changes whilst the transaction is not
>> complete and committed. I figured this would be the most likely cause for
>> us to run out of disk space, and therefore I would like to reconfigure
>> postgresql not to hold onto previous copies somehow.
>>
>> Any suggestions?

Is there a way to insert the data with these values already set when you
first load the db?
Andres Freund wrote:
> What you can change (and that makes quite a bit of sense in some situations)
> is the "synchronous_commit" setting.

Right. In almost every case where people think they want to disable fsync,
what they really should be doing instead is turning off synchronous commit,
which is a user-land tunable per session:

SET synchronous_commit = false;

And potentially increasing wal_writer_delay on the server too:

http://www.postgresql.org/docs/current/static/wal-async-commit.html

--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com www.2ndQuadrant.us
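If you want it scoped to just the slow batch, the per-transaction form
works too (a sketch, reusing the hypothetical "big_table" from above; SET
LOCAL reverts automatically at COMMIT or ROLLBACK):

BEGIN;
SET LOCAL synchronous_commit = off;
UPDATE big_table SET col = expensive_expr(raw) WHERE id < 1000000;
COMMIT;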
I could perform the settings manually (set config, restart server, execute
script, come back 2 days later, reset config, restart server, execute more
script, ...), but that sort of defeats the point. My aim is to have the
simplest, most automatic setup possible, preferably completely contained
within PostgreSQL so that all that is needed for a complete backup is a
dump of the DB...

On 12/07/2010 19:26, Scott Marlowe wrote:
> On Mon, Jul 12, 2010 at 7:57 AM, Thom Brown<thombrown@gmail.com> wrote:
>
>> On 12 July 2010 14:50, Tom Wilcox<hungrytom@gmail.com> wrote:
>>
>>> Hi Thom,
>>>
>>> I am performing update statements that are applied to a single table that is
>>> about 96GB in size. These updates are grouped together in a single
>>> transaction. This transaction runs until the machine runs out of disk space.
>>>
>>> What I am trying to achieve is for postgresql to complete this updating
>>> transaction without running out of disk space. I assume that this is
>>> happening because, for a rollback to be possible, postgres must at least
>>> keep track of the previous values/changes whilst the transaction is not
>>> complete and committed. I figured this would be the most likely cause for
>>> us to run out of disk space, and therefore I would like to reconfigure
>>> postgresql not to hold onto previous copies somehow.
>>>
>>> Any suggestions?
>
> Is there a way to insert the data with these values already set when
> you first load the db?
Please don't top post.

On Mon, Jul 12, 2010 at 2:20 PM, Tom Wilcox <hungrytom@gmail.com> wrote:
> On 12/07/2010 19:26, Scott Marlowe wrote:
>>
>> On Mon, Jul 12, 2010 at 7:57 AM, Thom Brown<thombrown@gmail.com> wrote:
>>
>>> On 12 July 2010 14:50, Tom Wilcox<hungrytom@gmail.com> wrote:
>>>
>>>> Hi Thom,
>>>>
>>>> I am performing update statements that are applied to a single table
>>>> that is about 96GB in size.

Much deleted, so my reply to your question is more obvious.

>>>> Any suggestions?
>>
>> Is there a way to insert the data with these values already set when
>> you first load the db?
>
> I could perform the settings manually (set config, restart server, execute
> script, come back 2 days later, reset config, restart server, execute more
> script, ...), but that sort of defeats the point. My aim is to have the
> simplest, most automatic setup possible, preferably completely contained
> within PostgreSQL so that all that is needed for a complete backup is a
> dump of the DB...

Not what I was talking about. Is there a way to NOT perform the update you
mention above, by inserting the data with the values already set properly?
I don't see why that can't be incorporated into your solution, but I'm not
sure exactly how your solution works.

Note that the customer requirement that it all be in SQL is a bit idiotic.
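In other words, compute the derived values at load time instead of updating
afterwards. A sketch with made-up staging and target tables: load the raw
data into a staging table first (e.g. via COPY), then write each row once,
with its final values:

INSERT INTO big_table (id, raw, col)
SELECT id, raw, expensive_expr(raw)
FROM staging_table;

That way there is no later UPDATE pass, and no pile of dead tuples to run
the disk out of space.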