Thread: stopping processes, preventing connections
Hi.
I'm continuing on with the problems I have in our reports/data warehouse system. Basically, the system brings in tables from our various production systems (Sybase, PostgreSQL, MS SQL Server, on different servers) every night. Some tables are brought in whole, and some are brought in based on a date field, with only the relevant interval imported.
For tables that are brought in whole, I first truncate the local table and then copy in the up-to-date data. For the ones that are brought in partially, I first delete the relevant interval and then copy in the same way.
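Schematically, each nightly job looks something like this (a simplified sketch with made-up table names and paths; the real jobs pull the data over from the production systems rather than from local files):

-- Table that is brought in whole. TRUNCATE needs an ACCESS EXCLUSIVE lock,
-- so any long-running SELECT on the table blocks this step.
BEGIN;
TRUNCATE TABLE warehouse_customers;
COPY warehouse_customers FROM '/path/to/customers_dump.tsv';
COMMIT;

-- Table that is brought in by date: only the relevant interval is replaced.
-- (The interval shown here is illustrative.)
BEGIN;
DELETE FROM warehouse_messages WHERE time_stamp >= current_date - interval '1 day';
COPY warehouse_messages FROM '/path/to/messages_dump.tsv';
COMMIT;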
The trouble is that sometimes a stray select is initiated and then abandoned (without cancellation) by the Crystal Reports system. When such a query happens to last into the night, it locks some of the tables that are supposed to be truncated, and the whole process hangs until the query quits or dies - which, as we have seen in the past, can take several hours.
What I want to do is write a script that kills any queries or connections from the Crystal system and then prevents new queries from being run until I finish loading all the tables, at which point I want to allow queries again.
First, the easy part - regarding allowing/disallowing queries. Is it possible to GRANT or REVOKE access to tables based on the originating IP?
Second, and the more complicated one - what do I do about rogue queries that are running when my process starts? Today we had a query that had been running since yesterday. I called pg_cancel_backend() on it several times and waited for almost two hours - to no avail. Eventually I had to ask our sysadmin to shut down PostgreSQL, which took some five minutes but eventually worked. Is there a way to do the same thing to a single process without shutting down the whole server, and without causing any harm to the database or memory corruption? Something I can call from within SQL? I run the nightly script from a Linux user that is not "postgres", so I'd prefer a way that doesn't require using "kill".
Thank you,
Herouth Maoz
On Wed, Mar 3, 2010 at 8:31 AM, Herouth Maoz <herouth@unicell.co.il> wrote:
> First, the easy part - regarding allowing/disallowing queries. Is it possible to GRANT or REVOKE access to tables based on the originating IP?

I'd suggest separating out access to your tables by roles, and then restricting those roles to certain IP ranges in pg_hba.conf.
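For example, something along these lines (just a sketch - the role name, table, database and network values are placeholders):

-- Put all of Crystal's access behind one role:
CREATE ROLE crystal_reader LOGIN PASSWORD 'secret';
GRANT SELECT ON some_report_table TO crystal_reader;    -- repeat per table

-- Before the nightly load, lock Crystal out; re-grant when you're done:
REVOKE SELECT ON some_report_table FROM crystal_reader;
-- ... load the tables ...
GRANT SELECT ON some_report_table TO crystal_reader;

-- Then in pg_hba.conf, tie that role to the Crystal server's subnet,
-- e.g. (placeholder addresses):
--   host    reports_db    crystal_reader    192.168.10.0/24    md5

Note that REVOKE only stops new queries by that role; it doesn't interrupt anything already running.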
> Second, and the more complicated one - what do I do about rogue queries that are running when my process starts? Today we had a query that had been running since yesterday. I called pg_cancel_backend() on it several times and waited for almost two hours - to no avail. Eventually I had to ask our sysadmin to shut down PostgreSQL, which took some five minutes but eventually worked. Is there a way to do the same thing to a single process without shutting down the whole server, and without causing any harm to the database or memory corruption? Something I can call from within SQL? I run the nightly script from a Linux user that is not "postgres", so I'd prefer a way that doesn't require using "kill".

On 8.4, you can use pg_terminate_backend(), which sends a SIGTERM instead of pg_cancel_backend's SIGINT. If you're not on 8.4, you can just do this manually with a "kill -SIGTERM backend_pid". If that doesn't work either, you might have to resort to a "kill -SIGKILL backend_pid". Killing a single backend should be much better for you than restarting Postgres entirely. These operations shouldn't result in database corruption.
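For instance, to hit every backend belonging to the reporting role in one go, something like this sketch should work (the role name is a placeholder, and on 8.3 the pid column in pg_stat_activity is called procpid):

SELECT procpid, usename, current_query,
       pg_cancel_backend(procpid) AS signalled
FROM pg_stat_activity
WHERE usename = 'crystal_reader'           -- placeholder role name
  AND procpid <> pg_backend_pid();         -- don't signal your own session

-- On 8.4 you can swap in pg_terminate_backend(procpid) to send SIGTERM
-- instead of SIGINT.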
You have to be database superuser to use pg_cancel_backend() or pg_terminate_backend(), or have a shell login as the database user to use "kill". No way around that for now.

Though next time you see a query which doesn't respond to pg_cancel_backend(), try gathering information about the query and what the backend is doing; either you're doing something unusual (e.g. an app is restarting the query automatically after getting canceled) or perhaps you've stumbled on a bug in Postgres.
Josh
On Mar 3, 2010, at 18:01, Josh Kupershmidt wrote:

> On 8.4, you can use pg_terminate_backend(), which sends a SIGTERM instead of pg_cancel_backend's SIGINT. If you're not on 8.4, you can just do this manually with a "kill -SIGTERM backend_pid". If that doesn't work either, you might have to resort to a "kill -SIGKILL backend_pid". Killing a single backend should be much better for you than restarting Postgres entirely. These operations shouldn't result in database corruption.

Unfortunately, we only have 8.3 - is SIGTERM safe on 8.3?

> You have to be database superuser to use pg_cancel_backend() or pg_terminate_backend(), or have a shell login as the database user to use "kill". No way around that for now.

I guess I'll have to sudo or use local ssh.

> Though next time you see a query which doesn't respond to pg_cancel_backend(), try gathering information about the query and what the backend is doing; either you're doing something unusual (e.g. an app is restarting the query automatically after getting canceled) or perhaps you've stumbled on a bug in Postgres.

I'd appreciate it if you tell me what to look for. It was running a join on several tables, but nothing too complicated. It may be that the query is not optimized (one of the tables is not indexed properly), but it still should respond to cancel - shouldn't it?

Thank you very much,
Herouth
On Wed, Mar 3, 2010, Herouth Maoz <herouth@unicell.co.il> wrote:
> Unfortunately, we only have 8.3 - is SIGTERM safe on 8.3?
>
> I guess I'll have to sudo or use local ssh.
Yeah, I think any of the kill modes on the backends should be "safe" in terms of your data. If you're interested in killing without having to SSH in, you could play with something like:
http://wiki.postgresql.org/wiki/Backend_killer_function
and adjust it (permissions, etc.) to your needs.
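The rough idea is a SECURITY DEFINER function created by a superuser, with EXECUTE granted only to the role that runs your nightly load - for example (a sketch with placeholder function and role names; the wiki version is more careful about what it will and won't signal):

CREATE OR REPLACE FUNCTION cancel_crystal_backends() RETURNS integer AS $$
DECLARE
    n integer := 0;
    r record;
BEGIN
    FOR r IN
        SELECT procpid
        FROM pg_stat_activity
        WHERE usename = 'crystal_reader'        -- placeholder role name
          AND procpid <> pg_backend_pid()
    LOOP
        PERFORM pg_cancel_backend(r.procpid);   -- pg_terminate_backend() on 8.4+
        n := n + 1;
    END LOOP;
    RETURN n;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;

-- Create it as a superuser, then lock it down:
REVOKE ALL ON FUNCTION cancel_crystal_backends() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION cancel_crystal_backends() TO nightly_loader;  -- placeholder role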
> I'd appreciate it if you tell me what to look for. It was running a join on several tables, but nothing too complicated. It may be that the query is not optimized (one of the tables is not indexed properly), but it still should respond to cancel - shouldn't it?
I came across an old thread discussing a problem which sounds similar to yours (process not dying with a pg_cancel_backend()):
http://archives.postgresql.org/pgsql-general/2007-10/msg01696.php
there's some good info throughout that thread, including the recommendation from Tom Lane to try to reproduce, assuming your query isn't stuck inside a plpython or similar function.
If you can reproduce your hung query which doesn't respond to a pg_cancel_backend(), try following instructions here (assuming your server is a Linux/BSD machine) to gather more information about what's going on inside the backend:
http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
strace, top, and/or vmstat information might also be useful.
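It's also worth checking whether the backend is actually waiting on a lock or is genuinely stuck; a quick sketch against the system views (12345 stands in for the stuck backend's pid, and on 8.3 the pid column in pg_stat_activity is called procpid):

-- What is the backend doing, and is it flagged as waiting on a lock?
SELECT procpid, usename, waiting, query_start, current_query
FROM pg_stat_activity
WHERE procpid = 12345;

-- Which locks does it hold or wait for?
SELECT locktype, relation::regclass AS relation, mode, granted
FROM pg_locks
WHERE pid = 12345;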
Hope this helps,
Josh
On Mar 3, 2010, at 18:01, Josh Kupershmidt wrote:

> Though next time you see a query which doesn't respond to pg_cancel_backend(), try gathering information about the query and what the backend is doing; either you're doing something unusual (e.g. an app is restarting the query automatically after getting canceled) or perhaps you've stumbled on a bug in Postgres.
Hi. A long time has passed since you made that suggestion, but today we stumbled again on a query that wouldn't be canceled. Not only does it not respond to pg_cancel_backend(), it also doesn't respond to kill -SIGTERM.
The query is:
select date_trunc('day',rb.time_stamp),count(*),rb.category,channels.channel_id,channels.name as channel,platforms.platform_id,
platforms.name,rb.operator,item,delivered,msisdn
from public.rb__view as rb,channels,platforms
where
rb.channel_id=channels.channel_id
and rb.platform_id=platforms.platform_id
and rb.time_stamp>='2010-03-14'::date and rb.time_stamp<'2010-03-14'::date + interval '1 day'
and platforms.platform_id=262
and channels.channel_id=1
group by date_trunc('day',rb.time_stamp),rb.category,channels.channel_id,channel,operator,item,delivered,msisdn,platforms.platform_id,platforms.name
This is nothing too fancy - just an aggregate with group by. And the application on the other side is Crystal Reports, connecting using ODBC. I don't believe the application does anything like restart after cancel, because most of our queries can easily be cancelled and I don't think Crystal has different behaviors for different queries.
rb__view is a union all between two tables (rb and rb__archive) which have the same schema - one holds data from the past 7 weeks and the other holds older data. The channels and platforms tables are basically lookup tables. The fields item, delivered and msisdn all belong to rb__view.
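Roughly, the view is defined like this (a simplified sketch - the real tables have more columns than the ones used in this query):

CREATE VIEW public.rb__view AS
    SELECT time_stamp, category, channel_id, platform_id,
           operator, item, delivered, msisdn
    FROM public.rb              -- data from the past 7 weeks
    UNION ALL
    SELECT time_stamp, category, channel_id, platform_id,
           operator, item, delivered, msisdn
    FROM public.rb__archive;    -- older data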
There is nothing in the PostgreSQL log.
If it helps any, this is the EXPLAIN output for the above query. Note that at this time, the query has been running for over an hour and a half.
HashAggregate  (cost=221312.77..221318.08 rows=354 width=94)
  ->  Nested Loop  (cost=8078.83..221215.50 rows=3537 width=94)
        ->  Seq Scan on channels  (cost=0.00..3.81 rows=1 width=16)
              Filter: (channel_id = 1::numeric)
        ->  Nested Loop  (cost=8078.83..221167.48 rows=3537 width=85)
              ->  Index Scan using platforms_pkey on platforms  (cost=0.00..6.27 rows=1 width=19)
                    Index Cond: (platform_id = 262::numeric)
              ->  Append  (cost=8078.83..221125.84 rows=3537 width=73)
                    ->  Bitmap Heap Scan on rb  (cost=8078.83..221115.42 rows=3536 width=72)
                          Recheck Cond: ((public.rb.time_stamp >= '2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                          Filter: ((public.rb.channel_id = 1::numeric) AND (public.rb.platform_id = 262::numeric))
                          ->  Bitmap Index Scan on rb_timestamp_ind  (cost=0.00..8077.94 rows=104502 width=0)
                                Index Cond: ((public.rb.time_stamp >= '2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                    ->  Index Scan using rba_timestamp_ind on rb__archive  (cost=0.00..10.42 rows=1 width=73)
                          Index Cond: ((rb__archive.time_stamp >= '2010-03-14'::date) AND (rb__archive.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                          Filter: ((rb__archive.channel_id = 1::numeric) AND (rb__archive.platform_id = 262::numeric))
I don't know what other information I can glean from this. Any thoughts?
Thank you,
Herouth
On 17/03/2010 6:32 PM, Herouth Maoz wrote:
> Hi. A long time has passed since you made that suggestion, but today we stumbled again on a query that wouldn't be canceled. Not only does it not respond to pg_cancel_backend(), it also doesn't respond to kill -SIGTERM.

Interesting. If you attach gdb to the backend and run "backtrace", what's the output?

If you strace the backend, what do you see?

--
Craig Ringer
On Mar 17, 2010, at 13:34, Craig Ringer wrote:

> Interesting. If you attach gdb to the backend and run "backtrace", what's the output?

(gdb) backtrace
#0 0x8dfcb410 in ?? ()
#1 0xbff10a28 in ?? ()
#2 0x083b1bf4 in ?? ()
#3 0xbff10a00 in ?? ()
#4 0x8db98361 in send () from /lib/tls/i686/cmov/libc.so.6
#5 0x08195d54 in secure_write ()
#6 0x0819dc7e in pq_setkeepalivesidle ()
#7 0x0819ddd5 in pq_flush ()
#8 0x0819de3d in pq_putmessage ()
#9 0x0819fa63 in pq_endmessage ()
#10 0x08086dcb in printtup_create_DR ()
#11 0x08178dc4 in ExecutorRun ()
#12 0x08222326 in PostgresMain ()
#13 0x082232c0 in PortalRun ()
#14 0x0821e27d in pg_parse_query ()
#15 0x08220056 in PostgresMain ()
#16 0x081ef77f in ClosePostmasterPorts ()
#17 0x081f0731 in PostmasterMain ()
#18 0x081a0484 in main ()

> If you strace the backend, what do you see?

All I get is this:

send(9, "00:00\0\0\0\0011\377\377\377\377\0\0\0\0011\0\0\0\0041"..., 1541, 0

I waited about 20 minutes after receiving that, but nothing further was output.

Thank you,
Herouth
On 17/03/2010 8:43 PM, Herouth Maoz wrote:
> #4 0x8db98361 in send () from /lib/tls/i686/cmov/libc.so.6
> #5 0x08195d54 in secure_write ()

OK, so it seems to be stuck sending data down a socket. The fact that strace isn't reporting any new system calls suggests the backend is just blocked on that send() call and isn't doing any work.

Is there any chance the client has disconnected/disappeared?

--
Craig Ringer
On Mar 17, 2010, at 14:56, Craig Ringer wrote:

> OK, so it seems to be stuck sending data down a socket. The fact that strace isn't reporting any new system calls suggests the backend is just blocked on that send() call and isn't doing any work.
>
> Is there any chance the client has disconnected/disappeared?

Yes, certainly. In fact, I mentioned in the past that the product we use for our reports (an application built on top of Crystal Reports), when told to cancel a report or when a report times out, doesn't tell Crystal to cancel its queries properly - it simply kills Crystal's processes on the Windows machine side, which leaves us with orphan backends. It's stupid, but it's not under our control. But most of the time the backends respond to cancel requests.

Aren't socket writes supposed to have timeouts of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons - bugs, power failures, whatever - and this is not something that is supposed to cause a backend to hang, I would assume.

Is there anything I can do about it?

Herouth
Craig Ringer <craig@postnewspapers.com.au> writes:
> OK, so it seems to be stuck sending data down a socket.

Since this type of problem is fresh in mind: I wonder if the connection is SSL-encrypted and the session just crossed the 512MB-total-data-sent threshold. If so, it would have tried to renegotiate session keys, and some SSL vendors have recently broken that functionality rather badly. Silent hangups seem to be par for the course.

If that's it, your choices are to find an SSL library that has an actual fix for CVE-2009-3555 rather than this brain damage, or to update to this week's PG releases so you can set ssl_renegotiation_limit to zero.

If you're not using SSL, just ignore me ...

regards, tom lane
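For reference, the check and the workaround look roughly like this (a sketch: ssl_renegotiation_limit only exists from the March 2010 minor releases onward, so 8.3.1 itself does not know this setting):

-- Is SSL enabled on the server at all?
SHOW ssl;

-- After updating to a minor release that has the GUC, renegotiation can be
-- disabled server-wide in postgresql.conf:
--   ssl_renegotiation_limit = 0
-- or checked and set per session:
SHOW ssl_renegotiation_limit;
SET ssl_renegotiation_limit = 0;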
Herouth Maoz wrote:
> Aren't socket writes supposed to have timeouts of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons - bugs, power failures, whatever - and this is not something that is supposed to cause a backend to hang, I would assume.

Note that you're not in the PostgreSQL code at the point where this is stuck - you're deep in the libc socket code. Making sure that sockets always behave well at the OS level is not always possible, due to TCP/IP's emphasis on robust delivery. See section 2.8, "Why does it take so long to detect that the peer died?", at http://www.faqs.org/faqs/unix-faq/socket/ for some background here, and note that the point you're stuck in is inside the keepalive handling, where the database is trying to do the right thing.

As a general commentary on this area: most of the unkillable backends I've seen - usually noticed when the server won't shut down - have resulted from bad socket behavior. It's really a tricky area to get right, and it's hard to guarantee that the database backends will be robust in the face of every possible weird OS behavior.

However, if you can repeatably get the server into this bad state at will, it may be worth spending some more time digging into it, in the hope that there is something valuable to learn about your situation that can improve the keepalive handling on the server side. Did you mention your PostgreSQL server version and platform? I didn't see the exact code path you're stuck in during a quick look at the code involved (using a snapshot of recent development), which makes me wonder if this isn't already a resolved problem in a newer version.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
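On the keepalive side, the server-side TCP keepalive knobs are ordinary GUCs, so tightening them looks roughly like this (values are illustrative; 0 means "use the operating system's default", and they only apply to TCP connections):

SHOW tcp_keepalives_idle;
SHOW tcp_keepalives_interval;
SHOW tcp_keepalives_count;

-- Illustrative values, settable in postgresql.conf or per session:
SET tcp_keepalives_idle = 60;      -- seconds of idleness before the first probe
SET tcp_keepalives_interval = 10;  -- seconds between probes
SET tcp_keepalives_count = 5;      -- lost probes before the connection is considered dead

Whether these help a backend already blocked in send() depends on the operating system; they mainly help the server notice dead clients on otherwise idle connections.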
quoth Greg Smith:
> Did you mention your PostgreSQL server version and platform? I didn't see the exact code path you're stuck in during a quick look at the code involved (using a snapshot of recent development), which makes me wonder if this isn't already a resolved problem in a newer version.

The server version is 8.3.1. Migration to a higher version might be difficult as far as policies go, if there isn't a supported debian package for it, but if you can point out a version where this has been fixed I might be able to persuade my boss and sysadmin.

Thank you for referring me to that entry in the FAQ.

By the way, the situation repeated itself today as well.

Thanks,
Herouth
On Sat, Mar 20, 2010 at 11:44 AM, Herouth Maoz <herouth@unicell.co.il> wrote:
> The server version is 8.3.1. Migration to a higher version might be difficult as far as policies go, if there isn't a supported debian package for it, but if you can point out a version where this has been fixed I might be able to persuade my boss and sysadmin.

Most of the time it is more dangerous to NOT update PostgreSQL to the latest minor point version than it is to do the update. The occasions when a dangerous minor point release comes out are rare, and the next minor point version that fixes it shows up the next day while the broken one is pulled. I think that's happened once or twice during the time I've been using PostgreSQL. So, if a release is 48 hours old and no alarm bells have gone off that it's being pulled and replaced, a pg update is the right thing to do. Back up beforehand, etc.

The danger of a change making your application stop working is very low, while the danger of leaving some unpatched bit of nastiness in the backend is a much greater problem - i.e. data loss / corruption, things like that. And something as mature as 8.3 is now shouldn't be running in production missing two years of patches.

Start with the release notes for 8.3.2, move forward, and see if anything there looks like a problem for your app. Behaviour-changing changes rarely get into production releases; they get saved for the next major version. If they do, they are well noted in the release notes.
Quoting Scott Marlowe:
> Most of the time it is more dangerous to NOT update PostgreSQL to the latest minor point version than it is to do the update.

The problem is not so much danger in upgrading, but the fact that doing so without using the system's usual security/bugfix update path means non-standard work for the sysadmin: he has to upgrade every package on the system using a different upgrade method, be notified about it from a different source, and check each one under different conditions, which makes his work impossible. So the policy so far has been "Use the packages available through debian". So I'll need to check if there is an upgrade available through that path - and the question is whether it's worthwhile (i.e. whether the bug in question has indeed been fixed).

Herouth
On Sat, Mar 20, 2010 at 3:57 PM, Herouth Maoz <herouth@unicell.co.il> wrote:
> The problem is not so much danger in upgrading, but the fact that doing so without using the system's usual security/bugfix update path means non-standard work for the sysadmin: he has to upgrade every package on the system using a different upgrade method, be notified about it from a different source, and check each one under different conditions, which makes his work impossible. So the policy so far has been "Use the packages available through debian". So I'll need to check if there is an upgrade available through that path - and the question is whether it's worthwhile (i.e. whether the bug in question has indeed been fixed).

I'm certain debian keeps the pgsql packages up to date within a few days, or at most weeks, of their release.
On 21/03/2010 7:12 AM, Scott Marlowe wrote:
> I'm certain debian keeps the pgsql packages up to date within a few days, or at most weeks, of their release.

In sid (unstable), sure. But the stable releases don't usually see major version upgrades (like 8.3 to 8.4) unless they're done via unofficial channels like backports.org.

--
Craig Ringer
On Sun, Mar 21, 2010 at 5:33 AM, Craig Ringer <craig@postnewspapers.com.au> wrote:
> In sid (unstable), sure. But the stable releases don't usually see major version upgrades (like 8.3 to 8.4) unless they're done via unofficial channels like backports.org.

It was a few posts back, but our discussion point was minor point upgrades and the fact that the OP was running 8.3.1 and wasn't sure there were updates to 8.3.9 (or latest) out there for debian. I'm quite sure debian has 8.3.9 out by now.
Scott Marlowe <scott.marlowe@gmail.com> writes:
> It was a few posts back, but our discussion point was minor point upgrades and the fact that the OP was running 8.3.1 and wasn't sure there were updates to 8.3.9 (or latest) out there for debian. I'm quite sure debian has 8.3.9 out by now.

Yes:

  http://packages.debian.org/lenny/postgresql-8.3
  http://packages.debian.org/etch-backports/postgresql-8.3

You won't find it in testing/unstable though, because the next stable will contain 8.4 only, as far as I understand.

Regards,
--
dim