Thread: 8.2 Autovacuum BUG ?
Hi, we turned on autovacuum on 8.2 and we have a database which is read-only; it is basically a USPS database used only for address lookups (only SELECTs, no updates/deletes/inserts). This database has about 10 GB of data, and yesterday autovacuum started on this database and all of a sudden I saw a lot of archive logs generated during this time; I guess it generated close to 3-4 GB of data during this period. It was doing only vacuum, not vacuum analyze. My question is: why does it have to generate so many archive logs on static tables? I am thinking these archive logs are mostly empty. The reason I say that is that I noticed, when I restore the db using PITR backups for my reporting db, these same logs are recovered in seconds compared to the logs generated while vacuums are not running. Is this a BUG? Or am I missing something here?

Vacuum Settings
---------------------
vacuum_cost_delay = 30
vacuum_cost_limit = 150
checkpoint_segments = 64
checkpoint_timeout = 5min
checkpoint_warning = 30s
autovacuum = on
autovacuum_naptime = 120min
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.001
autovacuum_analyze_scale_factor = 0.001
autovacuum_freeze_max_age = 200000000
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1

Thanks!
Pallav.
Pallav Kalva <pkalva@livedatagroup.com> writes: > We turned on autovacuums on 8.2 and we have a database which is read > only , it is basically a USPS database used only for address lookups > (only SELECTS, no updates/deletes/inserts). > This database has about 10gig data and yesterday autovacuum started > on this database and all of a sudden I see lot of archive logs generated > during this time, I guess it might have generated close to 3-4gig data > during this period. Probably represents freezing of old tuples, which is a WAL-logged operation as of 8.2. Is it likely that the data is 200M transactions old? regards, tom lane
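One way to check whether the data has reached that age (a quick sketch; age() here counts transactions, not time, and these catalog columns exist on 8.2 and later):

    -- Per-database age of the oldest unfrozen XID:
    SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC;

    -- Per-table age within the current database:
    SELECT relname, age(relfrozenxid)
    FROM pg_class
    WHERE relkind = 'r'
    ORDER BY 2 DESC
    LIMIT 10;

If age(datfrozenxid) is approaching autovacuum_freeze_max_age (200000000 in the settings above), an anti-wraparound freeze pass is due.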
Tom Lane wrote:
>> Probably represents freezing of old tuples, which is a WAL-logged
>> operation as of 8.2. Is it likely that the data is 200M transactions
>> old?

If nothing changed on these tables, how can it freeze old tuples? Does it mean that once it reaches 200M transactions it will do the same thing all over again? If I am doing just SELECTs on these tables, how can there be any transactions? Or are SELECTs considered transactions too?
Pallav Kalva wrote: > Tom Lane wrote: >> Probably represents freezing of old tuples, which is a WAL-logged >> operation as of 8.2. Is it likely that the data is 200M transactions >> old? >> > If nothing changed on these tables how can it freeze old tuples ? > Does it mean that once it reaches 200M transactions it will do the same > thing all over again ? No -- once tuples are frozen, they don't need freezing again (unless they are modified by UPDATE or DELETE). > If I am doing just SELECTS on these tables ? how can there be any > transactions ? or SELECTS considered transactions too ? Selects are transactions too. They just don't modify data. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> No -- once tuples are frozen, they don't need freezing again (unless
> they are modified by UPDATE or DELETE).
>
> Selects are transactions too. They just don't modify data.

Can you please correct me if I am wrong; I want to understand how this works. Based on what you said, it will run autovacuum again when it passes 200M transactions, as SELECTs are transactions too and are running against these tables. But the next time it runs autovacuum, it shouldn't freeze the tuples again, as they are already frozen, and so it won't generate a lot of archive logs? Or is this because it ran autovacuum for the first time on this db? Does it do this process just the first time?
On Fri, 2007-08-31 at 12:25 -0400, Pallav Kalva wrote:
> But the next time it runs autovacuum, it shouldn't freeze the tuples
> again, as they are already frozen, and so it won't generate a lot of
> archive logs? Or is this because it ran autovacuum for the first time
> on this db? Does it do this process just the first time?

That is correct. The tuples are now frozen, which means that they will not need to be frozen ever again unless you insert/update any records.
pkalva@livedatagroup.com (Pallav Kalva) writes:
> Tom Lane wrote:
>> Probably represents freezing of old tuples, which is a WAL-logged
>> operation as of 8.2. Is it likely that the data is 200M transactions
>> old?
>
> If nothing changed on these tables how can it freeze old tuples?

It does so very easily, by changing the XID from whatever it was to 2 (which indicates that a tuple has been "frozen"). I don't imagine you were wondering how it is done; more likely you were wondering why. "Why" is to prevent transaction ID wraparound failures.

> Does it mean that once it reaches 200M transactions it will do the
> same thing all over again?

It won't freeze those same tuples again, as they're obviously already frozen, but a vacuum next week may be expected to freeze tuples that are roughly a week newer.

> If I am doing just SELECTs on these tables, how can there be any
> transactions? Or are SELECTs considered transactions too?

Every query submitted comes in the context of a transaction. If there wasn't a BEGIN submitted somewhere, then yes, every SELECT could potentially invoke a transaction, irrespective of whether it writes data or not. If you submit a million SELECT statements, yes, that could, indeed, indicate a million transactions.
-- 
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/nonrdbms.html
How much deeper would the ocean be if sponges didn't live there?
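This is easy to see on a scratch table. A minimal demonstration (the table name is made up; on releases up to 9.3 a frozen tuple's xmin displays as 2, while 9.4 and later keep the original xmin and set a hint bit instead):

    -- Create a throwaway one-row table and look at its tuple's xmin:
    CREATE TABLE freeze_demo AS SELECT 1 AS id;
    SELECT xmin, id FROM freeze_demo;   -- shows the inserting transaction's XID

    -- Force freezing, then look again:
    VACUUM FREEZE freeze_demo;
    SELECT xmin, id FROM freeze_demo;   -- xmin now reads 2 (FrozenTransactionId)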
On 8/31/07, Alvaro Herrera <alvherre@commandprompt.com> wrote:
> No -- once tuples are frozen, they don't need freezing again (unless
> they are modified by UPDATE or DELETE).
Off-topic question: the documentation says that XID numbers are 32-bit. Could the XID be 64-bit when running on a 64-bit platform? That would effectively prevent wrap-around issues.
Regards
MP
Mark Lewis wrote:
> That is correct. The tuples are now frozen, which means that they will
> not need to be frozen ever again unless you insert/update any records.

My main concern is filling up my disk with archive logs. From all the replies, what I take away is that since the tuples are already frozen, the next time autovacuum runs it won't generate any archive logs. Is my assumption right?

Thanks, everybody, for all your replies. It was very helpful.
Mikko Partio wrote:
> Off-topic question: the documentation says that XID numbers are 32-bit.
> Could the XID be 64-bit when running on a 64-bit platform? That would
> effectively prevent wrap-around issues.

No, because they would take too much space in tuple headers.

-- 
Alvaro Herrera http://www.amazon.com/gp/registry/5ZYLFMCVHXC
In the beginning there was UNIX, and UNIX spoke and said: "Hello world\n".
It did not say "Hello New Jersey\n", nor "Hello USA\n".
Pallav Kalva wrote:
> My main concern is filling up my disk with archive logs. From all the
> replies, what I take away is that since the tuples are already frozen,
> the next time autovacuum runs it won't generate any archive logs.
>
> Is my assumption right?

Well, it won't generate any logs for the tuples that were just frozen, but it will generate logs for tuples that weren't frozen. How many of those there are depends on how many tuples you inserted after the batch that was just frozen. If you want to freeze the whole table completely, you can use VACUUM FREEZE.

-- 
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
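For example (a sketch; the table name is hypothetical), freezing one table up front and confirming its freeze horizon advanced:

    VACUUM FREEZE usps_addresses;   -- one-time pass over the whole table

    -- age() should now be close to zero for this table:
    SELECT relname, age(relfrozenxid)
    FROM pg_class
    WHERE relname = 'usps_addresses';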
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Mikko Partio wrote:
>> Off-topic question: the documentation says that XID numbers are 32-bit.
>> Could the XID be 64-bit when running on a 64-bit platform? That would
>> effectively prevent wrap-around issues.

> No, because they would take too much space in tuple headers.

It's worth noting that the patch Florian is working on, to suppress assignment of XIDs for transactions that never write anything, will make for a large reduction in the rate of XID consumption in many real-world applications. That will reduce the need for tuple freezing and probably lessen the attraction of wider XIDs even more.

If he gets it done soon (before the HOT dust settles) I will be strongly tempted to try to sneak it into 8.3 ...

regards, tom lane
On Aug 31, 2007, at 2:08 PM, Tom Lane wrote:
> It's worth noting that the patch Florian is working on, to suppress
> assignment of XIDs for transactions that never write anything, will
> make for a large reduction in the rate of XID consumption in many
> real-world applications. That will reduce the need for tuple freezing
> and probably lessen the attraction of wider XIDs even more.
>
> If he gets it done soon (before the HOT dust settles) I will be
> strongly tempted to try to sneak it into 8.3 ...

Off topic and just out of curiosity, is this the work that will allow standby servers to have selects run on them without stopping WAL replay?

Erik Jones
Software Developer | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Erik Jones <erik@myemma.com> writes: > On Aug 31, 2007, at 2:08 PM, Tom Lane wrote: >> It's worth noting that the patch Florian is working on, to suppress >> assignment of XIDs for transactions that never write anything, will make >> for a large reduction in the rate of XID consumption in many real-world >> applications. > Off topic and just out of curiousity, is this the work that will > allow standby servers to have selects run on them without stopping > WAL replay? It's a small component of that. regards, tom lane
pkalva@livedatagroup.com (Pallav Kalva) writes:
> My main concern is filling up my disk with archive logs. From all the
> replies, what I take away is that since the tuples are already frozen,
> the next time autovacuum runs it won't generate any archive logs.
>
> Is my assumption right?

No, your assumption is wrong. Later vacuums will not generate archive files for the tuples that were *previously* frozen, but if you have additional tuples that have gotten old enough to reach the "freeze point," THOSE tuples will get frozen, and so you'll continue to see archive logs generated. And this is Certainly Not A Bug. If the system did not do this, those unfrozen tuples would eventually disappear when your current transaction XID rolls over. The freezing is *necessary.*

-- 
let name="cbbrowne" and tld="cbbrowne.com" in name ^ "@" ^ tld;;
http://linuxdatabases.info/info/unix.html
Rules of the Evil Overlord #86. "I will make sure that my doomsday device is up to code and properly grounded." <http://www.eviloverlord.com/>
On Fri, 31 Aug 2007, Tom Lane wrote: > If he gets it done soon (before the HOT dust settles) I will be strongly > tempted to try to sneak it into 8.3 ... Could you or Florian suggest how other people might assist in meeting that goal? It seems like something worthwhile but it's not clear to me how to add manpower to it usefully. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
Greg Smith <gsmith@gregsmith.com> writes: > On Fri, 31 Aug 2007, Tom Lane wrote: >> If he gets it done soon (before the HOT dust settles) I will be strongly >> tempted to try to sneak it into 8.3 ... > Could you or Florian suggest how other people might assist in meeting that > goal? It seems like something worthwhile but it's not clear to me how to > add manpower to it usefully. Review the patch? He posted v2 on -hackers just a little bit ago. I suggested some cosmetic changes but it's certainly ready to read now. regards, tom lane
Hi Pallav, I'm currently on PostgreSQL 9.1. Everything was fine till Dec 27th 2017, but to my wonder archive logs started to increase from December 28th 2017 till date. The configuration parameters are the defaults, and everything in the past was fine with the default configuration parameters. I'm facing a serious problem with this huge archive generation of 48GB per day, that is 2GB per hour. The DML statements are almost the same. In detail, archive logs are generated at the 9th and 39th minute of every hour, preceded by the log message 'checkpoints are occurring too frequently (2 seconds apart). Consider increasing the configuration parameter "checkpoint_segments"'. So how do I reduce this abnormal archive log generation? Thanks in Advance.

Regards,
Pavan
Hello. How big is the database? Please show the result of this query:

select * from pg_stat_activity where current_query like 'autovacuum%';

I think an anti-wraparound autovacuum is running here. In this case all is normal; the anti-wraparound vacuum will produce a lot of WAL, and this is necessary for the database to continue working.

PS: please note postgresql 9.1 is EOL.

regards, Sergei
Hello Sergei, The size of the database is 24GB. The output of the above query is:

 datid  | datname  | procpid | usesysid | usename  | application_name | client_addr | client_hostname | client_port |          backend_start           |            xact_start            |           query_start            | waiting |           current_query
--------+----------+---------+----------+----------+------------------+-------------+-----------------+-------------+----------------------------------+----------------------------------+----------------------------------+---------+------------------------------------
 400091 | prod_erp |   19373 |       10 | postgres |                  |             |                 |             | 2018-01-22 15:40:38.163865+05:30 | 2018-01-22 15:40:38.655754+05:30 | 2018-01-22 15:40:38.655754+05:30 | f       | autovacuum: ANALYZE public.table1
 400091 | prod_erp |   19373 |       10 | postgres |                  |             |                 |             | 2018-01-22 15:40:38.163865+05:30 | 2018-01-22 15:40:38.655754+05:30 | 2018-01-22 15:40:38.655754+05:30 | f       | autovacuum: ANALYZE public.table1
 400091 | prod_erp |   19373 |       10 | postgres |                  |             |                 |             | 2018-01-22 15:40:38.163865+05:30 | 2018-01-22 15:40:38.218954+05:30 | 2018-01-22 15:40:38.218954+05:30 | f       | autovacuum: ANALYZE public.table2
 400091 | prod_erp |   18440 |       10 | postgres |                  |             |                 |             | 2018-01-22 15:39:38.128879+05:30 | 2018-01-22 15:39:38.166507+05:30 | 2018-01-22 15:39:38.166507+05:30 | f       | autovacuum: VACUUM public.table3

Could you please explain what anti-wraparound autovacuum is? Is it related to preventing transaction ID wraparound failures? If so, will running vacuum full against the database suppress this abnormal generation of archive logs? Please give your kind advice.

Regards, Pavan
On 22.01.2018 11:21, pavan95 wrote:
> Could you please explain what anti-wraparound autovacuum is? Is it
> related to preventing transaction ID wraparound failures?

Yes.

> If so, will running vacuum full against the database suppress this
> abnormal generation of archive logs?

Such a vacuum freeze isn't abnormal. Do you really have a problem with it?

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
Hello, Is there any way to check how many transactions have happened to date, from the point the database was created and started accepting transactions? The reason for this doubt is to find out whether my database has crossed 200 million transactions or not. Strangely, I had an interesting observation: when I tried a vacuum full, it generated 1GB of archive logs per sec, and yes it's true. So I had a doubt whether this is related to vacuum. Please help me cope with this.

Regards, Pavan
Hi Andreas, Yes, I'm facing a problem because of this huge WAL (archive log) generation, as it is seriously consuming a lot of disk space, almost close to 50GB per day, even though the DMLs don't have that much impact on WAL generation. Previously the archive log size was nearly 2 to 3 GB a day. Now, with the same set of DMLs, how it has grown to 50GB is my burning doubt. I just want to know how to stabilize this issue, as checking and deleting the archive logs on an hourly basis is not a good idea. Finally, I'm looking at how to reduce this back to normal. Thanks in Advance.

Regards, Pavan
On 23.01.2018 12:51, pavan95 wrote:
> Yes, I'm facing a problem because of this huge WAL (archive log)
> generation, as it is seriously consuming a lot of disk space, almost
> close to 50GB per day, even though the DMLs don't have that much
> impact on WAL generation.
>
> Previously the archive log size was nearly 2 to 3 GB a day. Now, with
> the same set of DMLs, how it has grown to 50GB is my burning doubt.

Will so many WALs continue to be produced?

> I just want to know how to stabilize this issue, as checking and
> deleting the archive logs on an hourly basis is not a good idea.

Don't delete WALs!

> Finally, I'm looking at how to reduce this back to normal.

Have you set archive_mode to on and defined an archive_command? WAL files will be reused after 2 checkpoints. Is there something in the logs?

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
Yes, so many WALs are continuing to be produced.
We delete the WALs after a backup of the database.
Yes, archiving mode is on. And the warning message in the log file is
"checkpoints are occurring too frequently (1 second apart). Consider increasing the configuration parameter "checkpoint_segments"."
My doubt is: these are the same parameter values as before. So what is the point in altering those values? Correct me if I am wrong.
Regards,
Pavan
" checkpoints are frequently occurring (1second apart). Consider increasing checkpoint_segements parameter".
The custom on these lists is to bottom or inline post.
This tends to appear when someone decides to write a load script of the form:
INSERT INTO tbl (cols) VALUES (...);
INSERT INTO tbl (cols) VALUES (...);
[repeat many, many, times]
(note the lack of BEGIN/END; a single transaction helps mitigate it somewhat, as in the sketch below)
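A sketch of the same load wrapped in one transaction, so the whole batch consumes a single XID instead of one per row (a COPY of the data would reduce the churn further still):

    BEGIN;
    INSERT INTO tbl (cols) VALUES (...);
    INSERT INTO tbl (cols) VALUES (...);
    -- [repeat many, many, times]
    COMMIT;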
David J.
Please don't top-post.

On 23.01.2018 15:39, Pavan Teja wrote:
> Yes, so many WALs are continuing to be produced.

You have to identify why. Please check pg_stat_activity for
* autovacuum
* large inserts
* large updates
* large deletes

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
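A query along these lines would show what is running (a sketch for 9.1, where the columns are still procpid and current_query; they became pid and query in 9.2):

    -- List non-idle backends, oldest transaction first:
    SELECT procpid,
           now() - xact_start AS xact_runtime,
           current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>'
    ORDER BY xact_start;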
Hi David,
If it's yes, what needs to be done in order to stabilize this issue?
Thanks in advance.
Regards,
Pavan
On 23.01.2018 16:20, Pavan Teja wrote:
> Hi David,
>
> If it's yes, what needs to be done in order to stabilize this issue?

Don't top-post ;-)

You can't prevent the generation of WALs (apart from using unlogged tables, but I'm sure that will not be your solution).

Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company. www.2ndQuadrant.com
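For completeness, unlogged tables (available since 9.1) skip WAL entirely, at the price of being truncated after a crash and not surviving onto replicas. A sketch with a made-up staging table:

    CREATE UNLOGGED TABLE staging_import (
        id   integer,
        data text
    );
    -- Loads into this table generate no WAL, so nothing is archived for them.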
On Tue, Jan 23, 2018 at 11:39 AM, Pavan Teja <pavan.postgresdba@gmail.com> wrote:
> Yes, so many WALs are continuing to be produced. We delete the WALs
> after a backup of the database. Yes, archiving mode is on. And the
> warning message in the log file is "checkpoints are occurring too
> frequently (1 second apart). Consider increasing the configuration
> parameter "checkpoint_segments"." My doubt is: these are the same
> parameter values as before. So what is the point in altering those
> values? Correct me if I am wrong.
You can use pg_xlogdump to inspect those logs and see which relations/transactions are generating so much WAL.
Then you can hunt within your apps which code is responsible for that traffic, or whether it in fact is autovacuum.
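For instance, a rough way to see which resource managers dominate a segment (the file name here is hypothetical; pg_xlogdump ships with 9.3 and later, as noted below, and 9.5+ also has a --stats option that does this summary for you):

    # Count WAL records per resource manager (Heap, Btree, XLOG, ...):
    pg_xlogdump pg_xlog/000000010000000000000042 \
        | awk '{print $2}' | sort | uniq -c | sort -rn | head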
On Jan 23, 2018 9:37 PM, "Claudio Freire" <klaussfreire@gmail.com> wrote:
> You can use pg_xlogdump to inspect those logs and see which
> relations/transactions are generating so much WAL.
Hi Claudio,
Yes, I investigated in that area and found DMLs and also autovacuum statements for some relations. And the DMLs are the same during this huge WAL traffic as during normal WAL traffic.
Anyway, thanks for your timely response 😊
Regards,
Pavan
On Tue, Jan 23, 2018 at 1:16 PM, Pavan Teja <pavan.postgresdba@gmail.com> wrote:
While looking at current query activity makes sense, if you can't identify a culprit doing that, inspecting the WAL directly will let you know with precision what is causing all that WAL. Hence the suggestion.
> Hi Claudio, is pg_xlogdump available for postgres 9.1, as my current
> production is postgres 9.1?
Right, it was added in 9.3
I'm unsure whether it can parse pre-9.3 WAL. I know technically speaking, WAL doesn't have to stay compatible across versions, but it might be for the limited purposes of xlogdump.
> Yes, I investigated in that area and found DMLs and also autovacuum
> statements for some relations. And the DMLs are the same during this
> huge WAL traffic as during normal WAL traffic. Anyway, thanks for your
> timely response 😊
If xlogdump doesn't work in 9.1, I'm not sure what you can do.
One idea that pops to mind, though there's probably a better one: you may want to consider attaching an strace to a recovery process on a replica, preferably one you're not worried about slowing down. Analyzing output from that is much harder, but it may give you some insight. You'll have to correlate file handles to file names to relations manually, which can be quite a chore.
Hi Claudio, We didn't configure any replication on our production server. Which strace are you talking about? We made a keen observation that only at the 9th and 39th minute of the hour are the so-called archive logs generated, even when nobody is connecting from the application (outside business hours). A minimum of 76 files is produced in these two intervals of each hour. I tried to monitor the DMLs, but they are the same DMLs as in the past. Any idea? Thanks in advance.
On Wed, Jan 24, 2018 at 3:54 AM, pavan95 <pavan.postgresdba@gmail.com> wrote:
> Hi Claudio,
>
> We didn't configure any replication on our production server. Which
> strace are you talking about?
You can attach it to a process (assuming you have the necessary permissions) and it will report all the syscalls the process does. That does slow down the process though.
Then lsof ( https://linux.die.net/man/8/lsof ) can be used to map file descriptor numbers to file paths. You have to do it as soon as you read the output, because files get closed and file descriptors reused. So it's better to have a script that directly reads from /proc/pid/fd or fdinfo, but that takes some programming.
It is nontrivial, but sometimes it's the only tool in your belt. You may want to try something else first though.
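A rough sketch of that approach on Linux (the PID, paths, and relfilenode below are made up; the flags shown are standard strace/lsof options):

    # Trace write-related syscalls of one backend (this slows it down!):
    strace -p 12345 -f -tt -e trace=open,close,write,pwrite64 -o backend.trace

    # Map the file descriptors it touches back to paths under the data dir:
    lsof -p 12345 | grep /base/

    # The numeric file names are relfilenodes; resolve them in SQL:
    #   SELECT relname FROM pg_class WHERE relfilenode = 16384;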
> We made a keen observation that only at the 9th and 39th minute of the
> hour are the so-called archive logs generated, even when nobody is
> connecting from the application (outside business hours).
Well, if you don't know what happens at those times (and only at those times), it's not that useful.
Since you don't know what is causing this for certain, first thing you have to do is ascertain that. Try increasing logging as much as you can, especially around those times, and see what turns on then and not at other times. You can monitor autovacuum processes as well in pg_stat_activity, so make sure you check that as well, as autovacuum will only log once it's done.
You do know autovacuum is running at those times; you have to check whether it is also running when WAL isn't being generated, and whether autovacuum is vacuuming the same tables over and over or not. Your earlier mails show autoanalyze runs, not vacuum. Those shouldn't cause so much WAL, but if it's running very often and you have lots of stats, then maybe.
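A sketch of postgresql.conf logging settings one might turn up while hunting this down (the values are illustrative; all of these parameters exist on 9.1):

    log_checkpoints = on
    log_autovacuum_min_duration = 0    # log every autovacuum/autoanalyze run
    log_min_duration_statement = 1000  # log statements taking longer than 1s
    log_line_prefix = '%t [%p] %u@%d ' # timestamp, pid, user, database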
You can also try pg_stat_statements: https://www.postgresql.org/docs/9.1/static/pgstatstatements.html
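A sketch of how that might look (pg_stat_statements is a contrib module; it needs shared_preload_libraries = 'pg_stat_statements' in postgresql.conf and a restart first):

    CREATE EXTENSION pg_stat_statements;

    -- After letting it collect over one of the 9th/39th-minute windows:
    SELECT calls, total_time, query
    FROM pg_stat_statements
    ORDER BY total_time DESC
    LIMIT 10;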
Again, concentrate on the differential - what happens at those times, that doesn't at other times.
Another idea would be to check for freeze runs in autovacuum. Ie, what's described here: https://wiki.postgresql.org/wiki/VacuumHeadaches#FREEZE
There's a nice blog post with some queries to help you with that here: http://www.databasesoup.com/2012/09/freezing-your-tuples-off-part-1.html
(and its continuation here: http://www.databasesoup.com/2012/10/freezing-your-tuples-off-part-2.html ). I'm not saying you should tune those parameters; what you were showing was autoanalyze activity, not vacuum freeze, but you should check whether you need to anyway.
Hello all, One more interesting observation made by me. I ran the below query(s) on production:

SELECT relname, age(relfrozenxid) as xid_age,
       pg_size_pretty(pg_table_size(oid)) as table_size
FROM pg_class
WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
ORDER BY age(relfrozenxid) DESC;

                   relname                    | xid_age | table_size
----------------------------------------------+---------+------------
 *hxxxxxxxxxx*                                | 7798262 | 3245 MB
 hrxxxxxxxxx                                  | 7797554 | 4917 MB
 irxxxxxxxxxx                                 | 7796771 | 2841 MB
 hr_xxxxxxxxxxxxxxxx                          | 7744262 | 4778 MB
 reimbxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | 6767712 | 1110 MB

show autovacuum_freeze_max_age;
 autovacuum_freeze_max_age
---------------------------
 200000000
(1 row)

SELECT txid_current();  -- at 15:09 on 24th Jan 2018
 txid_current
--------------
      8204011
(1 row)

Then I tried to perform *VACUUM FREEZE* on *hxxxxxxxxxx*. To my wonder it generated 107 archive log files, which is nearly 1.67GB. The verbose output of the above *VACUUM FREEZE* is shown below:

*x_db*=# VACUUM (FREEZE, VERBOSE) hxxxxxxxxxxx;
INFO: vacuuming "public.hxxxxxxxxxxx"
INFO: scanned index "hxxxxxxxxxxx_pkey" to remove 10984 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.04 sec.
INFO: scanned index "hxxxxxxxxxxx_x_email_from" to remove 10984 row versions
DETAIL: CPU 0.00s/0.04u sec elapsed 0.12 sec.
INFO: scanned index "hxxxxxxxxxxx_x_mobile" to remove 10984 row versions
DETAIL: CPU 0.00s/0.03u sec elapsed 0.09 sec.
INFO: scanned index "hxxxxxxxxxxx_x_pan" to remove 10984 row versions
DETAIL: CPU 0.00s/0.02u sec elapsed 0.08 sec.
INFO: scanned index "hxxxxxxxxxxx_x_ssn" to remove 10984 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.04 sec.
INFO: scanned index "hxxxxxxxxxxx_x_email_from_index" to remove 10984 row versions
DETAIL: CPU 0.01s/0.03u sec elapsed 0.12 sec.
INFO: scanned index "hxxxxxxxxxxx_x_vendor_id_index" to remove 10984 row versions
DETAIL: CPU 0.00s/0.01u sec elapsed 0.04 sec.
INFO: "hxxxxxxxxxxx": removed 10984 row versions in 3419 pages
DETAIL: CPU 0.02s/0.02u sec elapsed 0.18 sec.
INFO: index "hxxxxxxxxxxx_pkey" now contains 71243 row versions in 208 pages
DETAIL: 2160 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_email_from" now contains 71243 row versions in 536 pages
DETAIL: 9386 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_mobile" now contains 71243 row versions in 389 pages
DETAIL: 8686 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_pan" now contains 71243 row versions in 261 pages
DETAIL: 8979 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_ssn" now contains 71243 row versions in 257 pages
DETAIL: 8979 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_email_from_index" now contains 71243 row versions in 536 pages
DETAIL: 8979 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: index "hxxxxxxxxxxx_x_vendor_id_index" now contains 71243 row versions in 257 pages
DETAIL: 8979 index row versions were removed. 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "hxxxxxxxxxxx": found 2597 removable, 71243 nonremovable row versions in 7202 out of 7202 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 10144 unused item pointers. 0 pages are entirely empty. CPU 0.21s/0.66u sec elapsed 3.21 sec.
INFO: vacuuming "pg_toast.pg_toast_401161"
^CCancel request sent
ERROR: canceling statement due to user request

Note: I cancelled it because it got stuck there and it seemed to be too much overhead on the DB during business hours.

Now, from this experiment, is there something to suspect? If I do VACUUM FREEZE on the database, will it reduce my HUGE ARCHIVE LOG GENERATION? Please help. Thanks in Advance.

Regards, Pavan
On Wed, Jan 24, 2018 at 8:50 AM, pavan95 <pavan.postgresdba@gmail.com> wrote:
> Hello all,
> One more interesting observation made by me.
> I ran the below query(s) on production:
>
> SELECT relname, age(relfrozenxid) as xid_age,
>        pg_size_pretty(pg_table_size(oid)) as table_size
> FROM pg_class
> WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
> ORDER BY age(relfrozenxid) DESC;
>
>                    relname                    | xid_age | table_size
> ----------------------------------------------+---------+------------
>  *hxxxxxxxxxx*                                | 7798262 | 3245 MB
>  hrxxxxxxxxx                                  | 7797554 | 4917 MB
>  irxxxxxxxxxx                                 | 7796771 | 2841 MB
>  hr_xxxxxxxxxxxxxxxx                          | 7744262 | 4778 MB
>  reimbxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx | 6767712 | 1110 MB
>
> show autovacuum_freeze_max_age;
>  autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)
You seem to be rather far from the freeze_max_age. Unless you're consuming txids at a very high rate, I don't think that's your problem.
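To put a number on the burn rate, one could sample the transaction counter twice and divide (note that txid_current() itself consumes one XID per call, which is negligible here):

    SELECT txid_current(), now();
    -- ... wait, say, an hour, then:
    SELECT txid_current(), now();
    -- (second txid - first txid) / elapsed seconds = txids consumed per second

At roughly 8.2M total txids (from the output above), reaching autovacuum_freeze_max_age = 200M would take a very long time at any plausible rate, which is the point being made here.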
On Jan 24, 2018 7:57 PM, "Claudio Freire" <klaussfreire@gmail.com> wrote:
> You seem to be rather far from the freeze_max_age. Unless you're
> consuming txids at a very high rate, I don't think that's your problem.

Hi,

Any idea?
Regards,
Pavan
> Yes, but why is doing a vacuum freeze of a table causing such rapid archiving? Any idea?
IIUC, freezing involves physically altering those pages that are not frozen to make them frozen. Those changes are logged just like any (most?) other physical changes to pages. The rapidity is because freezing is not that difficult, so lots of pages can be changed in a relatively short period of time.
David J.
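One could even measure this: bracket the command with the WAL insert position (pg_current_xlog_location() exists on 9.1; the byte-difference helper only arrived in 9.2, and the table name here is hypothetical):

    SELECT pg_current_xlog_location();   -- note the value, e.g. 2F/8A123450
    VACUUM FREEZE some_table;
    SELECT pg_current_xlog_location();
    -- On 9.2+ the distance in bytes can be computed directly:
    --   SELECT pg_xlog_location_diff('2F/8B000000', '2F/8A123450');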
Please show the output of these queries in the relevant databases:

select name, setting, source, sourcefile, sourceline
from pg_settings where name like '%vacuum%';

select oid::regclass, reloptions from pg_class where reloptions is not null;

-- 
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Álvaro Herrera, Please find the corresponding output:

*1). select name, setting, source, sourcefile, sourceline from pg_settings where name like '%vacuum%';*

              name               |  setting  |       source       | sourceline
---------------------------------+-----------+--------------------+------------
 autovacuum                      | on        | configuration file |        437
 autovacuum_analyze_scale_factor | 0.1       | configuration file |        451
 autovacuum_analyze_threshold    | 50        | configuration file |        448
 autovacuum_freeze_max_age       | 200000000 | configuration file |        452
 autovacuum_max_workers          | 3         | configuration file |        443
 autovacuum_naptime              | 60        | configuration file |        445
 autovacuum_vacuum_cost_delay    | 20        | configuration file |        454
 autovacuum_vacuum_cost_limit    | -1        | configuration file |        457
 autovacuum_vacuum_scale_factor  | 0.2       | configuration file |        450
 autovacuum_vacuum_threshold     | 50        | configuration file |        446
 log_autovacuum_min_duration     | 100       | configuration file |        439
 vacuum_cost_delay               | 0         | default            |
 vacuum_cost_limit               | 200       | default            |
 vacuum_cost_page_dirty          | 20        | default            |
 vacuum_cost_page_hit            | 1         | default            |
 vacuum_cost_page_miss           | 10        | default            |
 vacuum_defer_cleanup_age        | 0         | default            |
 vacuum_freeze_min_age           | 50000000  | default            |
 vacuum_freeze_table_age         | 150000000 | default            |

(sourcefile for all "configuration file" entries is /etc/postgresql/9.1/main/postgresql.conf)

*2). select oid::regclass, reloptions from pg_class where reloptions is not null;*

(No rows)

Thanks in Advance.

Regards, Pavan
pavan95 wrote: > Hi Álvaro Herrera, > > Please find the corresponding output: OK, these settings look pretty normal, so they don't explain your problem. What is checkpoint_segments set to? And checkpoint_timeout? -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Jan 26, 2018 3:00 AM, "Alvaro Herrera" <alvherre@alvh.no-ip.org> wrote:
> OK, these settings look pretty normal, so they don't explain your
> problem.
>
> What is checkpoint_segments set to? And checkpoint_timeout?

Hi,
checkpoint_segments is set to '3' and
checkpoint_timeout was set to '5 min'.
Regards,
Pavan.
Any clue?
Regards,
Pavan.
Hello all, Will a sudden restart (stop/start) of a postgres database generate this huge WAL?

Regards, Pavan
On Tue, Jan 30, 2018 at 10:55 AM, pavan95 <pavan.postgresdba@gmail.com> wrote:
> Hello all,
>
> Will a sudden restart (stop/start) of a postgres database generate this
> huge WAL?

Shouldn't.
Hi all, Regarding this archive log generation I found one more observation. A table named abc_table was found to be involved in the archiving at every 9th and 39th minute. We are able to find the number of tuples deleted from the pg_stat_user_tables view. But to my wonder, the number of tuple inserts is shown as 0. How can there be any deletes without any inserts? It was found that the table has 2060 rows, and all rows are getting deleted at every 9th and 39th minute of the hour. It implies that the deleted rows must be inserted again before each delete operation. Also, I performed vacuum freeze on that table before the 9th minute of an hour and it generated 36 archive logs, and when I tried the same operation after the 9th minute (say the 11th minute of the same hour), it generated the same number of archive logs. This is possible only if the entire table gets updated/recreated. Now my final doubt is: why do the tuple inserts in pg_stat_user_tables show 0 when the corresponding deletes exist? Please find the below outputs FYR.

--Steps performed on production server:--

--1. Found count of rows in production
prod_erp=# select count(*) from abc_table;
 count
-------
  2060
(1 row)

--2. Issued 'select pg_stat_reset();'

--3. Before delete statements (before Jan 31st 2018 14:09 hrs), issued:
--   select * from pg_stat_user_tables where relname = 'abc_table';
--4. After delete statements (before Jan 31st 2018 14:09 hrs): same query.
--5. After delete statements (before Jan 31st 2018 14:39 hrs): same query.
--6. After delete statements (before Jan 31st 2018 15:09 hrs): same query.

The four outputs (all for relid 550314, schemaname public, relname abc_table) differ only in the counters below; everything else (n_tup_ins, n_tup_upd, n_tup_hot_upd, n_live_tup, n_dead_tup, and the last_vacuum/last_autovacuum/last_analyze/last_autoanalyze and *_count fields) stayed 0 or empty throughout:

 step | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_del
------+----------+--------------+----------+---------------+-----------+-----------
   3  |        2 |         4120 |        0 |             0 |         0 |         0
   4  |        3 |         6180 |     2060 |          2060 |         0 |      2060
   5  |        4 |         8240 |     4120 |          4120 |         0 |      4120
   6  |        5 |        10300 |     6180 |          6180 |         0 |      6180

As said above, if we compare the n_tup_del value in steps 4, 5 and 6, it says that the entire table is getting deleted each time (correct me if I'm wrong), but n_tup_ins is 0.

Regards, Pavan
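Since the interesting windows are known, one way to catch the writer in the act is to snapshot the statistics view every few minutes (e.g. from cron) and diff across the 9th/39th-minute boundaries. A sketch; the snapshot table name is made up:

    -- One-time setup:
    CREATE TABLE stat_snapshots AS
    SELECT now() AS snapped_at, relname, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables;

    -- Run this every few minutes:
    INSERT INTO stat_snapshots
    SELECT now(), relname, n_tup_ins, n_tup_upd, n_tup_del
    FROM pg_stat_user_tables;

Comparing consecutive snapshots across all tables would narrow down exactly when the writes happen and whether the matching inserts are being credited to some other relation.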