Re: [GENERAL] huge table occupation after updates - Mailing list pgsql-general
From | Adrian Klaver
Subject | Re: [GENERAL] huge table occupation after updates
Date |
Msg-id | 1d10c142-92fb-de31-8929-6d618a33b05a@aklaver.com
In response to | Re: [GENERAL] huge table occupation after updates (Tom DalPozzo <t.dalpozzo@gmail.com>)
List | pgsql-general
On 12/10/2016 10:15 AM, Tom DalPozzo wrote:
>
> 2016-12-10 18:30 GMT+01:00 Francisco Olarte <folarte@peoplecall.com
> <mailto:folarte@peoplecall.com>>:
>
> > A couple of things first.
> >
> > 1.- This list encourages inline replying, editing the text, and frowns
> > upon top posting.
> >
> > 2.- Your HTML formatting with such a small size makes it harder for me (
> > and I can assume some others ) to properly read your messages.
> >
> > If you want to discourage people replying to you, keep doing the two
> > above.
> >
> > On Sat, Dec 10, 2016 at 3:25 PM, Tom DalPozzo <t.dalpozzo@gmail.com
> > <mailto:t.dalpozzo@gmail.com>> wrote:
> > > you're right, VACUUM FULL recovered the space, completely.
> >
> > Well, it always does. ;-)
> >
> > > So, at this point I'm worried about my needs.
> > > I cannot issue vacuum full as I read it locks the table.
> >
> > Well, first hint of your needs. Bear in mind vacuum fulls can be very
> > fast on small tables ( i.e., if you have the 1.5Mb table, and do 2000
> > updates and then a vacuum full, that will be very fast, time it ).
> >
> > > In my DB, I (would) need to have a table with one bigint id field + 10
> > > bytea fields, 100 bytes long each (more or less, not fixed).
> > > 5/10000 rows maximum, but let's say 5000.
> > > As traffic I can suppose 10000 updates per row per day (spread over
> > > groups of hours); each update involving two of those fields, randomly.
> > > Also rows are chosen randomly (in my test I used a block of 2000 just
> > > to try one possibility).
> > > So, it's a total of 50 million updates per day, hence (50 million * 100
> > > bytes * 2 fields updated) 10 Gbytes net per day.
> >
> > Not at all. That's the volume of updated data; you must multiply by the
> > ROW size, not just the changed size, in your case 50M * 1100 ( to have
> > some wiggle room ), 55 Gbytes.
> >
> > But this is the UPPER BOUND you asked for. Not the real one.
> >
> > > I'm afraid it's not possible, according to my results.
> >
> > It certainly is. You can set a very aggressive autovacuum schedule for
> > the table, or even better, you may vacuum AFTER each hourly update.
> > This will mark dead tuples for reuse. It will not be as fast, but it
> > can certainly be fast enough.
> >
> > And, if you only update once an hour, you may try other tricks ( like
> > copy to a temp table, truncate the original and insert the temp table
> > back into the original; I fear this will lock the table too, but for a
> > very short time, which your readers may well tolerate ).
> >
> > Yours seems a special app with special needs; try a few approaches,
> > measure, it is certainly possible.
> >
> > Francisco Olarte.
>
> Hi, I think you're right. I was surprised by the huge size of the
> tables in my tests, but I had not considered the vacuum properly.
> My test had really heavy activity, so perhaps autovacuum didn't have
> time to make the rows reusable.
> Also, issuing a plain VACUUM command does nothing visible at once; only
> afterwards, when inserting new rows, does the size stop increasing.
> I will try again as you suggest.

To make more sense of this I would suggest reading the following sections
of the manual:

https://www.postgresql.org/docs/9.5/static/routine-vacuuming.html
https://www.postgresql.org/docs/9.5/static/mvcc.html

There is a lot of ground covered in the above, more than can be digested
in one pass, but it will help provide some context for the
answers/suggestions provided in this thread.

> Thank you very much
> Pupillo

--
Adrian Klaver
adrian.klaver@aklaver.com
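
For concreteness, the aggressive autovacuum schedule Francisco suggests
can be set per table through storage parameters. A minimal sketch follows;
the table "pings" and its layout are hypothetical stand-ins for the
one-bigint-plus-ten-bytea table described above, and the thresholds are
illustrative, not tuned values.

-- Hypothetical table with roughly the layout Pupillo describes:
-- one bigint id plus ten ~100-byte bytea fields.
CREATE TABLE pings (
    id  bigint PRIMARY KEY,
    f1  bytea, f2 bytea, f3 bytea, f4 bytea, f5 bytea,
    f6  bytea, f7 bytea, f8 bytea, f9 bytea, f10 bytea
);

-- Vacuum this table once ~1% of its rows are dead, instead of the
-- default 20%, so dead tuples are recycled between update bursts.
ALTER TABLE pings SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold    = 50
);

-- Or reclaim space explicitly after each batch of updates. Plain VACUUM
-- does not block readers or writers and does not shrink the file; it
-- marks the space held by dead tuples as reusable for future updates.
VACUUM pings;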
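
The copy/truncate/reload trick can be sketched the same way, against the
same hypothetical table. Note that TRUNCATE takes an ACCESS EXCLUSIVE
lock that it holds until COMMIT, so readers are blocked for the whole
transaction; this is the short lock window Francisco alludes to.

BEGIN;
-- Only live row versions are copied; the bloat stays behind.
CREATE TEMP TABLE pings_copy AS SELECT * FROM pings;
-- TRUNCATE reclaims the table's disk space immediately, but holds an
-- ACCESS EXCLUSIVE lock until COMMIT, blocking readers and writers.
TRUNCATE pings;
INSERT INTO pings SELECT * FROM pings_copy;
DROP TABLE pings_copy;
COMMIT;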
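
Pupillo's observation that a plain VACUUM "does nothing visible at once"
can be checked directly with the standard statistics views; again, only
the table name is an assumption.

-- Physical size on disk: typically unchanged by a plain VACUUM,
-- reduced by VACUUM FULL.
SELECT pg_size_pretty(pg_total_relation_size('pings'));

-- Dead tuples awaiting reuse: should drop back toward zero after a
-- manual VACUUM or an autovacuum pass.
SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'pings';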