Thread: Large Objects and Vacuum
Hi
I am struggling with the volume and number of XML files a new application is storing. The table pg_largeobjects is growing fast, and despite the efforts of vacuumlo, vacuum and auto-vacuum it keeps on growing in size.
The main tables that hold large objects are partitioned and every few days I drop partition tables older than seven days, but despite all this, the system is growing in size and not releasing space back to the OS.
Using either vacuum full or cluster to fix pg_largeobjects will require a large amount of work space which I do not have on this server.
Is there another method of scanning postgres tables, moving active blocks and releasing store back to the OS?
Failing this, I can see an NFS mount being required.
Simon
Simon Windsor
Eml: simon.windsor@cornfield.org.uk
Tel: 01454 617689
Mob: 07590 324560
“There is nothing in the world that some man cannot make a little worse and sell a little cheaper, and he who considers price only is that man's lawful prey.”
Please don't send HTML mail to this list.

Simon Windsor wrote:
> I am struggling with the volume and number of XML files a new application is storing. The table
> pg_largeobjects is growing fast, and despite the efforts of vacuumlo, vacuum and auto-vacuum it keeps
> on growing in size.

Have you checked whether the number of large objects in the database is growing as well?
Check the result of

    SELECT count(DISTINCT loid) FROM pg_largeobject;

over time, or before and after "vacuumlo".

> The main tables that hold large objects are partitioned and every few days I drop partition tables
> older than seven days, but despite all this, the system is growing in size and not releasing space
> back to the OS.

Do you also delete the large objects referenced in these dropped tables? They won't vanish automatically.

If you use large objects in a partitioned table, you probably have a design problem. Having to clean up after orphaned large objects will offset the performance gain of dropping partitions instead of deleting data. You might be better off with bytea.

> Using either vacuum full or cluster to fix pg_largeobjects will require a large amount of work space
> which I do not have on this server.
>
> Is there another method of scanning postgres tables, moving active blocks and releasing store back to
> the OS?

If VACUUM does not keep pg_largeobject from growing, VACUUM FULL or anything else will not help either. You have to figure out why your large objects don't get deleted. Only after they are deleted can VACUUM free the space.

> Failing this, I can see an NFS mount being required.

Beg your pardon?

Yours,
Laurenz Albe
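For illustration, a minimal sketch of that cleanup (the partition and column names below are hypothetical): the large objects a dropped partition references have to be unlinked explicitly, for instance just before the partition is dropped.

    -- Hypothetical partition "responses_2011_12_20" with an oid column "xml_lo"
    -- pointing at the stored XML. Unlink the large objects first, then drop the table.
    SELECT lo_unlink(xml_lo)
    FROM   responses_2011_12_20
    WHERE  xml_lo IS NOT NULL;

    DROP TABLE responses_2011_12_20;

vacuumlo is meant to remove large objects that no oid or lo column references any more, so running it after the DROP should catch the same orphans.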
On 31 December 2011 00:54, Simon Windsor <simon.windsor@cornfield.me.uk> wrote:
> I am struggling with the volume and number of XML files a new application is
> storing. The table pg_largeobjects is growing fast, and despite the efforts
> of vacuumlo, vacuum and auto-vacuum it keeps on growing in size

I can't help but wonder why you're using large objects for XML files. Wouldn't a text field be sufficient? Text fields get TOASTed, which would save you some space.

Another option would be to use xml fields, but that depends on whether you have valid XML and whether you have any desire to make use of the xml-specific features such fields provide. There will probably be a performance hit for this.

I do realise that you can stream large objects, and that's a typical reason for choosing them, but with XML files that doesn't seem particularly useful to me; after all, they're not valid unless complete. You have to read the whole file into memory _somewhere_ before you can interpret it meaningfully. The exception to that rule is if you're using a SAX parser (which also explains why those parsers usually have fairly limited features).

Of course there are valid reasons for choosing large objects for XML files, and I assume yours are among them. If they're not, however, maybe you should have a thorough look at your problem again.

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
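As a rough sketch of that alternative (table and column names are hypothetical), the documents would simply live in an ordinary column and be TOASTed out of line and compressed automatically:

    -- Store the document inline instead of as a large object.
    -- The column type could be text instead of xml if some responses
    -- are not well-formed XML.
    CREATE TABLE response_archive (
        id       bigserial   PRIMARY KEY,
        received timestamptz NOT NULL DEFAULT now(),
        payload  xml
    );

Because the documents then sit in ordinary rows, dropping a partition reclaims their storage along with it and leaves no orphaned large objects behind.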
On 12/30/11 3:54 PM, Simon Windsor wrote:
> I am struggling with the volume and number of XML files a new
> application is storing.

How big are these XML files? large_object was meant for storing very large files, like videos, etc. -- multi-megabyte to gigabytes. XML stuff is typically a lot smaller than that.

Me, I would be decomposing the XML in my application and storing the data in proper relational tables, and only generating XML output if I absolutely had to send it to another system beyond my control, as it's easily one of the most inefficient methods of data representation out there.

--
john r pierce                            N 37, W 122
santa cruz ca                            mid-left coast
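If the documents were kept in a column such as the hypothetical response_archive.payload above, the decomposition could also be sketched on the database side (element names here are made up; it belongs just as well in the application, and xpath() is only available if the server was built with libxml support):

    -- Pull a few key fields out of the stored XML into relational form.
    SELECT id,
           (xpath('/response/orderId/text()', payload))[1]::text AS order_id,
           (xpath('/response/status/text()',  payload))[1]::text AS status
    FROM   response_archive;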
Hi,

Thanks for the response.

I am new to a small IT company that has recently migrated an Oracle-based system to Postgres. The system stores full XML responses, ranging in size from a few KB to over 55MB, and a subset of key XML fields is stored on a more permanent basis.

The database design was thus determined by the previous Oracle/Java system, with empty LOBs being created and data being streamed in.

The data only has to be kept for a few days, and generally the system is performing well, but as stated in the email, regular use of vacuumlo, vacuum and autovacuum leaves the OS disc space slowly shrinking.

As a last resort this week, I'm going to get 500+GB of extra file store added, add a tablespace and move pg_largeobjects to this area. Then use CLUSTER to rebuild pg_largeobjects back in the default tablespace. This should fix things I hope, and if needed I'll use CLUSTER regularly.

Simon

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of John R Pierce
Sent: 02 January 2012 11:18
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Large Objects and Vacuum
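Before committing to the extra disk, it may be worth confirming that the space really is in pg_largeobject and whether vacuumlo removes anything at all; a quick check along these lines (run as superuser), before and after a vacuumlo pass:

    -- How big is pg_largeobject (table plus index), and how many distinct
    -- large objects does it currently hold?
    SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject')) AS lo_size,
           count(DISTINCT loid)                                     AS lo_count
    FROM   pg_largeobject;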
Simon Windsor wrote:
[pg_largeobject keeps growing]
> The data only has to be kept for a few days, and generally the system is
> performing well, but as stated in the email, regular use of vacuumlo, vacuum
> and autovacuum leaves the OS disc space slowly shrinking.
>
> As a last resort this week, I'm going to get 500+GB of extra file store
> added, add a tablespace and move pg_largeobjects to this area. Then use
> CLUSTER to rebuild pg_largeobjects back in the default tablespace. This
> should fix things I hope, and if needed I'll use Cluster regularly.

You did not pay attention.

Did you verify that large objects actually get deleted in your system? If not, no magic will ever make pg_largeobject shrink.

In current PostgreSQL, CLUSTER essentially does the same as VACUUM FULL.

Yours,
Laurenz Albe
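A sketch of the kind of verification meant here, with hypothetical table and column names: large objects that no remaining application row references are exactly the ones that must be unlinked (with lo_unlink() or vacuumlo) before plain VACUUM can ever free their space.

    -- Hypothetical: application rows reference large objects through an oid
    -- column "xml_lo" in table "responses". Anything in pg_largeobject that
    -- no such row points at is an orphan.
    SELECT count(*)
    FROM   (SELECT DISTINCT loid FROM pg_largeobject) lo
    WHERE  NOT EXISTS (SELECT 1 FROM responses r WHERE r.xml_lo = lo.loid);

If this number keeps climbing between vacuumlo runs, the large objects are not being deleted when their referencing rows (or partitions) go away, which would explain the steady growth.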