Re: Tweaking bytea / large object block sizes? - Mailing list pgsql-general
From | Bill Moran |
---|---|
Subject | Re: Tweaking bytea / large object block sizes? |
Date | |
Msg-id | 4DF54A11.9090502@potentialtech.com |
In response to | Tweaking bytea / large object block sizes? (Hanno Schlichting <hanno@hannosch.eu>) |
List | pgsql-general |
On 6/12/11 12:00:19 PM, Hanno Schlichting wrote:
> Hi.
>
> I'm a new Postgres user. If I happen to ask stupid questions, please
> feel free to point me to any documentation I should read or to
> guidelines for asking questions.
>
> I'm looking into storing binary data in Postgres and trying to
> understand how data is actually stored in the database. The dataset
> I'm looking at is images, photos and pdf documents, which are commonly
> at least 100kb, 10mb on average, and can reach 100mb per document. I
> want to store this data in the database, as I need transactional
> integrity and want to avoid the extra complexity of managing shared
> filesystems between a number of frontend application servers and
> database backends.
>
> The binary data will only be accessed as a whole. Either a complete
> new file is written to the DB, or a complete file is read and cached
> on a frontend server. I don't need streaming access or the ability to
> read partial data. The use-case seems to be well supported by Oracle
> 11g with the introduction of "secure files" (pdf: [1]).
>
> But from what I've read of Postgres, my best bet is to store the data
> as large objects [2]. Going all the way down, this means storing the
> binary data as 2kb chunks and adding table row overhead for each of
> those chunks. Using the bytea type and the TOAST backend [3], it
> comes down to the same thing: data is actually stored in 2kb chunks
> for a page size of 8kb.
>
> I'm assuming I'll be able to get ~8kb jumbo-frame packets over a
> gigabit network connection, and would be able to use an ext4 volume
> with a block size of either 32kb or 64kb for the volume housing the
> binary data, but a smaller block size for the one housing the
> relational data.
>
> Given those other constraints, it seems silly to split the data into
> 2kb chunks at the database level. Is there any way the chunk size for
> binary data can be increased here, independently of the one for
> "normal relational" data?

You could redefine LOBLKSIZE and/or BLCKSZ so that the result is larger
chunks stored for large objects, then build PG from source. I assume
that LOBLKSIZE is defined as a fraction of BLCKSZ for a reason, and
that adjusting BLCKSZ is the better way to go.

Before doing this, I would set yourself up a performance test case so
that you can be sure your changes are actually leading to an
improvement. Since an RDBMS isn't typically used to simulate a
large-scale filestore, it's likely that the 8K page size and 2K LO
block size are inefficient for that workload, but I wouldn't assume
that larger sizes automatically mean better performance until you
actually test it.

--
Bill Moran
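For readers following along later, here is a minimal, compilable sketch of the relationship Bill describes. The `BLCKSZ / 4` definition and the `src/include/storage/large_object.h` location reflect my reading of a 9.0-era source tree, and the 32 kB block size is purely illustrative, so treat these values as assumptions to verify against your own tree. Note also that changing either knob requires a fresh initdb, since data files are not compatible across builds with different block sizes.

```c
/*
 * Sketch of the two compile-time knobs discussed above (assumed
 * 9.0-era source layout -- verify against your own tree).
 *
 * In a real build, BLCKSZ is not edited by hand; it is written into
 * pg_config.h by the configure script, e.g.:
 *
 *     ./configure --with-blocksize=32    (in kB; 8 is the default)
 */
#include <stdio.h>

#define BLCKSZ      (32 * 1024)     /* stand-in for the configure result */

/*
 * Stock definition from src/include/storage/large_object.h: each
 * large-object chunk occupies a quarter of a page, so one row in
 * pg_largeobject holds LOBLKSIZE bytes of data.
 */
#define LOBLKSIZE   (BLCKSZ / 4)

int
main(void)
{
    /*
     * With 32 kB pages the derived chunk size is 8 kB, i.e. one quarter
     * the per-chunk row overhead of the stock 8 kB page / 2 kB chunk
     * configuration the original post is concerned about.
     */
    printf("BLCKSZ = %d bytes, LOBLKSIZE = %d bytes\n", BLCKSZ, LOBLKSIZE);
    return 0;
}
```

Keeping the stock `BLCKSZ / 4` ratio and raising only the page size, as Bill suggests, is the conservative route: it stays on the path the configure script supports, rather than hand-editing a derived constant.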