Re: large object write performance - Mailing list pgsql-performance

From Bram Van Steenlandt
Subject Re: large object write performance
Date
Msg-id 5616588B.5000308@diomedia.be
In response to large object write performance  (Bram Van Steenlandt <bram@diomedia.be>)
Responses Re: large object write performance
List pgsql-performance

On 08-10-15 at 13:13, Graeme B. Bell wrote:
>> 1. The part is "fobj = lobject(db.db,0,"r",0,fpath)", I don't think there is anything there
> Can you include the surrounding code please (e.g. setting up the db connection) so we can see what’s happening, any sync/commit type stuff afterwards.
connect:

    self.db = psycopg2.connect(dbname=self.confighelp.get("dbname"),
                               user=self.confighelp.get("dbuser"),
                               password=self.confighelp.get("dbpassword"),
                               host=self.confighelp.get("dbhost"),
                               port=int(self.confighelp.get("dbport")),
                               sslmode=self.confighelp.get("dbsslmode"))


upload:

    self.statusupdate("Backing up %s (%s)" % (fpath, nicesizeprint(size)))
    starttime = datetime.datetime.now()
    try:
        # new_file=fpath: psycopg2 imports the file into a new large object
        fobj = lobject(db.db, 0, "r", 0, fpath)
    except psycopg2.OperationalError, e:
        if e.__str__().find("could not open file") > -1:
            badfiles.append([fpath, str(e).rstrip("\n").rstrip("\r")])
            self.statusupdate("Can't backup %s" % fpath)
        else:
            self.emsg = str(e)
            return False
    except Exception, e:
        self.emsg = str(e)
        return False
    else:
        cursor.execute("insert into ${table} (set,path,modtime,size,file,basepath) values (%s,%s,%s,%s,%s,%s)".replace("${table}", tablename),
                       [bset, urpath, modtime, size, fobj.oid, path])
        db.commit()
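
For experimenting, a minimal sketch of an alternative import path is below: it streams the file into a new large object with an explicit, tunable write size, rather than handing psycopg2 the whole file path. The 1 MB chunk, the function name and the connection parameters are assumptions, not anything from this thread; compressing the file first (as suggested later in the thread) could be layered on top.

    import psycopg2

    CHUNK = 1024 * 1024  # bytes per lobject.write(); tune and measure

    def store_file(conn, fpath):
        """Stream fpath into a new large object and return its OID."""
        lobj = conn.lobject(0, "wb")   # oid=0: the server assigns a new OID
        with open(fpath, "rb") as src:
            while True:
                data = src.read(CHUNK)
                if not data:
                    break
                lobj.write(data)
        lobj.close()
        return lobj.oid

    # usage (connection parameters are placeholders):
    # conn = psycopg2.connect(dbname="backups")
    # oid = store_file(conn, "/datapool/db/some_file")
    # conn.commit()   # the large object is only durable after the commit
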
>
>> 2. gigabit ethernet; the scp copy I did was over the network to that hard drive using
>> scp FreeBSD-10.1-RELEASE-amd64-dvd1.iso x.x.x.x:/datapool/db/test
>>
>> 3. I agree, but if scp can write to the drive at 37MB/sec, I should be able to achieve more than 8MB/sec.
>> I can indeed speed up the ZFS but it's more the difference between scp and my script that bugs me.
> It is either being caused by
>
> a) your script
> b) postgres working in a different way to scp
>
> To solve the first you need to send us more of your script
> To solve the second, it may be possible to reconfigure postgres but you may have to reconfigure your OS or hardware to be more suitable for the type of thing postgres does.
>
> Put simply, scp does the absolute minimum of work to put the data onto the disk without any safety measures.
> Postgres is doing other things in the background - analyzing things, keeping a synchronous log for rollback, etc. Crucially, it’s using its own internal storage format.
> You’re comparing chalk with cheese and expecting them to taste quite similar.
I agree. My question is more: what can I do to make it easier for postgresql? Can I turn things off that will speed things up?
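
One commonly mentioned knob for this kind of bulk load is synchronous_commit, which can be turned off per session so commits stop waiting for the WAL flush. This is only a hedged sketch, not advice from the thread; on a crash the most recent transactions can be lost, although unlike fsync=off it does not corrupt the cluster.

    # assumes self.db is the psycopg2 connection from the connect code above
    cur = self.db.cursor()
    cur.execute("SET synchronous_commit TO off")  # applies to this session only
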
>
> If you try the advice I gave + read the blog post, about configuring ZFS to be friendly to the type of activity postgres likes to do, you may see some improvement.
>
> If the problem is your script you’ll need to send a greater amount of the code so it can be examined.
>
>> 4.
>> dd bs=1M count=256 if=/dev/zero of=./test
>> 256+0 records in
>> 256+0 records out
>> 268435456 bytes transferred in 5.401050 secs (49700605 bytes/sec)
> good
>
>> 5. a tgz file with scp is 33.8MB/sec.
> (you can speed that up probably by changing to a lightweight compression algorithm)
>
>> 6. the server is running all the time; speed varies, it's between 5 and 8MB/sec actually (depending also on the number of clients performing a backup).
> How is the connection to postgres being made, incidentally?
>
> Graeme.
>
>
>
>
>
>>
>>
>> On 08-10-15 at 11:45, Graeme B. Bell wrote:
>>> Seems a bit slow.
>>>
>>> 1. Can you share the script (the portion that does the file transfer) to the list? Maybe you’re doing something unusual there by mistake.
>>> Similarly the settings you’re using for scp.
>>>
>>> 2. What’s the network like?
>>> For example, what if the underlying network is only capable of 10MB/s peak, and scp is using compression and the files are highly compressible?
>>> Have you tried storing zip or gzip’d versions of the file into postgres? (that’s probably a good idea anyway)
>>>
>>> 3. ZFS performance can depend on available memory and use of caches (memory + L2ARC for reading, ZIL cache for
writing).
>>> Maybe put an intel SSD in there (or a pair of them) and use it as a ZIL cache.
>>>
>>> 4. Use dd to measure the write performance of ZFS doing a local write to the machine. What speed do you get?
>>>
>>> 5. Transfer a zip’d file over the network using scp. What speed do you get?
>>>
>>> 6. Is your postgres running all the time or do you start it before this test? Perhaps check if any background tasks are running when you use postgres - autovacuum, autoanalyze etc.
>>>
>>> Graeme Bell
>>>
>>>> On 08 Oct 2015, at 11:17, Bram Van Steenlandt <bram@diomedia.be> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I use postgresql often but I'm not very familiar with how it works internally.
>>>>
>>>> I've made a small script to back up files from different computers to a postgresql database.
>>>> Sort of a versioning networked backup system.
>>>> It works with large objects (oid in table, linked to large object), which I import using psycopg
>>>>
>>>> It works well, but it's slow.
>>>>
>>>> The database (9.2.9) on the server (freebsd10) runs on a zfs mirror.
>>>> If I copy a file to the mirror using scp I get 37MB/sec
>>>> My script achieves something like 7 or 8MB/sec on large (+100MB) files.
>>>>
>>>> I've never used postgresql for something like this, is there something I can do to speed things up ?
>>>> It's not a huge problem as it's only the initial run that takes a while (after that, most files are already in the
db).
>>>> Still it would be nice if it would be a little faster.
>>>> cpu is mostly idle on the server, filesystem is running 100%.
>>>> This is a separate postgresql server (I've used freebsd profiles to have 2 postgresql servers running) so I can change this setup so it will work better for this application.
>>>>
>>>> I've read different suggestions online but I'm unsure which is best; they all speak of files which are only a few Kb, not 100MB or bigger.
>>>>
>>>> ps. english is not my native language
>>>>
>>>> thx
>>>> Bram
>>>>
>>>>
>>>> --
>>>> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
>>>> To make changes to your subscription:
>>>> http://www.postgresql.org/mailpref/pgsql-performance
>


