Re: New to the list; would this be an okay question? - Mailing list pgsql-general

From Richard Huxton
Subject Re: New to the list; would this be an okay question?
Date
Msg-id 40D70C53.3010707@archonet.com
Whole thread Raw
In response to Re: New to the list; would this be an okay question?  (Madison Kelly <linux@alteeve.com>)
Responses Re: New to the list; would this be an okay question?  (Madison Kelly <linux@alteeve.com>)
List pgsql-general
Madison Kelly wrote:
> Richard Huxton wrote:
>
>> Madison Kelly wrote:
>>
>>> Hi all,
>>>
>>>   I am new to the list and I didn't want to seem rude at all so I
>>> wanted to ask if this was okay first.
>>
>>
>>
>> No problem. Reading your message below, you might want to try the
>> performance list, but general is a good place to start.
>>
>>>   I have a program I have written in perl which uses a postgresSQL
>>> database as the backend. The program works but the performance is
>>> really bad. I have been reading as much as I can on optimizing
>>> performance but still it isn't very reasonable. At one point I had my
>>> program able to process 175,000 records in 16min 10sec on a Pentium3
>>> 650MHz, 448MB RAM test machine. Since then I got a Pentium3 1GHz,
>>> 512MB system and I have tried a lot of things to get the performance
>>> up but now it is substantially slower and I can't seem to figure out
>>> what I am doing wrong.
>>
>>
>>
>> A few places to start:
>> 1. VACUUM FULL
>>    This will make sure any unused space is reclaimed
>> 2. ANALYZE
>>    This will recalculate stats for the tables
>> 3. Basic performce tuning:
>>    http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
>>    There's also a good guide to the postgresql.conf file on varlena.com
>>
>>>   Would it be appropriate to ask for help on my program on this list?
>>> Full disclosure: The program won't be initially GPL'ed because it is
>>> for my company but it will be released for free to home users and the
>>> source code will be made available (similar to other split-license
>>> programs) though once my company makes it's money back I think they
>>> will fully GPL it (I am on my boss's case about it :p ).
>>
>>
>>
>> No problem - what you licence your software under is your concern.
>> Once you've taken the basic steps described above, try to pick out a
>> specific query that you think is too slow and provide:
>>
>> 1. PostgreSQL version
>> 2. Basic hardware info (as you have)
>> 3. Sizes of tables.
>> 4. Output of EXPLAIN ANALYZE <query here>
>>
>> The EXPLAIN ANALYZE runs the query and shows how much work PG thought
>> it would be and how much it actually turned out to be.
>>
>> HTH
>
>
>     Thank you very much!! I am using Psql 7.4 on a stock install of
> Fedora Core 2 on my IBM thinkpad a22m (P3 1GHz, 512MB RAM, not the
> fastest HDD). The drive carrier I am using is connected via USB2 and
> uses a few different hard drives with the fastest being a couple of
> Barracuda 7200.7 drives (2MB cache, 7,200rpm). I described the program
> in my reply to Martijn so here is some of the code (code not related to
> psql snipped, let me know if posting it would help - sorry for the
> wrapping...):

I'm not clear if the database is on the local disk or attached to the
USB2. Not sure it's important, since neither will be that fast.

If I understand, you scan thousands or millions of files for backup
purposes and then issue a select + update/insert for each.

Once a partition is scanned, a flag is cleared on all rows.

Once all selected files have been dealt with a vaccum/analyse is issued.


Some things to look at:
1. How many files are you handling per second? Are the disks involved in
the backup as well as the database?
2. What does the output of "vmstat 10" show when the system is running.
Is your I/O saturated? CPU?
3. Is your main index (file_src_uuid,file_name,file_parent_dir) being
used? Your best bet is to select from "pg_stat_indexes" before and after.
4. If you are updating several hundred thousand rows then you probably
don't have enought vacuum memory set aside - try a vacuum full after
each set of updates.
5. You might want to batch together queries into transactions of a few
hundred or even few thousand updates.
--
   Richard Huxton
   Archonet Ltd

pgsql-general by date:

Previous
From: Dave Cramer
Date:
Subject: Re: JDBC driver inserting into a table with Bytea type
Next
From: Tom Lane
Date:
Subject: Re: Possible SET SESSION AUTHORIZATION bug