Thread: Optimizing POSTGRESQL config

Optimizing POSTGRESQL config

From
Joanne Formoso
Date:
I'm new to PostgreSQL, but my friend and I are trying
to benchmark the database by populating one table with
100,000 records, touching around 5 columns in each
row.  The problem is that our processing time is very
slow: around 30 minutes to 1 hour.  At times it just
hangs and stops executing.  I was wondering if this
is normal.

We structured our PHP code in a way that isn't
optimized, in order to isolate possible problems with
the database.  We also used indexes to speed up the
process on the database side.  Any suggestions on how
to optimize PostgreSQL through its config files?
Thanks in advance!

__________________________________
Do you Yahoo!?
The New Yahoo! Search - Faster. Easier. Bingo.
http://search.yahoo.com


Re: Optimizing POSTGRESQL config

From
Tom Lane
Date:
Joanne Formoso <joanneformoso@yahoo.com> writes:
> I'm new to PostgreSQL, but my friend and I are trying
> to benchmark the database by populating one table with
> 100,000 records, touching around 5 columns in each
> row.  The problem is that our processing time is very
> slow: around 30 minutes to 1 hour.  At times it just
> hangs and stops executing.  I was wondering if this
> is normal.

Seems mighty slow --- but it's impossible to really tell with this lack
of information.  Show us the table schema, the actual query, and
EXPLAIN's output for the query (or better yet, EXPLAIN ANALYZE),
and then you might get some useful advice.

Of course there's the standard newbie mistake: have you run VACUUM
and/or ANALYZE on this table?  You might also care to boost the
shared_buffers setting, which is unreasonably low out-of-the-box.
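As a hedged illustration of both suggestions (the value below is only a generic starting point for a small machine, not something taken from Joanne's setup):

```
# postgresql.conf -- illustrative starting point, not a recommendation
shared_buffers = 1000        # 8 KB pages each, so about 8 MB of shared memory
```

After each bulk load, run "VACUUM ANALYZE tablename;" so the planner has fresh statistics to work with.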

> We structured our PHP code in a way that isn't
> optimized, in order to isolate possible problems with
> the database.

Are you sure it's not the PHP code that's the problem?  One way to
investigate is to watch "top" while running your test.  If the php
process is taking more CPU than the postgres backend, then I'd look
to fixing the PHP side first ...

            regards, tom lane


Re: Optimizing POSTGRESQL config

From
eurlix
Date:
Hello Joanne,

On Thu, 1 May 2003 06:02:38 -0700 (PDT),
Joanne Formoso <joanneformoso@yahoo.com> wrote:

> I'm new to PostgreSQL, but my friend and I are trying
> to benchmark the database by populating one table with
> 100,000 records, touching around 5 columns in each
> row.  The problem is that our processing time is very
> slow: around 30 minutes to 1 hour.  At times it just
> hangs and stops executing.  I was wondering if this
> is normal.
>
> We structured our PHP code in a way that isn't
> optimized, in order to isolate possible problems with
> the database.  We also used indexes to speed up the
> process on the database side.  Any suggestions on how
> to optimize PostgreSQL through its config files?
> Thanks in advance!

Perhaps I don't understand your question well, but I
have sometimes (for testing) populated a table of
46 columns and 1,348,215 tuples in 6 MINUTES using \copy
from a sequential file (on an AMD 900 MHz).
It would take more than 50 HOURS to do the same with a C
program using libpq ... !

And don't create secondary indexes before the table is
populated; depending on the number of indexes, that can take ... days.
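To make the \copy route concrete, here is a hedged Python sketch that writes a tab-separated file \copy can load. The table layout (integer key first, then a text column, a number, and four zero columns) is assumed from the dbspeed2 PHP example elsewhere in this thread:

```python
# Sketch: build a tab-separated data file for bulk loading with psql's \copy.
# The column layout is an assumption borrowed from the dbspeed2 example.
import csv
import io
import random

def make_copy_file(n_rows):
    """Return TSV text suitable for:  \\copy dbspeed2 from 'data.tsv'"""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for pky in range(1, n_rows + 1):
        writer.writerow([pky, "This is row %d" % pky,
                         random.randint(0, 999), 0, 0, 0, 0])
    return buf.getvalue()

data = make_copy_file(3)
print(data, end="")
```

The file could then be loaded with something like "\copy dbspeed2 from 'data.tsv'" -- one COPY instead of 100,000 individual INSERT statements.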

Hope this helps,
--
Alain Lucari (Eurlix)


Re: Optimizing POSTGRESQL config (Speed Test PHP Code)

From
Joanne Formoso
Date:
Thanks for all the suggestions!  We're really quite
new to databases in general, but we are experimenting
with PostgreSQL.  Our knowledge of databases really
comes from MySQL.  Anyway, we will move on to
learning PostgreSQL.

We modified our old code to 'batch' the operations
instead of issuing one whole command.  Based on some of
the earlier suggestions, here is our modified PHP code:

******************************************************

$query = "VACUUM ANALYZE dbspeed2;";
$result = pg_query($connection, $query);
if (!$result)
{
    $time_elapsed = time() - $time_connect;
    echo "Database failure: Error during VACUUM ANALYZE 1<br>" .
         "Time Elapsed: $time_elapsed <br><br>" .
         "POSTGRESQL ERROR:<br>" . pg_last_error($connection);
    exit;
}

$count = 1;
for ($i = 0; $i < 100; $i++)
{
    $query = "BEGIN;";
    for ($j = 0; $j < 1000; $j++)
    {
        $query = $query . "INSERT INTO dbspeed2 VALUES " .
            "($count, 'This is loop pass # $i-$j', random()*1000, 0, 0, 0, 0);";
        $count = $count + 1;
    }
    $query = $query . "COMMIT;";
    $result = pg_query($connection, $query);
    if (!$result)
    {
        $time_elapsed = time() - $time_connect;
        echo "Database failure: INSERT GROUP $i-$j<br>" .
             "Time Elapsed: $time_elapsed <br><br>" .
             "POSTGRESQL ERROR:<br>" . pg_last_error($connection);
        exit;
    }
}

$query = "VACUUM ANALYZE dbspeed2;";
$result = pg_query($connection, $query);
if (!$result)
{
    $time_elapsed = time() - $time_connect;
    echo "Database failure: Error during VACUUM ANALYZE 2<br>" .
         "Time Elapsed: $time_elapsed <br><br>" .
         "POSTGRESQL ERROR:<br>" . pg_last_error($connection);
    exit;
}

$count = 1;
for ($i = 0; $i < 100; $i++)
{
    $query = "BEGIN;";
    for ($j = 0; $j < 1000; $j++)
    {
        $query = $query . "UPDATE dbspeed2 SET number1 = number/4, " .
            "number2 = number/4, number3 = number/4, number4 = number/4 " .
            "WHERE pky = $count;";
        $count++;
    }
    $query = $query . "COMMIT;";
    $result = pg_query($connection, $query);
    if (!$result)
    {
        $time_elapsed = time() - $time_connect;
        echo "Database failure: UPDATE $i-$j<br>" .
             "Time Elapsed: $time_elapsed <br><br>" .
             "POSTGRESQL ERROR:<br>" . pg_last_error($connection);
        exit;
    }
}

******************************************************
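The batching pattern in the PHP above can also be sketched language-neutrally. This hedged Python version builds the same transaction-wrapped statement strings; send() stands in for pg_query(), and dbspeed2 and its values are copied from the PHP example:

```python
# Sketch of the batching idea: wrap many INSERTs in one BEGIN/COMMIT so
# each statement does not pay for its own transaction.
def batched_inserts(n_batches, batch_size, send):
    count = 1
    for i in range(n_batches):
        stmts = ["BEGIN;"]
        for j in range(batch_size):
            stmts.append(
                "INSERT INTO dbspeed2 VALUES "
                "(%d, 'This is loop pass # %d-%d', random()*1000, 0, 0, 0, 0);"
                % (count, i, j)
            )
            count += 1
        stmts.append("COMMIT;")
        send("".join(stmts))

sent = []
batched_inserts(2, 3, sent.append)
```

The point of the pattern is that each BEGIN ... COMMIT group pays transaction overhead once per 1000 rows instead of once per row.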

The code above executes in about 20 minutes.  We have
placed indexes on the columns that are used by the
code.  I know the code is a little long... but is there
any way to improve the code and PostgreSQL's
performance?  The basic goal of this code is to gauge
how fast PostgreSQL runs given a high volume of data.

I'll post the config file of POSTGRESQL in a minute.

Once again thanks in advance!

Regards,
Joanne Formoso







Re: Optimizing POSTGRESQL config

From
Joanne Formoso
Date:
Hello,

Attached to this email is the configuration file that
we are currently using for PostgreSQL
(postgresql.txt).  We are using the Apache web server
together with PHP and PostgreSQL 7.3.2.  Our test
server has the following specs: AMD Athlon 1.2 GHz
with 128 MB of RAM.  Any help and suggestions on the
matter would be very much appreciated.  Thank you.

Regards,
Joanne Formoso


#
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload".


#========================================================================


#
#    Connection Parameters
#
tcpip_socket = true
#ssl = false

max_connections = 64
#superuser_reserved_connections = 2

port = 5432
#hostname_lookup = false
#show_source_port = false

#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777    # octal

#virtual_host = ''

#krb_server_keyfile = ''


#
#    Shared Memory Size
#
shared_buffers = 11207        # min max_connections*2 or 16, 8KB each
#max_fsm_relations = 1000    # min 10, fsm is free space map, ~40 bytes
#max_fsm_pages = 10000        # min 1000, fsm is free space map, ~6 bytes
#max_locks_per_transaction = 64    # min 10
#wal_buffers = 8        # min 4, typically 8KB each

#
#    Non-shared Memory Sizes
#
sort_mem = 32168        # min 64, size in KB
#vacuum_mem = 8192        # min 1024, size in KB


#
#    Write-ahead log (WAL)
#
#checkpoint_segments = 3    # in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300    # range 30-3600, in seconds
#
#commit_delay = 0        # range 0-100000, in microseconds
#commit_siblings = 5        # range 1-1000
#
fsync = false
#wal_sync_method = fsync    # the default varies across platforms:
#                # fsync, fdatasync, open_sync, or open_datasync
#wal_debug = 0            # range 0-16


#
#    Optimizer Parameters
#
#enable_seqscan = true
#enable_indexscan = true
#enable_tidscan = true
#enable_sort = true
#enable_nestloop = true
#enable_mergejoin = true
#enable_hashjoin = true

#effective_cache_size = 1000    # typically 8KB each
#random_page_cost = 4        # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01        # (same)
#cpu_index_tuple_cost = 0.001    # (same)
#cpu_operator_cost = 0.0025    # (same)

#default_statistics_target = 10    # range 1-1000

#
#    GEQO Optimizer Parameters
#
#geqo = true
#geqo_selection_bias = 2.0    # range 1.5-2.0
#geqo_threshold = 11
#geqo_pool_size = 0        # default based on tables in statement,
                # range 128-1024
#geqo_effort = 1
#geqo_generations = 0
#geqo_random_seed = -1        # auto-compute seed


#
#    Message display
#
#server_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   info, notice, warning, error, log, fatal,
                #   panic
#client_min_messages = notice    # Values, in order of decreasing detail:
                #   debug5, debug4, debug3, debug2, debug1,
                #   log, info, notice, warning, error
#silent_mode = false

#log_connections = false
#log_pid = false
#log_statement = false
#log_duration = false
#log_timestamp = false

#log_min_error_statement = error # Values in order of increasing severity:
                 #   debug5, debug4, debug3, debug2, debug1,
                 #   info, notice, warning, error, panic(off)

#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false

#explain_pretty_print = true

# requires USE_ASSERT_CHECKING
#debug_assertions = true


#
#    Syslog
#
#syslog = 0            # range 0-2
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'


#
#    Statistics
#
#show_parser_stats = false
#show_planner_stats = false
#show_executor_stats = false
#show_statement_stats = false

# requires BTREE_BUILD_STATS
#show_btree_build_stats = false


#
#    Access statistics collection
#
#stats_start_collector = true
#stats_reset_on_server_start = true
#stats_command_string = false
#stats_row_level = false
#stats_block_level = false


#
#    Lock Tracing
#
#trace_notify = false

# requires LOCK_DEBUG
#trace_locks = false
#trace_userlocks = false
#trace_lwlocks = false
#debug_deadlocks = false
#trace_lock_oidmin = 16384
#trace_lock_table = 0


#
#    Misc
#
#autocommit = true
#dynamic_library_path = '$libdir'
#search_path = '$user,public'
#datestyle = 'iso, us'
#timezone = unknown        # actually, defaults to TZ environment setting
#australian_timezones = false
#client_encoding = sql_ascii    # actually, defaults to database encoding
#authentication_timeout = 60    # 1-600, in seconds
#deadlock_timeout = 1000    # in milliseconds
#default_transaction_isolation = 'read committed'
#max_expr_depth = 10000        # min 10
#max_files_per_process = 1000    # min 25
#password_encryption = true
#sql_inheritance = true
#transform_null_equals = false
#statement_timeout = 0        # 0 is disabled, in milliseconds
#db_user_namespace = false



#
#    Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'en_US.iso885915'
LC_MONETARY = 'en_US.iso885915'
LC_NUMERIC = 'en_US.iso885915'
LC_TIME = 'en_US.iso885915'

Re: Optimizing POSTGRESQL config

From
Manfred Koizar
Date:
On Fri, 2 May 2003 00:28:34 -0700 (PDT), Joanne Formoso
<joanneformoso@yahoo.com> wrote:
>We are using the Apache web server
>together with PHP and PostgreSQL 7.3.2.  Our test
>server has the following specs:  AMD Athlon 1.2 GHz
>with 128 MB of RAM.

Who sold you a 1.2 GHz machine with 128 MB?  Next time better buy 600
MHz and 256 MB RAM ;-)  So the short answer is:  Buy more memory.
OTOH you *can* configure Postgres to run satisfactorily on small
hardware.

>shared_buffers = 11207        # min max_connections*2 or 16, 8KB each

This means almost 90 MB which is way too much for your small machine.
I'd recommend shared_buffers = 1000.

>sort_mem = 32168        # min 64, size in KB

That's 32 MB of sort memory on top of 90 MB of shared memory, so a
single sort can push your system into swapping, not to mention Apache
and the other processes running on that machine...
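A quick sanity check of those figures (8 KB per shared buffer page, sort_mem in KB; and remember sort_mem applies per sort, per backend):

```python
# Convert the two settings from the posted config into megabytes.
shared_buffers = 11207   # in 8 KB pages
sort_mem = 32168         # in KB

shared_mb = shared_buffers * 8 / 1024   # ~87.6 MB of shared memory
sort_mb = sort_mem / 1024               # ~31.4 MB per sort
print("shared: %.1f MB, sort: %.1f MB" % (shared_mb, sort_mb))
```

Nearly 90 MB of shared buffers plus one 31 MB sort already exceeds the machine's 128 MB of RAM.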

>fsync = false

Only do this if you don't care for consistency after a crash.

>LC_MESSAGES = 'en_US.iso885915'
>LC_MONETARY = 'en_US.iso885915'
>LC_NUMERIC = 'en_US.iso885915'
>LC_TIME = 'en_US.iso885915'

This may slow down certain kinds of queries; locale C seems to be a
better choice if you have speed in mind.  Search the archives for
details.

Servus
 Manfred


Re: Optimizing POSTGRESQL config

From
Josh Berkus
Date:
Joanne,

> Attached to this email is the configuration file that
> we are currently using for POSTGRESQL
> (postgresql.txt).  We are using Apache webserver
> together with PHP and PostgreSQL 7.3.2.  Our test
> server has the following specs:  AMD Athlon 1.2 Gz
> with 128 MB of ram. Any help and suggestions on the
> matter would be very much appreciated.  Thank you.

Buy more RAM.   In my experience, Apache tends to use about 128 MB by itself
under a moderate concurrent user load, causing it to fight with PostgreSQL for
RAM.   I think you'll find that doubling your RAM will quintuple your
PostgreSQL performance on large queries.


--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
                                        Josh Berkus
   Complete information technology     josh@agliodbs.com
    and data management solutions     (415) 565-7293
   for law firms, small businesses      fax 621-2533
    and non-profit organizations.     San Francisco


Re: Optimizing POSTGRESQL config

From
eurlix
Date:
On Fri, 02 May 2003 16:22:00 +0200,
Manfred Koizar <mkoi-pg@aon.at> wrote:

>
> Who sold you a 1.2 GHz machine with 128 MB?  Next time better buy
> 600 MHz and 256 MB RAM ;-)  So the short answer is:  Buy more
> memory. OTOH you *can* configure Postgres to run satisfactorily on
> small hardware.
>
 I agree with you; memory seems to be more important than CPU speed.
 (But it's not easy these days to find less than a 4 GHz CPU at the street corner.)

 Do you have an idea how to configure an AMD 800 MHz with 512 MB of RAM
 for <= 16 users ?

> 32 MB sort mem together with 90 MB shared memory, so a single sort
> can cause your system to start swapping, not to mention Apache and
> other processes running on that system...
>
  Without Apache or others ... for now.

> >fsync = false
>
> Only do this if you don't care for consistency after a crash.

 Isn't that really dangerous with a Mylex DAC960 AcceleRAID ... which seems
 to sync whenever it wants?
>
> >LC_MESSAGES = 'en_US.iso885915'
> >LC_MONETARY = 'en_US.iso885915'
> >LC_NUMERIC = 'en_US.iso885915'
> >LC_TIME = 'en_US.iso885915'
>
> This may slow down certain kinds of queries, locale C seems to be a
> better choice if you have speed in mind. Search the archives for
> details.
>
 Can you explain a little more or give references to the archives ?

Many thanks for answer,
--
Alain Lucari (Eurlix)


Re: Optimizing POSTGRESQL config

From
Joanne Formoso
Date:
Thanks for all the tips!  I'll check out the database
again when I get back to work tomorrow.  Most likely
I'll have some follow-ups, but thanks again!

Regards,
Joanne Formoso



Re: Optimizing POSTGRESQL config

From
Manfred Koizar
Date:
On Fri, 2 May 2003 22:55:15 +0200, eurlix <eurlix@libertysurf.fr>
wrote:
> I agree with you; memory seems to be more important than CPU speed.
> (But it's not easy these days to find less than a 4 GHz CPU at the street corner.)

 :-) and it's equally hard to find a new PC with less than 1 GB <g>.

> Do you have an idea how to configure an AMD 800 MHz with 512 MB of RAM
> for <= 16 users ?

Without any knowledge about your database size and usage patterns,
only vague ideas  -  I'd start with

    shared_buffers = 1000
    sort_mem = 4096
    effective_cache_size = 50000
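What those numbers amount to on a 512 MB machine (a hedged check; note that effective_cache_size only tells the planner how much the OS is expected to cache, it allocates nothing itself):

```python
# Translate the suggested settings into megabytes (8 KB per page).
shared_buffers = 1000          # pages -> shared memory actually allocated
sort_mem = 4096                # KB, per sort operation
effective_cache_size = 50000   # pages the planner assumes the OS caches

shared_mb = shared_buffers * 8 / 1024        # 7.8125 MB
cache_mb = effective_cache_size * 8 / 1024   # 390.625 MB
print(shared_mb, sort_mem / 1024, cache_mb)
```

So: about 8 MB of shared buffers, 4 MB per sort, and an assumption that the OS keeps roughly 390 MB of the database cached.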

>> This may slow down certain kinds of queries, locale C seems to be a
>> better choice if you have speed in mind. Search the archives for
>> details.
>>
> Can you explain a little more or give the archives references ?

Google for something like
    locale index like
or
    default locale considered harmful

Servus
 Manfred