Thread: PostgreSQL insert speed tests
Hello,

I need high throughput while inserting into PostgreSQL, so I did some PostgreSQL insert performance tests.

------------------------------------------------------------
-- Test schema
create table logs (
    logid       serial primary key,
    ctime       integer not null,
    stime       integer not null,
    itime       integer not null,
    agentid     integer not null,
    subagentid  integer not null,
    ownerid     integer not null,
    hostid      integer not null,
    appname     varchar(64) default null,
    logbody     varchar(1024) not null
);

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
------------------------------------------------------------

Test Hardware:
IBM Thinkpad R40
CPU: Pentium 4 Mobile 1993 MHz (full powered)
RAM: 512 MB
OS: GNU/Linux, Fedora Core 1, kernel 2.4.24

A test program developed with libpq inserts 200.000 rows into the logs table. Insertions are made with 100 rows per transaction (2.000 transactions in total).

Some parameter changes from the postgresql.conf file follow:
----------------------------------------------------------------
shared_buffers = 2048           # min max_connections*2 or 16, 8KB each
max_fsm_relations = 20000       # min 10, fsm is free space map, ~40 bytes
max_fsm_pages = 200000          # min 1000, fsm is free space map, ~6 bytes
max_locks_per_transaction = 256 # min 10
wal_buffers = 64                # min 4, typically 8KB each
sort_mem = 32768                # min 64, size in KB
vacuum_mem = 16384              # min 1024, size in KB
checkpoint_segments = 6         # in logfile segments, min 1, 16MB each
checkpoint_timeout = 900        # range 30-3600, in seconds
fsync = true
wal_sync_method = fsync         # the default varies across platforms
enable_seqscan = true
enable_indexscan = true
enable_tidscan = true
enable_sort = true
enable_nestloop = true
enable_mergejoin = true
enable_hashjoin = true
effective_cache_size = 2000     # typically 8KB each
geqo = true
geqo_selection_bias = 2.0       # range 1.5-2.0
geqo_threshold = 11
geqo_pool_size = 0              # default based on tables in statement,
                                # range 128-1024
geqo_effort = 1
geqo_generations = 0
geqo_random_seed = -1           # auto-compute seed
----------------------------------------------------------------

The test was made with both PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the test program was recompiled when switching versions).

The results are below (average inserted rows per second).

                            speed for           speed for
  # of EXISTING RECORDS     PostgreSQL 7.3.4    PostgreSQL 7.4.1
  =========================================================================
        0 initial records     1086 rows/s        1324 rows/s
  200.000 initial records      781 rows/s         893 rows/s
  400.000 initial records      576 rows/s         213 rows/s
  600.000 initial records      419 rows/s         200 rows/s
  800.000 initial records      408 rows/s         not tested because of bad results

When the logs table is rebuilt with only one index (the primary key), a speed of 2941 rows/s is reached. But I need all seven indexes.

The question is: why is PostgreSQL 7.4.1 so slow under heavy load?

Is there a way to speed up inserts without eliminating the indexes?

What about concurrent inserts (concurrent separate test program executions) into the same table? It did not work.

-sezai
Sezai YILMAZ wrote:
> Test Hardware:
> IBM Thinkpad R40
> CPU: Pentium 4 Mobile 1993 MHz (full powered)
> RAM: 512 MB
> OS: GNU/Linux, Fedora Core 1, kernel 2.4.24
>
> A test program developed with libpq inserts 200.000 rows into the logs
> table. Insertions are made with 100 rows per transaction (2.000
> transactions in total).
>
> Some parameter changes from the postgresql.conf file follow:
> ----------------------------------------------------------------
> shared_buffers = 2048           # min max_connections*2 or 16, 8KB each

I suggest you up that to, say, 10000 buffers.

> max_fsm_relations = 20000       # min 10, fsm is free space map, ~40 bytes
> max_fsm_pages = 200000          # min 1000, fsm is free space map, ~6 bytes
> max_locks_per_transaction = 256 # min 10
> wal_buffers = 64                # min 4, typically 8KB each
> sort_mem = 32768                # min 64, size in KB

You need to pull that down a little, I guess. How about 8 or 16MB?

> vacuum_mem = 16384              # min 1024, size in KB

Not required; 1024 would do, since you are testing inserts anyway. Of course, it only matters when you run vacuum.

> effective_cache_size = 2000     # typically 8KB each

Is that right? It tells PostgreSQL that it has only around 16MB of cache. Set it to around 15000 so that 100MB+ is assumed; that might change the results of index scans. I always prefer to set it to whatever is available.

> The test was made with both PostgreSQL 7.3.4 and PostgreSQL 7.4.1 (the
> test program was recompiled when switching versions).
>
> The results are below (average inserted rows per second).
>
>                             speed for           speed for
>   # of EXISTING RECORDS     PostgreSQL 7.3.4    PostgreSQL 7.4.1
>   =========================================================================
>         0 initial records     1086 rows/s        1324 rows/s
>   200.000 initial records      781 rows/s         893 rows/s
>   400.000 initial records      576 rows/s         213 rows/s
>   600.000 initial records      419 rows/s         200 rows/s
>   800.000 initial records      408 rows/s         not tested because of bad results

Do you mean 800000? I believe the '.' is a thousands separator here, but I am not too sure. :-)

> When the logs table is rebuilt with only one index (the primary key), a
> speed of 2941 rows/s is reached. But I need all seven indexes.
>
> The question is: why is PostgreSQL 7.4.1 so slow under heavy load?

Can you run vmstat and see where things stall? You could also try increasing the number of WAL segments (checkpoint_segments) and see what happens.

> Is there a way to speed up inserts without eliminating the indexes?
>
> What about concurrent inserts (concurrent separate test program
> executions) into the same table? It did not work.

What do you mean, it did not work? Any errors?

HTH

Shridhar
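Taken together, Shridhar's suggestions would look roughly like the following postgresql.conf fragment (ballpark values only, taken from the advice above, not tested settings):

----------------------------------------------------------------
shared_buffers = 10000          # ~80MB of shared buffers at 8KB each, up from 2048
sort_mem = 8192                 # 8-16MB per sort (in KB), down from 32768
vacuum_mem = 1024               # minimal; only matters when vacuum runs
effective_cache_size = 15000    # tell the planner ~120MB of cache is available
----------------------------------------------------------------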
I don't know the answer to the question of why 7.4 is slower, but I have some suggestions on additional things to test, and on how to make it faster.

First off, try 200 transactions of 1000 records each; you might even want to try 20 transactions of 10000 records each. Postgres seems to run much faster the fewer commits you have, but different configs may change the sweet spot.

Secondly, one possible solution to your problem is to drop the indexes, insert the new rows, and recreate the indexes. Of course, for testing, you'll want to time the entire drop/insert/create process and compare it to the raw insert time with the indexes intact. I use stored procedures on my databases, i.e.:

select drop_foo_indexes();
...
<commands to insert many rows into table foo>
...
select create_foo_indexes();

Another thing to consider is vacuums. You don't mention how often you vacuumed the database during testing; I would recommend a "vacuum full" between each test (unless, of course, you're testing how much a lack of vacuum hurts performance ;)

Hope this helps.

Sezai YILMAZ wrote:
> I need high throughput while inserting into PostgreSQL, so I did some
> PostgreSQL insert performance tests.
>
> A test program developed with libpq inserts 200.000 rows into the logs
> table. Insertions are made with 100 rows per transaction (2.000
> transactions in total).
> [...]

--
Bill Moran
Potential Technologies
http://www.potentialtech.com
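For the logs table from the test schema, a bare-bones sketch of the drop/insert/recreate sequence Bill describes might look like this (Bill wraps the equivalent statements in stored procedures; plain SQL is shown here, secondary indexes only, keeping the primary key):

drop index ctime_ndx;
drop index stime_ndx;
drop index itime_ndx;
drop index agentid_ndx;
drop index ownerid_ndx;
drop index hostid_ndx;

-- ... run the bulk INSERTs into logs here ...

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);

As Bill notes, it is the total time of the whole sequence that should be compared against inserting with the indexes in place.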
Sezai YILMAZ wrote:
> create index agentid_ndx on logs using hash (agentid);
> create index ownerid_ndx on logs using hash (ownerid);
> create index hostid_ndx on logs using hash (hostid);
> ------------------------------------------------------------
>
>                             speed for           speed for
>   # of EXISTING RECORDS     PostgreSQL 7.3.4    PostgreSQL 7.4.1
>   =========================================================================
>         0 initial records     1086 rows/s        1324 rows/s
>   200.000 initial records      781 rows/s         893 rows/s
>   400.000 initial records      576 rows/s         213 rows/s
>   600.000 initial records      419 rows/s         200 rows/s
>   800.000 initial records      408 rows/s         not tested because of bad results

I changed the three hash indexes to btree.

Performance roughly doubled (1905 rows/s with PostgreSQL 7.3.4).

Concurrent inserts now work.

The columns I changed would actually seem better suited to hash indexes: there is no ordering on them, only exact-value matching, which is the natural use case for a hash index. But hash indexes apparently have deadlock problems on multiple concurrent inserts. I think I can live with the btree indexes. They work better. :-)

-sezai
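For reference, the hash-to-btree change described above amounts to roughly this, using the index names from the test schema:

drop index agentid_ndx;
drop index ownerid_ndx;
drop index hostid_ndx;

create index agentid_ndx on logs using btree (agentid);
create index ownerid_ndx on logs using btree (ownerid);
create index hostid_ndx on logs using btree (hostid);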
> > create index agentid_ndx on logs using hash (agentid);
> > create index ownerid_ndx on logs using hash (ownerid);
> > create index hostid_ndx on logs using hash (hostid);

> > What about concurrent inserts (concurrent separate test program
> > executions) into the same table? It did not work.

Hash indexes have relatively poor concurrency, though I think it should still work. You probably want to be using btree indexes for everything, unless you can actually profile the two and show hash indexes being a big win. Note that there were bugs in the hash index code at least through most 7.3 versions.

--
greg
Sezai YILMAZ <sezai.yilmaz@pro-g.com.tr> writes:
> I changed the three hash indexes to btree.
> Performance roughly doubled (1905 rows/s with PostgreSQL 7.3.4).
> Concurrent inserts now work.

Concurrent inserts should work with hash indexes in 7.4, though not 7.3. The slowdown you report is probably due to the rewrite of hash indexing to allow more concurrency --- the locking algorithm is more complex than it used to be. I am surprised that the effect is so large, though. Could you make your test program available?

> The columns I changed would actually seem better suited to hash indexes.

Are they? How many distinct values are there in those columns? I suspect that your test may be stressing the case where only a few hash buckets are used and each bucket chain gets to be very long.

			regards, tom lane
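Tom's question about the data distribution can be checked with a quick query along these lines against the logs table (a simple sketch):

select count(distinct agentid) as agentid_values,
       count(distinct ownerid) as ownerid_values,
       count(distinct hostid)  as hostid_values
from logs;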
Tom Lane wrote:
> Sezai YILMAZ <sezai.yilmaz@pro-g.com.tr> writes:
> > I changed the three hash indexes to btree.
> > Performance roughly doubled (1905 rows/s with PostgreSQL 7.3.4).
> > Concurrent inserts now work.
>
> Concurrent inserts should work with hash indexes in 7.4, though not 7.3.

I noticed this. I do not get deadlocks with 7.4 on the schema with hash indexes. 7.4 solves that problem, but it is very slow.

> The slowdown you report is probably due to the rewrite of hash indexing
> to allow more concurrency --- the locking algorithm is more complex than
> it used to be. I am surprised that the effect is so large, though.
> Could you make your test program available?

The test program and the .sql script are attached.

Compile and link scenarios:

without transactions (where each insert is its own transaction)
$ gcc -o tester tester.c -lpq

with the default of 400 inserts per transaction block
$ gcc -DTRANSACTION -o tester tester.c -lpq

with 200 inserts per transaction block
$ gcc -DTRANSACTION -DINSERTPERTRANSACTION=200 -o tester tester.c -lpq

I do the concurrent tests by starting separate tester programs from different xterm windows.

> > The columns I changed would actually seem better suited to hash indexes.
>
> Are they? How many distinct values are there in those columns?
> I suspect that your test may be stressing the case where only a few hash
> buckets are used and each bucket chain gets to be very long.

The biggest one gets 200 distinct values; the others get 5 and 10. More information is in tester.c, where the INSERT query string is built.

Regards,

-sezai

------------------------------------------------------------
-- attached SQL script
create table logs (
    logid       serial primary key,
    ctime       integer not null,
    stime       integer not null,
    itime       integer not null,
    agentid     integer not null,
    subagentid  integer not null,
    ownerid     integer not null,
    hostid      integer not null,
    appname     varchar(64) default null,
    logbody     varchar(1024) not null
);

create index ctime_ndx on logs using btree (ctime);
create index stime_ndx on logs using btree (stime);
create index itime_ndx on logs using btree (itime);
create index agentid_ndx on logs using hash (agentid);
create index ownerid_ndx on logs using hash (ownerid);
create index hostid_ndx on logs using hash (hostid);
------------------------------------------------------------

/* attached test program: tester.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <libpq-fe.h>

#ifndef INSERTPERTRANSACTION
#define INSERTPERTRANSACTION 400
#endif

void
exit_nicely(PGconn *conn)
{
    PQfinish(conn);
    exit(1);
}

int
main(int argc, char **argv)
{
    char        query[2048];
    char       *pghost, *pgport, *pguser, *pgpass, *pgoptions, *pgtty;
    char       *dbName;
    int         i;
    PGconn     *conn;
    PGresult   *res;

    pghost = NULL;
    pgport = NULL;
    pgoptions = NULL;
    pgtty = NULL;
    pguser = "postgres";
    pgpass = NULL;
    dbName = "pgtest";

    conn = PQsetdbLogin(pghost, pgport, pgoptions, pgtty, dbName, pguser, pgpass);

    /*
     * check to see that the backend connection was successfully made
     */
    if (PQstatus(conn) == CONNECTION_BAD)
    {
        fprintf(stderr, "Connection to database '%s' failed.\n", dbName);
        fprintf(stderr, "%s", PQerrorMessage(conn));
        exit_nicely(conn);
    }

    /* Do 200000 inserts for this instance */
    for (i = 0; i < 200000; i++)
    {
#ifdef TRANSACTION
        /* First insert of a transaction block: BEGIN TRANSACTION */
        if (!(i % INSERTPERTRANSACTION))
        {
            /* start a transaction block */
            res = PQexec(conn, "BEGIN");
            if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
            {
                fprintf(stderr, "BEGIN command failed\n");
                PQclear(res);
                exit_nicely(conn);
            }

            /*
             * should PQclear PGresult whenever it is no longer needed to
             * avoid memory leaks
             */
            PQclear(res);
        }
#endif

        snprintf(query, 2048,
                 "INSERT INTO LOGS (CTIME, STIME, ITIME, AGENTID, SUBAGENTID, "
                 "OWNERID, HOSTID, APPNAME, LOGBODY) VALUES "
                 "('%d', '%d', '%d', '%d', '%d', '%d', '%d', '%s', '%s');",
                 (int) time(NULL),          /* simulate CTIME epoch */
                 (int) time(NULL) + 23,     /* simulate STIME epoch */
                 (int) time(NULL) + 45,     /* simulate ITIME epoch */
                 (int) time(NULL) % 5,      /* simulate 5 different AGENTID */
                 (int) time(NULL) % 15,     /* simulate 15 different SUBAGENTID */
                 (int) time(NULL) % 200,    /* simulate 200 different OWNERID */
                 (int) time(NULL) % 10,     /* simulate 10 different HOSTID */
                 argv[0],
                 "The dummy log for testing PostgreSQL 7.3.4 "
                 "speed under heavy conditions.");

        /* run the INSERT */
        res = PQexec(conn, query);
        if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "INSERT command failed\n");
            fprintf(stderr, "Backend Says:%s\n", PQresultErrorMessage(res));
            PQclear(res);
            exit_nicely(conn);
        }

        /*
         * should PQclear PGresult whenever it is no longer needed to avoid
         * memory leaks
         */
        PQclear(res);

#ifdef TRANSACTION
        /* The last insert in this transaction block: time to COMMIT */
        if ((i % INSERTPERTRANSACTION) == (INSERTPERTRANSACTION - 1))
        {
            /* commit the transaction block */
            res = PQexec(conn, "COMMIT");
            if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
            {
                fprintf(stderr, "COMMIT command failed\n");
                PQclear(res);
                exit_nicely(conn);
            }

            /*
             * should PQclear PGresult whenever it is no longer needed to
             * avoid memory leaks
             */
            PQclear(res);
        }
#endif

        if (!(i % 1000))
        {
            /* display time information for each 1000 inserts */
            fprintf(stderr, "%8d\t %d\n", i, (int) time(NULL));
        }
    }

#ifdef TRANSACTION
    /* Commit any remaining inserts from a truncated transaction block */
    if ((i % INSERTPERTRANSACTION) != 0)
    {
        printf("COMMIT TRANSACTION\n");

        /* commit the last (partial) transaction block */
        res = PQexec(conn, "COMMIT");
        if (!res || PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "COMMIT command failed\n");
            PQclear(res);
            exit_nicely(conn);
        }

        /*
         * should PQclear PGresult whenever it is no longer needed to avoid
         * memory leaks
         */
        PQclear(res);
    }
#endif

    exit_nicely(conn);
}
On Saturday 28 February 2004 13:59, Sezai YILMAZ wrote:
> Tom Lane wrote:
> > Concurrent inserts should work with hash indexes in 7.4, though not 7.3.
>
> I noticed this. I do not get deadlocks with 7.4 on the schema with hash
> indexes. 7.4 solves that problem, but it is very slow.
>
> > The slowdown you report is probably due to the rewrite of hash indexing
> > to allow more concurrency --- the locking algorithm is more complex than
> > it used to be. I am surprised that the effect is so large, though.
> > Could you make your test program available?
>
> The test program and the .sql script are attached.
> [...]
> I do the concurrent tests by starting separate tester programs from
> different xterm windows.

Some tests on CVS head, in case somebody finds the data interesting. This is a single-IDE-disk system running Linux 2.6.2, with 512MB RAM and a 2.66GHz P-IV. The file system is reiserfs. I pulled CVS head a couple of days back.

With everything at the default except shared_buffers=100 and effective_cache_size=25000, I got 1980 inserts/sec in a single run. With checkpoint_segments=10, I got 1923 inserts/sec. With two concurrent processes and 10 checkpoint segments, I got 1673 inserts/sec.

I noted in vmstat that the I/O wasn't being pushed very hard. Blocks out varied between roughly 1000 and 5000 per second, though occasionally they spiked to 18000 blocks. I guess that was a checkpoint going on.

Also, I could not find sort_mem in postgresql.conf. Is work_mem the new name for it? I recall the discussion about sanitizing the names, but not its outcome.

Shridhar
Shridhar Daithankar <shridhar@frodo.hserus.net> writes:
> With everything at the default except shared_buffers=100 and
> effective_cache_size=25000, I got 1980 inserts/sec in a single run.

100?

> Also, I could not find sort_mem in postgresql.conf. Is work_mem the new
> name for it?

Yeah.

			regards, tom lane
On Saturday 28 February 2004 21:27, Tom Lane wrote:
> Shridhar Daithankar <shridhar@frodo.hserus.net> writes:
> > With everything at the default except shared_buffers=100 and
> > effective_cache_size=25000,
>
> 100?

1000. That was a typo.

Shridhar
Sezai YILMAZ <sezai.yilmaz@pro-g.com.tr> writes:
> Tom Lane wrote:
> > The slowdown you report is probably due to the rewrite of hash indexing
> > to allow more concurrency --- the locking algorithm is more complex than
> > it used to be. I am surprised that the effect is so large, though.
> > Could you make your test program available?
>
> The test program and the .sql script are attached.

I did some profiling and found that essentially all of the slowdown as the table gets larger is associated with searching the increasingly long hash chains to find free space for new index tuples. The 7.3-to-7.4 slowdown you see must be due to some marginally slower code in ReadBuffer. Given the overall speedup at the more normal end of the range, I'm not too concerned about that.

What this test basically shows is that a hash index is a loser for indexing a column with only five distinct values. Actually, any index structure is a loser with only five distinct values; there is no case in which it wouldn't be faster to just seqscan the table instead of using the index. If the test accurately models your expected data distribution, then you do not need the agentid and hostid indexes and should get rid of them entirely. The index on ownerid (200 distinct values) is the only one that is even marginally useful.

			regards, tom lane
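Applied to the test schema, that advice amounts to something like the following (a sketch; ownerid_ndx is kept, ideally as a btree per the earlier discussion):

drop index agentid_ndx;   -- only ~5 distinct values; a seqscan beats the index
drop index hostid_ndx;    -- only ~10 distinct values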
Tom Lane wrote:
> What this test basically shows is that a hash index is a loser for
> indexing a column with only five distinct values. Actually, any index
> structure is a loser with only five distinct values; there is no case in
> which it wouldn't be faster to just seqscan the table instead of using
> the index. If the test accurately models your expected data
> distribution, then you do not need the agentid and hostid indexes and
> should get rid of them entirely. The index on ownerid (200 distinct
> values) is the only one that is even marginally useful.

This brings up whether we should have a "hint" mode that suggests removing indexes on columns with only a few distinct values.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073