Thread: Runtime dependency from size of a bytea field

Runtime dependency from size of a bytea field

From
"Sander, Ingo (NSN - DE/Munich)"
Date:

Hi,

We have done the following test with PostgreSQL 9.0.

create table bytea_demo ( index int, part1 bytea)");

Then we have instantiated a loop (1000) with the following action:

insert into bytea_demo ( index, part1, ) values ('%d', '%s');
       ", i, entry);

1a) In a first measurement part is supported with a bytea area (entry) of 4000 bytes (4000 characters)
1b) In a second run part is supported with a bytea area (entry) of 5000 bytes (5000 characters).

Result: The runtime of case 1a) is ~ 3 sec, however for case 1b) the runtime is ~ 43 sec. Why here we have such a large difference in runtime.

BR
Ingo Sander

Best Regards/mfG
Ingo Sander
=========================================================
Nokia Siemens Networks GmbH &Co. KG
NWS EP CP SVSS Platform Tech Support DE
St.-Martin-Str. 76
D-81541 München
(Tel.:  +49-89-515938390
+ingo.sander@nsn.com

Nokia Siemens Networks GmbH & Co. KG
Sitz der Gesellschaft: München / Registered office: Munich

Registergericht: München / Commercial registry: Munich, HRA 88537
WEEE-Reg.-Nr.: DE 52984304

Persönlich haftende Gesellschafterin / General Partner: Nokia Siemens Networks Management GmbH
Geschäftsleitung / Board of Directors: Lydia Sommer, Olaf Horsthemke
Vorsitzender des Aufsichtsrats / Chairman of supervisory board: Herbert Merz
Sitz der Gesellschaft: München / Registered office: Munich
Registergericht: München / Commercial registry: Munich, HRB 163416

Re: Runtime dependency from size of a bytea field

From
Merlin Moncure
Date:
On Tue, Oct 5, 2010 at 3:23 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> Hi,
>
> We have done the following test with PostgreSQL 9.0.
>
> create table bytea_demo ( index int, part1 bytea)");
>
> Then we have instantiated a loop (1000) with the following action:
>
> insert into bytea_demo ( index, part1, ) values ('%d', '%s');
>        ", i, entry);
>
> 1a) In a first measurement part is supported with a bytea area (entry) of
> 4000 bytes (4000 characters)
> 1b) In a second run part is supported with a bytea area (entry) of 5000
> bytes (5000 characters).
>
> Result: The runtime of case 1a) is ~ 3 sec, however for case 1b) the runtime
> is ~ 43 sec. Why here we have such a large difference in runtime.

Probably you are hitting toast threshold and running into compression.
 compression you can disable, but toast you cannot (short of
recompiling with higher blocksz).

merlin

Re: Runtime dependency from size of a bytea field

From
Craig Ringer
Date:
On 10/06/2010 12:11 AM, Merlin Moncure wrote:

> Probably you are hitting toast threshold and running into compression.
>   compression you can disable, but toast you cannot (short of
> recompiling with higher blocksz).

For the OP's reference:

http://www.postgresql.org/docs/current/static/storage-toast.html
http://www.postgresql.org/docs/current/static/sql-altertable.html

While (I think) PLAIN storage could be used, the inability to span rows
over blocks means you would't get over 8k anyway.

--
Craig Ringer

Re: Runtime dependency from size of a bytea field

From
"Sander, Ingo (NSN - DE/Munich)"
Date:
Changing of the storage method ( alter table bytea_demo Alter part1 Set
storage EXTERNAL)
or the increasing of the BLOCK_SIZE (new compilation of the code with
--with-blocksize=32) change the behaviour.

Ingo Sander


-----Original Message-----
From: ext Craig Ringer [mailto:craig@postnewspapers.com.au]
Sent: Wednesday, October 06, 2010 3:24 AM
To: Merlin Moncure
Cc: Sander, Ingo (NSN - DE/Munich); pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Runtime dependency from size of a bytea field

On 10/06/2010 12:11 AM, Merlin Moncure wrote:

> Probably you are hitting toast threshold and running into compression.
>   compression you can disable, but toast you cannot (short of
> recompiling with higher blocksz).

For the OP's reference:

http://www.postgresql.org/docs/current/static/storage-toast.html
http://www.postgresql.org/docs/current/static/sql-altertable.html

While (I think) PLAIN storage could be used, the inability to span rows
over blocks means you would't get over 8k anyway.

--
Craig Ringer

Re: Runtime dependency from size of a bytea field

From
Merlin Moncure
Date:
On Wed, Oct 6, 2010 at 1:39 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> Changing of the storage method ( alter table bytea_demo Alter part1 Set
> storage EXTERNAL)
> or the increasing of the BLOCK_SIZE (new compilation of the code with
> --with-blocksize=32) change the behaviour.

yeah -- however changing block size is major surgery and is going to
have other effects (some of them negative) besides raising toast
threshold.  I would start with disabling compression and see where you
stood on performance terms.

merlin

Re: Runtime dependency from size of a bytea field

From
"Sander, Ingo (NSN - DE/Munich)"
Date:
Hi,

I thougth I have disabled compressing by setting alter command? Or is
there another command?

BR
Ingo

-----Original Message-----
From: ext Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Wednesday, October 06, 2010 2:51 PM
To: Sander, Ingo (NSN - DE/Munich)
Cc: ext Craig Ringer; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Runtime dependency from size of a bytea field

On Wed, Oct 6, 2010 at 1:39 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> Changing of the storage method ( alter table bytea_demo Alter part1
Set
> storage EXTERNAL)
> or the increasing of the BLOCK_SIZE (new compilation of the code with
> --with-blocksize=32) change the behaviour.

yeah -- however changing block size is major surgery and is going to
have other effects (some of them negative) besides raising toast
threshold.  I would start with disabling compression and see where you
stood on performance terms.

merlin

Re: Runtime dependency from size of a bytea field

From
Merlin Moncure
Date:
On Wed, Oct 6, 2010 at 10:22 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> Hi,
>
> I thougth I have disabled compressing by setting alter command? Or is
> there another command?

yes.  have you re-run the test? got any performance results?

merlin

Re: Runtime dependency from size of a bytea field

From
"Sander, Ingo (NSN - DE/Munich)"
Date:
As written before I have rerun the test a) without compression and b)
with enlarged BLOCK_SIZE. Result was the same.

BR
Ingo

-----Original Message-----
From: ext Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Wednesday, October 06, 2010 4:50 PM
To: Sander, Ingo (NSN - DE/Munich)
Cc: ext Craig Ringer; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Runtime dependency from size of a bytea field

On Wed, Oct 6, 2010 at 10:22 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> Hi,
>
> I thougth I have disabled compressing by setting alter command? Or is
> there another command?

yes.  have you re-run the test? got any performance results?

merlin

Re: Runtime dependency from size of a bytea field

From
Merlin Moncure
Date:
On Thu, Oct 7, 2010 at 12:11 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> As written before I have rerun the test a) without compression and b)
> with enlarged BLOCK_SIZE. Result was the same.

Using libpqtypes (source code follows after sig), stock postgres,
stock table, I was not able to confirm your results.  4000 bytea
blocks, loops of 1000 I was able to send in about 600ms.  50000 byte
blocks I was able to send in around 2 seconds on workstation class
hardware -- maybe something else is going on?.

merlin

#include "libpq-fe.h"
#include "libpqtypes.h"

#define DATASZ 50000

int main()
{
  int i;
  PGbytea b;
  char data[DATASZ];
  PGconn *c = PQconnectdb("host=localhost dbname=postgres");
  if(PQstatus(c) != CONNECTION_OK)
  {
    printf("bad connection");
    return -1;
  }

  PQtypesRegister(c);

  b.data = data;
  b.len = DATASZ;

  for(i=0; i<1000; i++)
  {
    PGresult *res = PQexecf(c, "insert into bytea_demo(index, part1)
values (%int4, %bytea)", i, &b);

    if(!res)
    {
      printf("got %s\n", PQgeterror());
      return -1;
    }
    PQclear(res);
  }

  PQfinish(c);
}

Re: Runtime dependency from size of a bytea field

From
Merlin Moncure
Date:
On Thu, Oct 7, 2010 at 10:49 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 12:11 AM, Sander, Ingo (NSN - DE/Munich)
> <ingo.sander@nsn.com> wrote:
>> As written before I have rerun the test a) without compression and b)
>> with enlarged BLOCK_SIZE. Result was the same.
>
> Using libpqtypes (source code follows after sig), stock postgres,
> stock table, I was not able to confirm your results.  4000 bytea
> blocks, loops of 1000 I was able to send in about 600ms.  50000 byte
> blocks I was able to send in around 2 seconds on workstation class
> hardware -- maybe something else is going on?.

I re-ran the test, initializing the bytea data to random values (i
wondered if uninitialized data getting awesome compression was skewing
the results).

This slowed down 50000 bytea case to around 3.5-4 seconds.  That's
12-15mb/sec from single thread which is IMNSHO not too shabby.   If
your data compresses decently and you hack a good bang/buck
compression alg into the backend like lzo you can easily double that
number.

merlin

Re: Runtime dependency from size of a bytea field

From
"Sander, Ingo (NSN - DE/Munich)"
Date:
Hi,

The difference to my test is that we use the ODBC interface in our C program. Could it be that the difference in the
runtimesis caused by the ODBC? 

BR
Ingo

-----Original Message-----
From: ext Merlin Moncure [mailto:mmoncure@gmail.com]
Sent: Thursday, October 07, 2010 7:17 PM
To: Sander, Ingo (NSN - DE/Munich)
Cc: ext Craig Ringer; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Runtime dependency from size of a bytea field

On Thu, Oct 7, 2010 at 10:49 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Thu, Oct 7, 2010 at 12:11 AM, Sander, Ingo (NSN - DE/Munich)
> <ingo.sander@nsn.com> wrote:
>> As written before I have rerun the test a) without compression and b)
>> with enlarged BLOCK_SIZE. Result was the same.
>
> Using libpqtypes (source code follows after sig), stock postgres,
> stock table, I was not able to confirm your results.  4000 bytea
> blocks, loops of 1000 I was able to send in about 600ms.  50000 byte
> blocks I was able to send in around 2 seconds on workstation class
> hardware -- maybe something else is going on?.

I re-ran the test, initializing the bytea data to random values (i
wondered if uninitialized data getting awesome compression was skewing
the results).

This slowed down 50000 bytea case to around 3.5-4 seconds.  That's
12-15mb/sec from single thread which is IMNSHO not too shabby.   If
your data compresses decently and you hack a good bang/buck
compression alg into the backend like lzo you can easily double that
number.

merlin

Re: Runtime dependency from size of a bytea field

From
Robert Haas
Date:
On Fri, Oct 8, 2010 at 12:53 AM, Sander, Ingo (NSN - DE/Munich)
<ingo.sander@nsn.com> wrote:
> The difference to my test is that we use the ODBC interface in our C program. Could it be that the difference in the
runtimesis caused by the ODBC? 

I've heard tell that ODBC is substantially slower than a native libpq
connection, but I don't know that for a fact, not being an ODBC user.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company