Thread: Bad performance with large objects

Bad performance with large objects

From: tomas@nocrew.org (Tomas Skäre)
Hi,

I'm using psqlodbc (latest from CVS) together with PostgreSQL 7.4.3 in
a project. In it, I have a table that uses large objects (type
"lo"). The functions for lo were imported from the contrib/lo script.

The table looks like this:

  Column   |       Type        | Modifiers
-----------+-------------------+-----------
 timestamp | bigint            | not null
 jobid     | bigint            | not null
 objectid  | bigint            | not null
 class     | integer           | not null
 field     | character varying | not null
 data      | lo                |

The idea is that each object consists of a number of fields with
corresponding lo data. The program is responsible for assembling the
fields into complete objects. There is also a timestamp, so that I can
keep a history of changes to objects.

On top of this, I have a statement that picks out the row with the
latest timestamp for each field of each object:

SELECT objectid,class,field,data
FROM cjm_object t1
WHERE EXISTS (SELECT MAX(timestamp) FROM cjm_object
              WHERE objectid=t1.objectid AND class=t1.class
              AND field=t1.field
              GROUP BY objectid,class,field
              HAVING t1.timestamp=MAX(timestamp))
AND data IS NOT NULL
ORDER BY objectid,class,field;

So far so good; this works well. With about 2300 rows (about 2200
active and 100 old changes), EXPLAIN ANALYZE in psql says this query
takes about 300 ms on my machine. I've also made a view that picks out
the lo data as bytea (from pg_largeobject) instead, and that version
takes about 700 ms.

Now, when I run this query from my program through ODBC, using
SQLFetch (or SQLFetchScroll) to get the data some 100 rows at a time,
the whole thing takes about 6 seconds instead. I've tried different
rowset sizes, only using FETCH_NEXT, and optimizing for sequential
fetching, but I can't get the whole operation below about 5 seconds.

So why does it take so much longer from ODBC than from psql (even if
I pick out the bytea data)? The program and PostgreSQL both run on the
same machine, so there is no network delay. I've measured that it's
not my program that is slow; it's the ODBC calls.

I'd be very thankful for any help in speeding this up.


Greetings,

Tomas


Re: Bad performance with large objects

From: tomas@nocrew.org (Tomas Skäre)
tomas@nocrew.org (Tomas Skäre) writes:

> I'm using psqlodbc (latest from CVS) together with PostgreSQL 7.4.3 in
> a project. In it, I have a table that uses large objects (type
> "lo"). The functions for lo were imported from the contrib/lo script.
>
> The table looks like this:
>
>  timestamp | bigint            | not null
>  jobid     | bigint            | not null
>  objectid  | bigint            | not null
>  class     | integer           | not null
>  field     | character varying | not null
>  data      | lo                |

...

> So why does it take so much longer from ODBC than from psql (even if
> I pick out the bytea data)? The program and PostgreSQL both run on
> the same machine, so there is no network delay. I've measured that
> it's not my program that is slow; it's the ODBC calls.

An addition to this:

If I change the table from lo to bytea, the program is able to read
the data much, much faster. But psqlodbc doesn't seem to like bytea
when writing to the database, so only the fetching works with this
change.
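
For reference, the write path that fails looks roughly like this
(again a simplified sketch: error handling is left out, the non-data
columns get placeholder literal values, and I'm assuming the bytea
column is bound as SQL_LONGVARBINARY):

#include <sql.h>
#include <sqlext.h>

/* Insert one field value into the (now bytea) data column.  The
 * literal values for timestamp, jobid, class and field are just
 * placeholders. */
static void insert_field(SQLHSTMT stmt, long long objectid,
                         const unsigned char *buf, SQLLEN len)
{
    SQLBIGINT oid = objectid;
    SQLLEN    data_ind = len;

    SQLPrepare(stmt, (SQLCHAR *)
        "INSERT INTO cjm_object "
        "(timestamp, jobid, objectid, class, field, data) "
        "VALUES (1, 1, ?, 1, 'somefield', ?)", SQL_NTS);

    SQLBindParameter(stmt, 1, SQL_PARAM_INPUT, SQL_C_SBIGINT, SQL_BIGINT,
                     0, 0, &oid, 0, NULL);

    /* Binding the binary value; this is the step psqlodbc doesn't
     * seem to like when the column type is bytea. */
    SQLBindParameter(stmt, 2, SQL_PARAM_INPUT, SQL_C_BINARY,
                     SQL_LONGVARBINARY, (SQLULEN)len, 0,
                     (SQLPOINTER)buf, len, &data_ind);

    SQLExecute(stmt);
}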


Tomas