Array types and loading - Mailing list pgsql-performance

From Aaron Birkland
Subject Array types and loading
Date
Msg-id 19ab0ccd040818123936c1cdd3@mail.gmail.com
Whole thread Raw
Responses Re: Array types and loading
List pgsql-performance
Hi,

I noticed an interesting phenomenon when loding (COPY) some tables
from a file.  For some reason, one load was taking longer than I
assumed it would.  As it turns out, one of the columns was an array
containing elements that were of a user defined type.  Using strace
(on linux) and truss (on solaris), most of the time was spent on
stat64() calls on the shared library that contained the in/out
functions for the user defined type.

In a nutshell, it looks like whenever COPY is invoked, and when a user
defined type is used in an array, then stat64() will be called for
each row accessed on the shared library relevant for the user defined
type.

As an example, I created a simple unsigned 2 byte integer type (called
uint2) as follows:
...
typedef uint16 uint2;

PG_FUNCTION_INFO_V1(uint2_in);
Datum
uint2_in (PG_FUNCTION_ARGS)
{
        int val;
        val = atoi(PG_GETARG_CSTRING(0));
        if (val > 65535 || val < 0)
                elog (ERROR, "Value %i is not in range for uint2", val);
        PG_RETURN_INT16((uint2) val);
}

PG_FUNCTION_INFO_V1(uint2_out);
Datum
uint2_out (PG_FUNCTION_ARGS)
{
        uint2 e = (uint2) PG_GETARG_INT16(0);
        char * ret_string = (char *) palloc(9);

        snprintf(ret_string, 9, "%i", (int) e);
        PG_RETURN_CSTRING(ret_string);
}
....
I compiled this, making shared library uint2.so, and in psql created the type.

I then created two test tables:

CREATE TABLE scratch0 (value uint2);
INSERT INTO scratch0 values('1');
INSERT INTO scratch0 values('2');
INSERT INTO scratch0 values('3');
INSERT INTO scratch0 values('4');

CREATE TABLE scratch1 (value uint2[]);
INSERT INTO scratch1 values('{1,2,3}');
INSERT INTO scratch1 values('{10,20,30}');
INSERT INTO scratch1 values('{100,200,300}');
INSERT INTO scratch1 values('{1000,2000,300

Now, I ran strace (and truss) tattached to the postmaster process
looking for stat64, and here's what I found:

------
SELECT * from scratch0 LIMIT 1;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0

SELECT * from scratch0 LIMIT 2;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0

SELECT * from scratch0 LIMIT 3;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
------
.
.so each SELECT stats the uint2.so file once, regardless of the number
of fows.  fair enough.  now for the array case:

------
SELECT * from scratch1 LIMIT 1;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0

SELECT * from scratch1 LIMIT 2;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0

SELECT * from scratch1 LIMIT 3;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
------

..so for *every row* involving an array, a stat64 is called.  This is
true for COPY too.. we were loading 2.5 billion rows, so that's quite
a bit of overhead!  Now, we used strace (on linux) and truss (on
solaris) to see how much this was actually affecting us.:

Output from truss COPYING large file with millions of {1,2,3} unint2 arrays:
syscall               seconds   calls  errors
read                     .055     527
write                   1.507   11358
close                    .000       6
time                     .000       3
getpid                   .000       1
kill                     .000       1
semop                    .002     116
fdsync                   .000       3
llseek                   .081    5042
stat64                 24.205 1078764 539382
open64                   .000       6
                     --------  ------   ----
sys totals:            25.853 1095827 539382
usr time:              21.567
elapsed:              113.310

so in other words, almost 25% of the total time and 50+% of the
execution time time was wasted on stat64.  on Linux, the proportion of
time relative to other calls in strace was similar.

So.. Is this a bug or are the stat64 calls necessary?  I doubt arrays
of user-defined types (in C) are common, and I couldn't find anything
that looked relevent in the lists.

pgsql-performance by date:

Previous
From: "gnari"
Date:
Subject: Re: I could not get postgres to utilizy indexes
Next
From: Tom Lane
Date:
Subject: Re: Array types and loading