Hi,
I noticed an interesting phenomenon when loding (COPY) some tables
from a file. For some reason, one load was taking longer than I
assumed it would. As it turns out, one of the columns was an array
containing elements that were of a user defined type. Using strace
(on linux) and truss (on solaris), most of the time was spent on
stat64() calls on the shared library that contained the in/out
functions for the user defined type.
In a nutshell, it looks like whenever COPY is invoked, and when a user
defined type is used in an array, then stat64() will be called for
each row accessed on the shared library relevant for the user defined
type.
As an example, I created a simple unsigned 2 byte integer type (called
uint2) as follows:
...
typedef uint16 uint2;
PG_FUNCTION_INFO_V1(uint2_in);
Datum
uint2_in (PG_FUNCTION_ARGS)
{
int val;
val = atoi(PG_GETARG_CSTRING(0));
if (val > 65535 || val < 0)
elog (ERROR, "Value %i is not in range for uint2", val);
PG_RETURN_INT16((uint2) val);
}
PG_FUNCTION_INFO_V1(uint2_out);
Datum
uint2_out (PG_FUNCTION_ARGS)
{
uint2 e = (uint2) PG_GETARG_INT16(0);
char * ret_string = (char *) palloc(9);
snprintf(ret_string, 9, "%i", (int) e);
PG_RETURN_CSTRING(ret_string);
}
....
I compiled this, making shared library uint2.so, and in psql created the type.
I then created two test tables:
CREATE TABLE scratch0 (value uint2);
INSERT INTO scratch0 values('1');
INSERT INTO scratch0 values('2');
INSERT INTO scratch0 values('3');
INSERT INTO scratch0 values('4');
CREATE TABLE scratch1 (value uint2[]);
INSERT INTO scratch1 values('{1,2,3}');
INSERT INTO scratch1 values('{10,20,30}');
INSERT INTO scratch1 values('{100,200,300}');
INSERT INTO scratch1 values('{1000,2000,300
Now, I ran strace (and truss) tattached to the postmaster process
looking for stat64, and here's what I found:
------
SELECT * from scratch0 LIMIT 1;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
SELECT * from scratch0 LIMIT 2;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
SELECT * from scratch0 LIMIT 3;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
------
.
.so each SELECT stats the uint2.so file once, regardless of the number
of fows. fair enough. now for the array case:
------
SELECT * from scratch1 LIMIT 1;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
SELECT * from scratch1 LIMIT 2;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
SELECT * from scratch1 LIMIT 3;
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
stat64("/pg/lib/postgresql/uint2", 0xFFBFE808) Err#2 ENOENT
stat64("/pg/lib/postgresql/uint2.so", 0xFFBFE808) = 0
------
..so for *every row* involving an array, a stat64 is called. This is
true for COPY too.. we were loading 2.5 billion rows, so that's quite
a bit of overhead! Now, we used strace (on linux) and truss (on
solaris) to see how much this was actually affecting us.:
Output from truss COPYING large file with millions of {1,2,3} unint2 arrays:
syscall seconds calls errors
read .055 527
write 1.507 11358
close .000 6
time .000 3
getpid .000 1
kill .000 1
semop .002 116
fdsync .000 3
llseek .081 5042
stat64 24.205 1078764 539382
open64 .000 6
-------- ------ ----
sys totals: 25.853 1095827 539382
usr time: 21.567
elapsed: 113.310
so in other words, almost 25% of the total time and 50+% of the
execution time time was wasted on stat64. on Linux, the proportion of
time relative to other calls in strace was similar.
So.. Is this a bug or are the stat64 calls necessary? I doubt arrays
of user-defined types (in C) are common, and I couldn't find anything
that looked relevent in the lists.