Locked stdio has noticeable performance cost for COPY - Mailing list pgsql-hackers

From Andres Freund
Subject Locked stdio has noticeable performance cost for COPY
Date
Msg-id 20141102003406.GS17790@awork2.anarazel.de
List pgsql-hackers
Hi,

Every now and then I've noticed that glibc's stdio functions show up
prominently in profiles. That seems somewhat odd, given that they should
pretty much delegate all work to the kernel and memcpy(). By accident I
looked at the disassembly and oops: a large part is due to
locking. In ferror()'s case especially, that's quite noticeable...

Locked it's:

int
_IO_ferror (fp)
     _IO_FILE* fp;
{
  int result;
  CHECK_FILE (fp, EOF);
  _IO_flockfile (fp);
  result = _IO_ferror_unlocked (fp);
  _IO_funlockfile (fp);
  return result;
}

Unlocked it's just:

#define _IO_ferror_unlocked(__fp) (((__fp)->_flags & _IO_ERR_SEEN) != 0)

Fun...
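
To get a rough feel for the per-call cost of that lock, a small timing helper along the following lines could be used. This is only a sketch, assuming glibc (for ferror_unlocked()) and a POSIX clock_gettime(); the name time_ferror() is made up for illustration, and the numbers it produces will vary with machine, glibc version, and compiler flags.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE			/* exposes glibc's *_unlocked declarations */
#endif
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: time `iters` calls of ferror() (unlocked == 0) or
 * ferror_unlocked() (unlocked != 0) on fp and return the elapsed seconds.
 */
static double
time_ferror(FILE *fp, long iters, int unlocked)
{
	struct timespec start, end;
	volatile int sink = 0;		/* keeps the calls from being optimized away */
	long		i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	if (unlocked)
	{
		for (i = 0; i < iters; i++)
			sink += ferror_unlocked(fp);
	}
	else
	{
		for (i = 0; i < iters; i++)
			sink += ferror(fp);
	}
	clock_gettime(CLOCK_MONOTONIC, &end);
	(void) sink;

	return (double) (end.tv_sec - start.tv_sec) +
		(double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_ferror(stdout, 100000000L, 0) and then time_ferror(stdout, 100000000L, 1) and comparing the two results gives an isolated view of the locking overhead, separate from everything else COPY does.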

glibc also provides fread_unlocked(), fwrite_unlocked() and
ferror_unlocked(). Some quick performance results (fastest of three
runs) show:

before:
postgres[1]=# COPY notsolargedata (data) TO '/dev/null' WITH BINARY;
COPY 10000000
Time: 2811.623 ms
after replacing ferror/fread/fwrite in copy.c:
postgres[1]=# COPY notsolargedata (data) TO '/dev/null' WITH BINARY;
COPY 10000000
Time: 2593.969 ms

That's roughly 8% less runtime, which is not nothing.

As we really don't need the locking, especially not in COPY, I guess we
should do something about it. I'm unsure what exactly. Getting rid of stdio
seems more work than it's worth. So maybe just add a configure check for
fread_unlocked/fwrite_unlocked/ferror_unlocked and use them in copy.c?
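
One way the configure-check idea might look on the C side is sketched below. It is an illustration, not the actual patch. It assumes configure defines HAVE_FREAD_UNLOCKED, HAVE_FWRITE_UNLOCKED and HAVE_FERROR_UNLOCKED (the names a check such as AC_CHECK_FUNCS would produce), and the copy_fread/copy_fwrite/copy_ferror macros and the copy_send_bytes() helper are hypothetical names that do not exist in copy.c.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE			/* exposes glibc's *_unlocked declarations */
#endif
#include <stdio.h>

/*
 * Assumption: configure defines the HAVE_* symbols when the unlocked
 * variants are available. Otherwise fall back to the locked functions.
 */
#if defined(HAVE_FREAD_UNLOCKED) && defined(HAVE_FWRITE_UNLOCKED) && \
	defined(HAVE_FERROR_UNLOCKED)
#define copy_fread(ptr, size, n, fp)	fread_unlocked(ptr, size, n, fp)
#define copy_fwrite(ptr, size, n, fp)	fwrite_unlocked(ptr, size, n, fp)
#define copy_ferror(fp)					ferror_unlocked(fp)
#else
#define copy_fread(ptr, size, n, fp)	fread(ptr, size, n, fp)
#define copy_fwrite(ptr, size, n, fp)	fwrite(ptr, size, n, fp)
#define copy_ferror(fp)					ferror(fp)
#endif

/*
 * Hypothetical helper: write len bytes to fp, returning 0 on success and
 * -1 on a short write or stream error.
 */
static int
copy_send_bytes(FILE *fp, const void *buf, size_t len)
{
	if (copy_fwrite(buf, 1, len, fp) != len || copy_ferror(fp))
		return -1;
	return 0;
}
```

The unlocked variants are only safe when no other thread touches the same FILE concurrently, so a wrapper like this is appropriate only for streams that a single thread owns.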

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


