One thing I've noticed showing up in profiles of a pgbench -S test is
the generation of the command completion tag.
At the moment, this is done by a snprintf() call, namely:

    snprintf(completionTag, COMPLETION_TAG_BUFSIZE,
             tag == CMDTAG_INSERT ?
             "%s 0 " UINT64_FORMAT : "%s " UINT64_FORMAT,
             tagname, qc->nprocessed);

Ever since aa2387e2f, and the relevant discussion in [1], it's been
clear that the sprintf() functions are not the fastest. I think
there's just some unavoidable overhead to parsing the format string
and processing the variadic arguments.
The generation of the completion tag is not hugely dominant in the
profiles, but it does appear:
0.36% postgres [.] dopr.constprop.0
The attached patch does a few things to make the generation of
completion tags faster:
1. Store the tag length in struct CommandTagBehavior so that we can
memcpy() a fixed length rather than having to copy byte-by-byte
looking for the \0.
2. Use pg_ulltoa_n() to write the number of rows affected by the command.
3. Have the function that builds the tag return its length, to save
having to do a strlen() before writing the tag in pq_putmessage().
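
To illustrate the three points above, here is a minimal standalone sketch, not the code from the attached patch. The names sketch_ulltoa_n() and sketch_build_tag() are hypothetical; sketch_ulltoa_n() only stands in for pg_ulltoa_n() and is assumed to write the digits without a trailing NUL and return how many it wrote. The caller is assumed to pass the tag name's length (as it would be stored in CommandTagBehavior) and a buffer large enough for the result.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_ulltoa_n(): write the decimal digits of
 * "value" to "dst" with no trailing NUL and return the number of digits.
 */
static size_t
sketch_ulltoa_n(uint64_t value, char *dst)
{
	char		tmp[20];		/* UINT64_MAX has 20 decimal digits */
	size_t		n = 0;

	/* Produce the digits least-significant first. */
	do
	{
		tmp[n++] = (char) ('0' + value % 10);
		value /= 10;
	} while (value != 0);

	/* Copy them out in the right order. */
	for (size_t i = 0; i < n; i++)
		dst[i] = tmp[n - 1 - i];

	return n;
}

/*
 * Build "<name> <nprocessed>", or "<name> 0 <nprocessed>" for INSERT-style
 * tags, and return the length excluding the trailing NUL.  "buf" must have
 * room for namelen + 24 bytes (" 0 ", 20 digits and the NUL).
 */
static size_t
sketch_build_tag(char *buf, const char *name, size_t namelen,
				 int is_insert, uint64_t nprocessed)
{
	char	   *p = buf;

	/* Known length, so no byte-by-byte search for the terminator. */
	memcpy(p, name, namelen);
	p += namelen;
	*p++ = ' ';

	if (is_insert)
	{
		*p++ = '0';
		*p++ = ' ';
	}

	p += sketch_ulltoa_n(nprocessed, p);
	*p = '\0';

	/* The caller gets the length for free and can skip strlen(). */
	return (size_t) (p - buf);
}
```

With a 64-byte buffer, sketch_build_tag(buf, "INSERT", 6, 1, 42) leaves "INSERT 0 42" in buf and returns 11, which the caller could hand straight to the code that sends the message.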
It's difficult to measure the performance of something that takes
0.36% of execution. I have previously seen the tag generation take
over 1% of execution time.
One thing that's changed in the patch vs master is how an overlong
tag is handled. On master, if snprintf's buffer had for some reason
not been long enough to store the entire completion tag, the tag would
have been truncated, perhaps sending a truncated version of the row
count to the client. For this to happen, we'd have to have some
excessively long command name. Since
these cannot be added by users, I've opted to just add an Assert that
will trigger if we're ever asked to write a command tag that won't
fit. Assuming we test our new excessively-long-named-command, we'll
trigger that Assert and realise long before it's a problem.
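
As a rough sketch of the kind of bound such an Assert could check (the exact condition in the patch may differ), the worst case is the tag name plus " 0 ", plus the 20 digits of the largest 64-bit value, plus the trailing NUL. The helper names and the buffer size of 64 below are assumptions made for illustration only.

```c
#include <stddef.h>

#define SKETCH_TAG_BUFSIZE		64	/* assumed completion tag buffer size */
#define SKETCH_MAX_U64_DIGITS	20	/* digits in 18446744073709551615 */

/* Worst-case bytes needed for a tag whose name is "namelen" bytes long. */
static size_t
sketch_max_tag_size(size_t namelen)
{
	/* name + " 0 " + row count digits + trailing NUL */
	return namelen + 3 + SKETCH_MAX_U64_DIGITS + 1;
}

/* Returns 1 if a tag with this name length always fits, otherwise 0. */
static int
sketch_tag_fits(size_t namelen)
{
	return sketch_max_tag_size(namelen) <= SKETCH_TAG_BUFSIZE;
}
```

Under these assumptions a command name of up to 40 bytes always fits, so only a new, unusually long command name could trip the check.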
Both Andres and I seem to have independently written almost exactly
the same patch for this. The attached is mine but with the
GetCommandTagNameAndLen() function from his. I'd written
GetCommandTagLen() for mine, which required a couple of function calls
instead of Andres' single call to get both the name and the length.
Does anyone object to this?
David
[1] https://www.postgresql.org/message-id/CAKJS1f8oeW8ZEKqD4X3e+TFwZt+MWV6O-TF8MBpdO4XNNarQvA@mail.gmail.com