Re: Make printtup a bit faster - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Make printtup a bit faster
Msg-id 87bk1aj2go.fsf@163.com
In response to Make printtup a bit faster  (Andy Fan <zhihuifan1213@163.com>)
Responses Re: Make printtup a bit faster
List pgsql-hackers
David Rowley <dgrowleyml@gmail.com> writes:

Hello David,

>> My high level proposal is define a type specific print function like:
>>
>> oidprint(Datum datum, StringInfo buf)
>> textprint(Datum datum, StringInfo buf)
>
> I think what we should do instead is make the output functions take a
> StringInfo and just pass it the StringInfo where we'd like the bytes
> written.
>
> That of course would require rewriting all the output functions for
> all the built-in types, so not a small task.  Extensions make that job
> harder. I don't think it would be good to force extensions to rewrite
> their output functions, so perhaps some wrapper function could help us
> align the APIs for extensions that have not been converted yet.

I share Tom's concern that this approach looks too aggressive. That's
why I said:

"If a type's print function is not defined, we can still use the out
function."

AND

"Hard-coding the relationship between [commonly] used types and the
{type}print function OIDs doesn't look great; adding a new attribute to
pg_type looks too aggressive, however. Anyway, this is the next topic to
talk about."

What would be the extra benefit of redesigning all the out functions?
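
To make the two options concrete, here is a minimal standalone sketch of
the "print function if defined, else fall back to the out function" idea.
Everything in it is an assumption made for illustration: Buf is a
simplified stand-in for StringInfo, uintptr_t stands in for Datum, and
demo_oidout, demo_oidprint and emit_value are hypothetical names, not
existing PostgreSQL functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for StringInfo (assumption, not the real struct). */
typedef struct Buf
{
    char   *data;
    size_t  len;
    size_t  cap;
} Buf;

void
buf_init(Buf *b)
{
    b->cap = 64;
    b->len = 0;
    b->data = malloc(b->cap);
    assert(b->data != NULL);
    b->data[0] = '\0';
}

/* Make sure at least "extra" bytes plus a terminator fit after len. */
void
buf_reserve(Buf *b, size_t extra)
{
    size_t  need = b->len + extra + 1;

    if (need <= b->cap)
        return;
    while (b->cap < need)
        b->cap *= 2;
    b->data = realloc(b->data, b->cap);
    assert(b->data != NULL);
}

void
buf_append(Buf *b, const char *s, size_t n)
{
    buf_reserve(b, n);
    memcpy(b->data + b->len, s, n);
    b->len += n;
    b->data[b->len] = '\0';
}

/* "out" style: returns a freshly allocated C string. */
typedef char *(*out_fn) (uintptr_t datum);

/* "print" style: writes straight into the caller's buffer. */
typedef void (*print_fn) (uintptr_t datum, Buf *buf);

char *
demo_oidout(uintptr_t datum)
{
    char   *s = malloc(16);

    assert(s != NULL);
    snprintf(s, 16, "%u", (unsigned) datum);
    return s;
}

void
demo_oidprint(uintptr_t datum, Buf *buf)
{
    int     n;

    buf_reserve(buf, 16);
    n = snprintf(buf->data + buf->len, 16, "%u", (unsigned) datum);
    buf->len += (size_t) n;
}

/* Use the print function when one exists, else fall back to out. */
void
emit_value(uintptr_t datum, out_fn out, print_fn print, Buf *buf)
{
    char   *s;

    if (print != NULL)
    {
        print(datum, buf);      /* no temporary allocation, no strlen */
        return;
    }
    s = out(datum);             /* allocate, measure, copy, free */
    buf_append(buf, s, strlen(s));
    free(s);
}
```

In this sketch the fallback path pays for a malloc, a strlen and a memcpy
per value, while the print path writes the bytes once. Whether that
difference matters in printtup itself would need to be measured.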

> There's a similar problem with input functions not having knowledge of
> the input length. You only have to look at textin() to see how useful
> that could be. Fixing that would probably make COPY FROM horrendously
> faster. Team that up with SIMD for the delimiter char search and COPY
> go a bit better still. Neil Conway did propose the SIMD part in [1],
> but it's just not nearly as good as it could be when having to still
> perform the strlen() calls.

OK, I think I understand the need for the input functions to know the
input length, and it is good to know about the SIMD approach to the
delimiter char search. strlen looks like a delimiter char search (for
'\0') as well. I'm not sure whether strlen has been implemented with
SIMD, but if not, why not?
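
A small standalone sketch of the length problem, as I understand it: the
code that splits a line at the delimiter already knows each field's
length, but a cstring-style input function has to rediscover it with
strlen. All names here (Field, split_fields, demo_textin_cstring,
demo_textin_len) are hypothetical and only for illustration; this is not
the COPY code. The search uses memchr; I believe common libc
implementations vectorize memchr and strlen, but I have not checked every
platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A field located inside a line: start pointer plus known length. */
typedef struct Field
{
    const char *start;
    size_t      len;
} Field;

/* Split "line" at "delim"; returns the number of fields found. */
size_t
split_fields(const char *line, size_t line_len, char delim,
             Field *fields, size_t max_fields)
{
    const char *p = line;
    const char *end = line + line_len;
    size_t      n = 0;

    while (n < max_fields)
    {
        /* the delimiter search is the part that can be vectorized */
        const char *d = memchr(p, delim, (size_t) (end - p));

        fields[n].start = p;
        fields[n].len = (size_t) ((d != NULL ? d : end) - p);
        n++;
        if (d == NULL)
            break;
        p = d + 1;
    }
    return n;
}

/* cstring-style input: has to scan the string again to find its length. */
char *
demo_textin_cstring(const char *str)
{
    size_t  len = strlen(str);  /* second pass over the same bytes */
    char   *copy = malloc(len + 1);

    assert(copy != NULL);
    memcpy(copy, str, len + 1);
    return copy;
}

/* Length-aware input: reuses the length the splitter already computed. */
char *
demo_textin_len(const char *str, size_t len)
{
    char   *copy = malloc(len + 1);

    assert(copy != NULL);
    memcpy(copy, str, len);
    copy[len] = '\0';
    return copy;
}
```

With the length-aware variant the field does not even need to be
NUL-terminated in place, so the caller can hand over a pointer into the
line buffer directly.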

> I had planned to work on this for PG18, but I'd be happy for some
> assistance if you're willing.

I see you have done a lot of amazing work on cache-line-friendly data
structure design, branch prediction optimization and SIMD optimization.
I'd like to try this myself. I'm not sure I can meet the target, though.
What if we handle the out and in functions separately (possibly by
different people)?

-- 
Best Regards
Andy Fan



