Re: Text <-> C string - Mailing list pgsql-hackers
From | Brendan Jurd |
---|---|
Subject | Re: Text <-> C string |
Date | |
Msg-id | 37ed240d0803251033h60fe993bp694dd4ff657f442a@mail.gmail.com Whole thread Raw |
In response to | Re: Text <-> C string (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Text <-> C string
|
List | pgsql-hackers |
On 26/03/2008, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Brendan's patch also included "cstring_text_limit(const char *s, int len)" > which was defined as copying Min(len, strlen(s)) bytes. I didn't find > this to be particularly useful. In the first place, all potential > callers are passing the exact desired length, so the strlen() call is > just a waste of cycles. In the second place, at least some callers pass > text that is not embedded in a known-to-be-null-terminated string (it > could be a section of a text datum instead); which means there is a > nonzero chance of the strlen running off the end of memory and dumping > core. So I propose instead > > extern text *cstring_to_text_with_len(const char *s, int len); > That all makes sense to me. I think the new name is good. It's pretty long, but I'm not seeing a shorter name that accurately describes the function. > which just takes the given length as gospel. Brendan had also proposed > "text_to_cstring_limit(const text *t, int len)" with similar Min() > semantics, but what this was doing was replacing copies into > limited-size local buffers with a palloc. If we did that we might > as well just use text_to_cstring. What I think is more useful is > a strlcpy()-like function that copies into a caller-supplied buffer > of limited size. For lack of a better idea I propose defining it > *exactly* like strlcpy: > > extern size_t textlcpy(char *dst, const text *src, size_t siz); > I'm all for providing a function with this behaviour, but is textlcpy() a bit ambiguous? It's not clear from the name whether the function copies text -> text, text -> cstring or cstring -> text. In fact, if I didn't already know better I'd probably assume that the function copied text -> text with length, in the same way strlcpy copies string -> string. A text_to_cstring_with_len() or text_to_cstring_limit() might be more to the point, and more consistent with the other functions in the family. On the other hand, maybe some difference in naming would help make it obvious to callers that, unlike its siblings, textlcpy() takes the destination string as an argument rather than returning it. text_to_cstring_lcpy()? > I've also found that there are lots and lots of places where the > text end of the conversion needs to be a Datum not a text *, > so it seems worthwhile to introduce a couple of macros to minimize > notation in that case: > > #define CStringGetTextDatum(s) PointerGetDatum(cstring_to_text(s)) > #define TextDatumGetCString(d) text_to_cstring((text *) DatumGetPointer(d)) > Yes, I recall coming across a number of sites where these macros would come in handy. > Lastly, the originally submitted text-to-something functions would > work correctly on plain and 1-byte-header datums, but not on > compressed or toasted-out-of-line datums. There are a whole lot of > places where that's not good enough. Rather than expecting the caller > to use the right detoasting macro everywhere, it seems best to make > these functions cope with any variant. That also avoids memory > leakage by allowing the intermediate copy to be pfree'd. (I had > suggested that the pfree might be pointless, but I reconsidered --- > if the text object is large enough to be compressed or toasted, > we're talking about at least several K, so it's worth not leaking.) > Excellent. My patch didn't contemplate dealing with compressed/toasted datums because, quite frankly, I didn't know *how* to deal with them correctly. Much to learn about varlenas, I still have. Cheers, BJ
pgsql-hackers by date: