Thread: Function structure in formatting.c
Hi hackers, I'm currently poking around in backend/utils/adt/formatting.c with a view to improving to_date() parsing (see thread at http://archives.postgresql.org/pgsql-hackers/2007-07/msg00513.php), and I've noticed that the way the functions are organised is pretty weird. The original author appears to have gone to great lengths to make the various functions work both for conversions *to* string, and *from* string. For each formatting "keyword" (DD, MM, etc), there is just one processing function; dch_global, dch_date or dch_time. Each of these takes an argument called "is_to_char". Since parsing a date out of a string, and formatting a date into a string, are fundamentally different objectives the functions end up reading a lot like this: if (is_to_char) { // do something } else { // do something completely different } In fact, almost all of the actual formatting code in the file is enclosed in one of these if .. else blocks. To my mind, it would make a lot more sense (and make hacking the file a lot easier) if the processing functions were split into to_char and from_char variants. I'm not sure what, if any, advantage is gleaned by having these functions combined. I'd like to hear from someone who has more familiarity with formatting.c on this. Is there some good reason for keeping the functions unified? Obviously there's a fair bit of work in splitting the functions up, but I'd be willing to do it if only to spare my own sanity when working on to_date parsing.
"Brendan Jurd" <direvus@gmail.com> writes: > To my mind, it would make a lot more sense (and make hacking the file > a lot easier) if the processing functions were split into to_char and > from_char variants. I'm not sure what, if any, advantage is gleaned > by having these functions combined. Yeah, I never liked that approach either. I suppose the idea was to keep the to- and from- code for each format code close together, but it blurs what's going on to a much greater extent than that's worth. regards, tom lane
On 8/9/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > "Brendan Jurd" <direvus@gmail.com> writes: > > To my mind, it would make a lot more sense (and make hacking the file > > a lot easier) if the processing functions were split into to_char and > > from_char variants. I'm not sure what, if any, advantage is gleaned > > by having these functions combined. > > Yeah, I never liked that approach either. I suppose the idea was to > keep the to- and from- code for each format code close together, but > it blurs what's going on to a much greater extent than that's worth. Okay, I'll see what I can do. Incidentally, anybody know what DCH is supposed to stand for? Throughout the code DCH is used to refer to date/time formatting and NUM to numeric formatting. Just curious.
Just quick update on this. It turns out (as it always does) that I want to refactor a bit more intensively than I first suggested. The functions dch_global, dch_time and dch_date seem to be wholly pointless, since dch_global is effectively a one-liner for the FX flag, and the other two merely wrap around a giant switch statement. My thought was to split DCH_processor into a to_char flavour and a from_char flavour, and have those functions do all the work; loop through each of the FormatNodes and run the giant switch statement. This means there is no need to put pointers to "action functions" in each of the KeyWords. It makes the code simpler, more readable -- and considerably shorter. A patch will be on the way shortly.
On 8/9/07, Brendan Jurd <direvus@gmail.com> wrote: > Just quick update on this. It turns out (as it always does) that I > want to refactor a bit more intensively than I first suggested. > [...] > This means there is no need to put pointers to "action functions" in > each of the KeyWords. It makes the code simpler, more readable -- and > considerably shorter. > > A patch will be on the way shortly. > take your time, this seems like it will be for 8.4 anyway -- regards, Jaime Casanova "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs and the universe trying to produce bigger and better idiots. So far, the universe is winning." Richard Cook
On 8/9/07, Jaime Casanova <systemguards@gmail.com> wrote: > > take your time, this seems like it will be for 8.4 anyway > I hear you, unfortunately "taking my time" usually means I forget about it for eight months and by the time I come back to it I've forgotten what I was doing =) I wasn't really expecting this to make it into 8.3. I just need to get it done so I can free up the headspace for other projects.