Re: WIP: Make timestamptz_out less slow. - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: WIP: Make timestamptz_out less slow.
Date
Msg-id CAM3SWZRCHQ7xNZe3oehahSDGKVtgTn+vD9xE4MkLMmasPihaCQ@mail.gmail.com
Whole thread Raw
In response to Re: WIP: Make timestamptz_out less slow.  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: WIP: Make timestamptz_out less slow.  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Fri, Sep 11, 2015 at 9:00 AM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
> Attached is also timestamp_delta.patch which changes pg_ltostr() to use the
> INT_MIN special case that pg_ltoa also uses. I didn't make a similar change
> to pg_ltostr_zeropad() as I'd need to add even more special code to prepend
> the correct number of zero before the 2147483648. This zero padding reason
> was why I originally came up with the alternative way to handle the negative
> numbers. I had just thought that it was best to keep both functions as
> similar as possible.

I've started looking at this -- "timestamp_out_speedup_2015-09-12.patch".

> I've not done anything yet to remove the #ifdef HAVE_INT64_TIMESTAMP from
> AppendSeconds(). The only way I can think to handle this is to just make
> fsec_t unconditionally an int64 (since with float datetimes the precision is
> 10 decimal digits after the dec point, this does not fit in int32). I'd go
> off and do this, but this means I need to write an int64 version of
> pg_ltostr(). Should I do it this way?

Have you thought about *just* having an int64 pg_ltostr()? Are you
aware of an appreciable overhead from having int32 callers just rely
on a promotion to int64?

In general, years are 1900-based in Postgres:

struct pg_tm
{...   int         tm_mday;   int         tm_mon;         /* origin 0, not 1 */   int         tm_year;        /*
relativeto 1900 */...
 
};

While it's not your patch's fault, it is not noted by EncodeDateTime()
and EncodeDateOnly() and others that there pg_tm structs are not
1900-based. This is in contrast to multiple functions within
formatting.c, nabstime.c, and timestamp.c (some of which are clients
or clients of clients for this new code). I think that the rule has
been undermined so much that it doesn't make sense to indicate
exceptions directly, though. I think pg_tm should have comments saying
that there are many cases where tm_year is not relative to 1900.

I have a few minor nitpicks about the patch.

No need for "negative number" comment here -- just use
"Assert(precision >= 0)" code:

+ * Note 'precision' must not be a negative number.
+ * Note callers should assume cp will not be NUL terminated. */
-static void
+static char *AppendSeconds(char *cp, int sec, fsec_t fsec, int precision, bool fillzeros){
+#ifdef HAVE_INT64_TIMESTAMP
+

I would just use a period/full stop here:

+ * Note str is not NUL terminated, callers are responsible for NUL terminating
+ * str themselves.

Don't think you need so many "Note:" specifiers here:

+ * Note: Callers should ensure that 'padding' is above zero.
+ * Note: This function is optimized for the case where the number is not too
+ *      big to fit inside of the specified padding.
+ * Note: Caller must ensure that 'str' points to enough memory to hold the
+        result (at least 12 bytes, counting a leading sign and trailing NUL,
+        or padding + 1 bytes, whichever is larger).
+ */
+char *
+pg_ltostr_zeropad(char *str, int32 value, int32 padding)

Not so sure about this, within the new pg_ltostr_zeropad() function:

+   /*
+    * Handle negative numbers in a special way. We can't just append a '-'
+    * prefix and reverse the sign as on two's complement machines negative
+    * numbers can be 1 further from 0 than positive numbers, we do it this way
+    * so we properly handle the smallest possible value.
+    */
+   if (num < 0)
+   {     ...
+   }
+   else
+   {     ...
+   }

Maybe it's better to just special case INT_MIN and the do an Abs()?
Actually, would any likely caller of pg_ltostr_zeropad() actually care
if INT_MIN was a case that you cannot handle, and assert against? You
could perhaps reasonably make it the caller's problem. Haven't made up
my mind if your timestamp_delta.patch is better than that; cannot
immediately see a problem with putting it on the caller.

More generally, maybe routines like EncodeDateTime() ought now to use
the Abs() macro.

Those are all of my nitpicks. :-)

> I also did some tests on server grade hardware, and the performance increase
> is looking a bit better.
>
> create table ts (ts timestamp not null);
> insert into ts select generate_series('2010-01-01 00:00:00', '2011-01-01
> 00:00:00', '1 sec'::interval);
> vacuum freeze ts;

On my laptop, I saw a 2.4X - 2.6X speedup for this case. That seems
pretty nice to me.

Will make another pass over this soon.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: Request: pg_cancel_backend variant that handles 'idle in transaction' sessions
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Foreign join pushdown vs EvalPlanQual