Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? - Mailing list pgsql-hackers
| From | John Naylor |
|---|---|
| Subject | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
| Date | |
| Msg-id | CANWCAZZ2kpm7BN93NkN7b2WRSAQcdBcdbjCOAzd6r9AS0nNHbg@mail.gmail.com Whole thread Raw |
| In response to | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? (Lukas Fittl <lukas@fittl.com>) |
| List | pgsql-hackers |
On Mon, Mar 23, 2026 at 1:13 AM Lukas Fittl <lukas@fittl.com> wrote:
>
> Hi John,
>
> On Sun, Mar 15, 2026 at 5:49 PM John Naylor <johncnaylorls@gmail.com> wrote:
> >
> > On Tue, Mar 10, 2026 at 5:04 PM Lukas Fittl <lukas@fittl.com> wrote:
> >
> > > > Elsewhere in the patch series there are comments that refer to EBX etc
> > > > while the code looks like exx[1]. It hasn't been a problem up to now
> > > > since there were only a couple places that used cpuid. But this
> > > > patchset adds quite a few more, so now seems like a good time to make
> > > > it more readable with register name symbols.
> > >
> > > I'm not a fan of using macros for this, but what if we define
> > > ourselves a struct like this:
> > >
> > > typedef struct CPUIDResult
> > > {
> > > unsigned int eax;
> > > unsigned int ebx;
> > > unsigned int ecx;
> > > unsigned int edx;
> > > } CPUIDResult;
> > >
> > > Then the code reads like:
> > >
> > > pg_cpuid(0x80000001, &r);
> > > X86Features[PG_RDTSCP] = r.edx >> 27 & 1;
> >
> > One objection is that __cpuid() takes an array of 4 integers as an
> > argument. I think it would technically happen to work to pass a
> > pointer to this struct, but it seems the wrong thing to do. If you're
> > not a fan of macros, the other way would be an enum of indices (and
> > named with all caps).
>
> How about we deal with this in the pg_cpuid function by having
> __cpuid() write into a separate array and then assign that to
> CPUIDResult? See the attached 0001 patch.
>
> I think the struct fields are the clearest approach here, but if you
> disagree let me know and I can adjust further.
I still don't see the value of the struct, since it doesn't get rid of
the array, which we still need for some APIs, so it's just adding
indirection:
+static inline void
+pg_cpuid(int leaf, CPUIDResult *r)
+{
+#if defined(HAVE__GET_CPUID)
+ __get_cpuid(leaf, &r->eax, &r->ebx, &r->ecx, &r->edx);
+#elif defined(HAVE__CPUID)
+ int exx[4] = {0};
+
+ __cpuid(exx, leaf);
+ r->eax = exx[0];
+ r->ebx = exx[1];
+ r->ecx = exx[2];
+ r->edx = exx[3];
+static inline bool
+pg_cpuid_subleaf(int leaf, int subleaf, CPUIDResult *r)
+{
+#if defined(HAVE__GET_CPUID_COUNT)
+ return __get_cpuid_count(leaf, subleaf, &r->eax, &r->ebx, &r->ecx,
&r->edx) == 1;
+#elif defined(HAVE__CPUIDEX)
+ int exx[4] = {0};
+
+ __cpuidex(exx, leaf, subleaf);
+ r->eax = exx[0];
+ r->ebx = exx[1];
+ r->ecx = exx[2];
+ r->edx = exx[3];
+ return true;
It's not that bad that the hard-coded indexes are in two places, but
it's also not necessary. I think "#define EAX 0 ...etc" is a fine,
straightforward increase in readability compared to what we have now,
and I don't see any downside. I suppose one argument in favor of the
struct is that it avoids declaring variables of type
array-of-4-unsigned in multiple places, but I think the array is fine,
and adding a new typedef is additional cognitive friction.
Speaking of signedness, why is the array of ints sometimes signed and
sometimes unsigned?
> > This needs a comment to explain the return value.
>
> Good point, added a comment.
+ * Returns false if the CPUID leaf/subleaf is not supported.
Okay, thanks. I'd suggest using "true if X is supported" or "true if X
is supported, false otherwise" phrasing.
--
John Naylor
Amazon Web Services
pgsql-hackers by date: