Re: some namespace.c refactoring - Mailing list pgsql-hackers

From vignesh C
Subject Re: some namespace.c refactoring
Date
Msg-id CALDaNm0xD7+B4EsijJRuNUTk-=wFWOiJ2gycUtgKZQnutOfZCA@mail.gmail.com
Whole thread Raw
In response to Re: some namespace.c refactoring  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
On Thu, 23 Feb 2023 at 16:38, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 20.02.23 15:03, Peter Eisentraut wrote:
> > On 15.02.23 19:04, Alvaro Herrera wrote:
> >> That said, I think most of this code is invoked for DDL, where
> >> performance is not so critical; probably just fixing
> >> get_object_property_data to not be so naïve would suffice.
> >
> > Ok, I'll look into that.
>
> I did a variety of performance testing on this now.
>
> I wrote a C function that calls the "is visible" functions in a tight loop:
>
> Datum
> pg_test_visible(PG_FUNCTION_ARGS)
> {
>      int32 count = PG_GETARG_INT32(0);
>      Oid relid = PG_GETARG_OID(1);
>      Oid typid = PG_GETARG_OID(2);
>
>      for (int i = 0; i < count; i++)
>      {
>          RelationIsVisible(relid);
>          TypeIsVisible(typid);
>          //ObjectIsVisible(RelationRelationId, relid);
>          //ObjectIsVisible(TypeRelationId, typid);
>      }
>
>      PG_RETURN_VOID();
> }
>
> (It's calling two different ones to defeat the caching in
> get_object_property_data().)
>
> Here are some run times:
>
> unpatched:
>
> select pg_test_visible(100_000_000, 'pg_class', 'int4');
> Time: 4536.747 ms (00:04.537)
>
> select pg_test_visible(100_000_000, 'tenk1', 'widget');
> Time: 10828.802 ms (00:10.829)
>
> (Note that the "is visible" functions special case system catalogs.)
>
> patched:
>
> select pg_test_visible(100_000_000, 'pg_class', 'int4');
> Time: 11409.948 ms (00:11.410)
>
> select pg_test_visible(100_000_000, 'tenk1', 'widget');
> Time: 18649.496 ms (00:18.649)
>
> So, it's slower, but it's not clear whether it matters in practice,
> considering this test.
>
> I also wondered if this is visible through a normal external function
> call, so I tried
>
> do $$ begin perform pg_get_statisticsobjdef(28999) from
> generate_series(1, 1_000_000); end $$;
>
> (where that is the OID of the first object from select * from
> pg_statistic_ext; in the regression database).
>
> unpatched:
>
> Time: 6952.259 ms (00:06.952)
>
> patched (first patch only):
>
> Time: 6993.655 ms (00:06.994)
>
> patched (both patches):
>
> Time: 7114.290 ms (00:07.114)
>
> So there is some visible impact, but again, the test isn't realistic.
>
> Then I tried a few ways to make get_object_property_data() faster.  I
> tried building a class_id+index cache that is qsort'ed (once) and then
> bsearch'ed, that helped only minimally, not enough to make up the
> difference.  I also tried just searching the class_id+index cache
> linearly, hoping maybe that if the cache is smaller it would be more
> efficient to access, but that actually made things (minimally) worse,
> probably because of the indirection.  So it might be hard to get much
> more out of this.  I also thought about PerfectHash, but I didn't code
> that up yet.
>
> Another way would be to not use get_object_property_data() at all but
> write a "common" function that we pass in all it needs hardcodedly, like
>
> bool
> RelationIsVisible(Oid relid)
> {
>      return IsVisible_common(RELOID,
>                              Anum_pg_class_relname
>                              Anum_pg_class_relnamespace);
> }
>
> This would still save a lot of duplicate code.
>
> But again, I don't think the micro-performance really matters here.

I'm seeing that there has been no activity in this thread for almost a
year now, I'm planning to close this in the current commitfest unless
someone is planning to take it forward.

Regards,
Vignesh



pgsql-hackers by date:

Previous
From: vignesh C
Date:
Subject: Re: Using AF_UNIX sockets always for tests on Windows
Next
From: vignesh C
Date:
Subject: Re: Trivial revise for the check of parameterized partial paths