Re: POC: converting Lists into arrays - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: POC: converting Lists into arrays
Date	March 4, 2019 22:01:33
Msg-id	20190304190133.vtv7vifuhkaqwh67@alap3.anarazel.de
In response to	Re: POC: converting Lists into arrays (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

Hi,

On 2019-03-02 18:11:43 -0500, Tom Lane wrote:
> On test cases like "pg_bench -S" it seems to be pretty much within the
> noise level of being the same speed as HEAD.

I think that might be because its bottleneck is just elsewhere
(e.g. very context switch heavy, very few lists of any length).

FWIW, even just taking context switches out of the equation leads to
a ~5-6% benefit in a simple statement:

DO $f$BEGIN FOR i IN 1..500000 LOOP EXECUTE $s$SELECT aid, bid, abalance, filler FROM pgbench_accounts WHERE aid = 2045530;$s$; END LOOP; END;$f$;
 

master:
+    6.05%  postgres  postgres            [.] AllocSetAlloc
+    5.52%  postgres  postgres            [.] base_yyparse
+    2.51%  postgres  postgres            [.] palloc
+    1.82%  postgres  postgres            [.] hash_search_with_hash_value
+    1.61%  postgres  postgres            [.] core_yylex
+    1.57%  postgres  postgres            [.] SearchCatCache1
+    1.43%  postgres  postgres            [.] expression_tree_walker.part.4
+    1.09%  postgres  postgres            [.] check_stack_depth
+    1.08%  postgres  postgres            [.] MemoryContextAllocZeroAligned

patch v3:
+    5.77%  postgres  postgres            [.] base_yyparse
+    4.88%  postgres  postgres            [.] AllocSetAlloc
+    1.95%  postgres  postgres            [.] hash_search_with_hash_value
+    1.89%  postgres  postgres            [.] core_yylex
+    1.64%  postgres  postgres            [.] SearchCatCache1
+    1.46%  postgres  postgres            [.] expression_tree_walker.part.0
+    1.45%  postgres  postgres            [.] palloc
+    1.18%  postgres  postgres            [.] check_stack_depth
+    1.13%  postgres  postgres            [.] MemoryContextAllocZeroAligned
+    1.04%  postgres  libc-2.28.so        [.] _int_malloc
+    1.01%  postgres  postgres            [.] nocachegetattr

And even just pgbenching the EXECUTEd statement above gives me a
reproducible ~3.5% gain when using -M simple, and ~3% when using -M
prepared.

Note that when not using prepared statements (a pretty important
workload, especially as long as we don't have a pooling solution that
actually allows using prepared statements across connections), even after
the patch most of the allocator overhead is still from list allocations,
but it's nearly exclusively just the "create a new list" case:

+    5.77%  postgres  postgres            [.] base_yyparse
-    4.88%  postgres  postgres            [.] AllocSetAlloc
   - 80.67% AllocSetAlloc
      - 68.85% AllocSetAlloc
         - 57.65% palloc
            - 50.30% new_list (inlined)
               - 37.34% lappend
                  + 12.66% pull_var_clause_walker
                  + 8.83% build_index_tlist (inlined)
                  + 8.80% make_pathtarget_from_tlist
                  + 8.73% get_quals_from_indexclauses (inlined)
                  + 8.73% distribute_restrictinfo_to_rels
                  + 8.68% RewriteQuery
                  + 8.56% transformTargetList
                  + 8.46% make_rel_from_joinlist
                  + 4.36% pg_plan_queries
                  + 4.30% add_rte_to_flat_rtable (inlined)
                  + 4.29% build_index_paths
                  + 4.23% match_clause_to_index (inlined)
                  + 4.22% expression_tree_mutator
                  + 4.14% transformFromClause
                  + 1.02% get_index_paths
               + 17.35% list_make1_impl
               + 16.56% list_make1_impl (inlined)
               + 15.87% lcons
               + 11.31% list_copy (inlined)
               + 1.58% lappend_oid
            + 12.90% expression_tree_mutator
            + 9.73% get_relation_info
            + 4.71% bms_copy (inlined)
            + 2.44% downcase_identifier
            + 2.43% heap_tuple_untoast_attr
            + 2.37% add_rte_to_flat_rtable (inlined)
            + 1.69% btbeginscan
            + 1.65% CreateTemplateTupleDesc
            + 1.61% core_yyalloc (inlined)
            + 1.59% heap_copytuple
            + 1.54% text_to_cstring (inlined)
            + 0.84% ExprEvalPushStep (inlined)
            + 0.84% ExecInitRangeTable
            + 0.84% scanner_init
            + 0.83% ExecInitRangeTable
            + 0.81% CreateQueryDesc
            + 0.81% _bt_search
            + 0.77% ExecIndexBuildScanKeys
            + 0.66% RelationGetIndexScan
            + 0.65% make_pathtarget_from_tlist


Given how hard it is to improve performance when costs are as flatly
distributed as in the above profiles, I actually think these are quite
promising results.

I'm not even convinced that it makes all that much sense to measure
end-to-end performance here. It might be worthwhile to measure with a
debugging function that allows exercising parsing, parse-analysis,
rewrite etc. at configurable loop counts. Given the relatively evenly
distributed profiles, we're going to have to make a few different
improvements to make headway, and it's hard to see the benefits of
individual ones if you look at the overall numbers.

Greetings,

Andres Freund
