Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers
From | Thomas Munro |
---|---|
Subject | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables |
Date | |
Msg-id | CAEepm=0JfKkAS6Ea8HgsoPJUUnL4++V0q7mPucFEiqn7cPmO0A@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables (Thomas Munro <thomas.munro@enterprisedb.com>) |
Responses |
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
|
List | pgsql-hackers |
On Sat, Sep 16, 2017 at 2:41 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote: > On Sat, Sep 16, 2017 at 9:38 AM, Thomas Munro > <thomas.munro@enterprisedb.com> wrote: >> On Sat, Sep 16, 2017 at 9:23 AM, Robert Haas <robertmhaas@gmail.com> wrote: >>> On the overall patch set: >>> >>> - I am curious to know how this has been tested. How much of the new >>> code is covered by the tests in 0007-Partition-wise-join-tests.patch? >>> How much does coverage improve with >>> 0008-Extra-testcases-for-partition-wise-join-NOT-FOR-COMM.patch? What >>> code, if any, is not covered by either of those test suites? Could we >>> do meaningful testing of this with something like Andreas >>> Seltenreich's sqlsmith? >> >> FWIW I'm working on an answer to both of those question, but keep >> getting distracted by other things catching on fire... > > I cobbled together some scripts to figure out the test coverage of > lines actually modified by this patch set. Please see attached. > > I'm not sure if there is an established or better way to do this, but > I used git-blame to figure out which lines of gcov output can be > blamed on Ashutosh and prepended that to the lines of gcov's output. > That allowed me to find new/changed code not covered by "make check". > I found 94 untested new lines with 0007 applied and 88 untested new > lines with 0008 applied. The 6 lines that 0008 reaches and 0007 > doesn't are: > > ======== src/backend/optimizer/path/allpaths.c ======== > -[TOUCHED BY PATCH SET] #####: 3303: mark_dummy_rel(rel); > -[TOUCHED BY PATCH SET] #####: 3304: return; > -[TOUCHED BY PATCH SET] #####: 1515: continue; > -[TOUCHED BY PATCH SET] #####: 1526: continue; > ======== src/backend/optimizer/util/pathnode.c ======== > -[TOUCHED BY PATCH SET] #####: 3433: break; > -[TOUCHED BY PATCH SET] #####: 3435: return NULL; Two obvious questions: 1. What are we missing in the ~90 lines of non-covered code, and are there bugs lurking there? First, here's an easier to read report than the one I posted earlier. It's based on the whole patch stack (including the extra tests) from your v33 tarball: https://codecov.io/gh/postgresql-cfbot/postgresql/commit/19dace6fca0d9c2bca5022158cf28d99aa237550 The main areas of uncovered lines are: code in get_wholerow_ref_from_convert_row_type() and code that calls it, and per node type cases in reparameterize_path_by_child(). It seems like the former could use a test case, and I wonder if there is some way we could write "flat-copy and then apply recursively to all subpaths" code like this without having to handle these cases explicitly. There are a couple of other tiny return cases other than just sanity check errors which it might be interesting to hit too. 2. What queries in the 0008 patch are hitting lines that 0007 doesn't hit? I thought about how to answer questions like this and came up with a shell script that (1) makes computers run really hot for quite a long time and (2) tells you which blocks of SQL hit which lines of C. Please find attached the shell script and its output. The .sql files have been annotated with "block" numbers (blocks being chunks of SQL stuff separated by blank lines), and the C files annotated with references to those block numbers where A<n> = block <n> partition_join.sql and B<n> = block <n> in partition_join_extras.sql. Then to find lines that B queries hit but A queries don't and know which particular queries hit them, you might use something like: grep -v 'SQL blocks: .*A[0-9]' < joinpath.c.aggregated_coverage | \ grep 'SQL blocks: .*B[0-9]' (Off topic but by way of explanation: the attachment name ending .tarball.gz avoids .tgz or .tar.gz so my stupid cfbot doesn't think it's a new patch set. I need to figure something better out for that...) -- Thomas Munro http://www.enterprisedb.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: