Re: Performance improvement for joins where outer side is unique - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Performance improvement for joins where outer side is unique
Date
Msg-id 54EE659F.8050004@2ndquadrant.com
Whole thread Raw
In response to Re: Performance improvement for joins where outer side is unique  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Performance improvement for joins where outer side is unique  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Performance improvement for joins where outer side is unique  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
Hi,

I tried to do an initdb with the patch applied, and seems there's a bug
somewhere in analyzejoins.c:

tomas@rimmer ~ $ pg_ctl -D tmp/pg-unidata init
The files belonging to this database system will be owned by user "tomas".
This user must also own the server process.

The database cluster will be initialized with locale "en_US".
The default database encoding has accordingly been set to "LATIN1".
The default text search configuration will be set to "english".

Data page checksums are disabled.

creating directory tmp/pg-unidata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... sysv
creating configuration files ... ok
creating template1 database in tmp/pg-unidata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... TRAP:
FailedAssertion("!(index_vars != ((List *) ((void *)0)))", File:
"analyzejoins.c", Line: 414)
sh: line 1:   339 Aborted
"/home/tomas/pg-unijoins/bin/postgres" --single -F -O -c
search_path=pg_catalog -c exit_on_error=true template1 > /dev/null
child process exited with exit code 134
initdb: removing data directory "tmp/pg-unidata"
pg_ctl: database system initialization failed


The problem seems to be the last command in setup_description() at
src/bin/initdb/initdb.c:1843, i.e. this query:
 WITH funcdescs AS (   SELECT p.oid as p_oid, oprname,   coalesce(obj_description(o.oid, 'pg_operator'),'') as opdesc
FROMpg_proc p JOIN pg_operator o ON oprcode = p.oid ) INSERT INTO pg_description     SELECT p_oid, 'pg_proc'::regclass,
0,      'implementation of ' || oprname || ' operator'     FROM funcdescs     WHERE opdesc NOT LIKE 'deprecated%' AND
 NOT EXISTS (SELECT 1 FROM pg_description       WHERE objoid = p_oid AND classoid = 'pg_proc'::regclass)
 
And particularly the join in the CTE, i.e. this fails
  SELECT * FROM pg_proc p JOIN pg_operator o ON oprcode = p.oid

I'm not quite sure why, but eclassjoin_is_unique_join() never actually
jumps into this part (line ~400):
   if (relvar != NULL && candidaterelvar != NULL)   {       ...       index_vars = lappend(index_vars,
candidaterelvar);      ...   }
 

so the index_vars is NIL. Not sure why, but I'm sure you'll spot the
issue right away.


BTW, I find this coding (first cast, then check) rather strange:
   Var *var = (Var *) ecm->em_expr;
   if (!IsA(var, Var))       continue; /* Ignore Consts */

It's probably harmless, but I find it confusing and I can't remember
seeing it elsewhere in the code (for example clausesel.c and such) use
this style:
   ... clause is (Node*) ...
   if (IsA(clause, Var))   {       Var *var = (Var*)clause;       ...   }

or
   Var * var = NULL;
   if (! IsA(clause, Var))       // error / continue
   var = (Var*)clause;


-- 
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: contrib/fuzzystrmatch/dmetaphone.c license
Next
From: Jim Nasby
Date:
Subject: Re: Partitioning WIP patch (was: Partitioning: issues/ideas)