Sergey Koposov <Sergey.Koposov@ed.ac.uk> writes:
> wsdb=> set cursor_tuple_fraction TO 1;
> SET
> wsdb=> EXPLAIN DECLARE c CURSOR FOR select phot_g_mean_mag,phot_bp_mean_mag,phot_rp_mean_mag,
>
g.source_id,pmra,pmdec,parallax,parallax_error,d.ra,d.dec,d.ebv,d.gmeanpsfmag,rmeanpsfmag,imeanpsfmag,gkronmag,rkronmag
from disk_sample1 as d left join
> gaia_edr3_aux.panstarrs1bestneighbour as gaps1 on (gaps1.original_ext_source_id=d.objid) left join
gaia_edr3.gaia_sourceas g on ( g.source_id = gaps1.source_id
> ) ;
> QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------
> Hash Right Join (cost=112529534.06..422118114.74 rows=65470868 width=100)
> Hash Cond: (g.source_id = gaps1.source_id)
> -> Seq Scan on gaia_source g (cost=0.00..124645680.12 rows=1811786112 width=48)
> -> Hash (cost=111007847.21..111007847.21 rows=65470868 width=60)
> -> Hash Right Join (cost=3231089.53..111007847.21 rows=65470868 width=60)
> Hash Cond: (gaps1.original_ext_source_id = d.objid)
> -> Seq Scan on panstarrs1bestneighbour gaps1 (cost=0.00..17443258.96 rows=951450496 width=16)
> -> Hash (cost=1709402.68..1709402.68 rows=65470868 width=60)
> -> Seq Scan on disk_sample1 d (cost=0.00..1709402.68 rows=65470868 width=60)
Oh, *that* is interesting: now we have two hash joins, and neither one
has panstarrs1bestneighbour as the table to hash, so it's not too
clear which one is going crazy. But you showed the stats for
disk_sample1.objid, and that looked pretty well distributed, so I'm
going to guess that that hash is fine. That leaves the other join
on panstarrs1bestneighbour.source_id as the one under suspicion.
Can we see the stats for that column?
regards, tom lane