Hi all,
We recently ran into an issue in pg_dump that caused the initial
sort-by-name pass to return incorrect results. It doesn't seem to
affect overall correctness, since the later toposort pass takes care
of dependencies, but it does occasionally cause a spurious diff in
dump output before and after a pg_upgrade run.
The key appears to be in this comment, in pg_dump_sort.c:
/*
 * Sort by namespace. Note that all objects of the same type should
 * either have or not have a namespace link, so we needn't be fancy about
 * cases where one link is null and the other not.
 */
This doesn't appear to be correct anymore. From scanning the code, it
looks like the DO_DEFAULT_ACL type can optionally have a NULL
namespace. Even if the comment held for every type, we can still reach
this part of the code with objects of different types, as long as they
share the same sort priority (see DO_COLLATION and DO_TRANSFORM). We only ran into
this because of a bug in Greenplum that caused two types to share a
sort priority where they previously did not.
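To make the failure mode concrete, here is a minimal sketch of how a comparator can become inconsistent. It assumes the comparison simply skips the namespace step whenever either link is NULL and falls through to the name comparison. SketchObject and the sketch_* functions are hypothetical stand-ins for illustration, not the real pg_dump structures or code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dumpable object; nspname == NULL means
 * "no namespace link". */
typedef struct
{
	const char *nspname;
	const char *name;
} SketchObject;

/* Assumed behavior: compare namespaces only when both are present,
 * otherwise fall straight through to the name comparison. */
static int
sketch_compare_skip_null(const SketchObject *a, const SketchObject *b)
{
	if (a->nspname && b->nspname)
	{
		int			cmp = strcmp(a->nspname, b->nspname);

		if (cmp != 0)
			return cmp;
	}
	return strcmp(a->name, b->name);
}

/* Returns nonzero if x < y, y < z, and z < x all hold at once, which
 * no valid ordering allows. */
static int
sketch_has_cycle(const SketchObject *x, const SketchObject *y,
				 const SketchObject *z)
{
	return sketch_compare_skip_null(x, y) < 0 &&
		sketch_compare_skip_null(y, z) < 0 &&
		sketch_compare_skip_null(z, x) < 0;
}
```

With objects (alpha, "z"), (beta, "a"), and (NULL, "m"), the first sorts before the second by namespace, the second before the third by name, and the third before the first by name. Given a cycle like that, the result of qsort can depend on the input order.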
A quick and dirty patch is attached, which simply defines an ordering
between NULL and non-NULL namespaces so that quicksort behaves
rationally.
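For readers without the attachment handy, the idea can be sketched roughly as below. This is not the patch itself: DemoNamespace, DemoObject, and the demo_* functions are hypothetical, simplified stand-ins, and sorting NULL namespaces first is an arbitrary choice made for the illustration. Any fixed ordering between NULL and non-NULL would do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for pg_dump's structures. */
typedef struct
{
	const char *nspname;
} DemoNamespace;

typedef struct
{
	int			priority;		/* sort priority; may be shared by types */
	const DemoNamespace *nsp;	/* may be NULL */
	const char *name;
} DemoObject;

static int
demo_name_compare(const void *p1, const void *p2)
{
	const DemoObject *a = (const DemoObject *) p1;
	const DemoObject *b = (const DemoObject *) p2;
	int			cmp;

	if (a->priority != b->priority)
		return (a->priority < b->priority) ? -1 : 1;

	if (a->nsp && b->nsp)
	{
		cmp = strcmp(a->nsp->nspname, b->nsp->nspname);
		if (cmp != 0)
			return cmp;
	}
	else if (a->nsp)
		return 1;				/* b has no namespace, so b sorts first */
	else if (b->nsp)
		return -1;				/* a has no namespace, so a sorts first */

	/* Same namespace, or both NULL: fall back to the object name. */
	return strcmp(a->name, b->name);
}

/* Sort the array, then check that adjacent elements are in order. */
static void
demo_sort(DemoObject *objs, size_t n)
{
	size_t		i;

	qsort(objs, n, sizeof(DemoObject), demo_name_compare);
	for (i = 1; i < n; i++)
		assert(demo_name_compare(&objs[i - 1], &objs[i]) <= 0);
}
```

Because every pair of objects now has a defined relative order at the namespace step, the comparator is transitive and the sorted output no longer depends on the order in which the objects arrive.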
WDYT?
--Jacob