Thread: Query Hints? No thanks. Data hints?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi -hackers, In another thread about "GUC parameter cursors_tuple_fraction", the debate seems to drift onto query hints. About which the consensus here is pretty clear and strong, no query hints in PostgreSQL, thanks, or we're never gonna have a perfect generic planner. IIRC, I've read here in the past some attempts to begin a proposal on the topic of data hints, allowing the user to describe his data in a way ANALYZE can't figure out by itself, as e.g. "this column value is tied to this other column value in this way". This could be a materialized column, mutual-exclusive NOT NULLs, or any multi-columns relationships, as well as "this table is a fact table", etc. What do you -hackers think about such a plan: - assess cases where the planner is failing short of good statistics - assess data properties SQL does not give us but would be of interrest to internals, and at the same time not so difficult to know about by DBAs - based on this, prepare a descriptive language of some sort tying this all in - implement it in a good way ;) I'm thinking we could have a new set of commands to tell PostgreSQL some "high-level" facts about the data, e.g. "there's a injective function such as f(t.colA) = t.colB" or any useful thing to be found in the firsts proposed step. Is there a chance we're gonna improve the planner this way? And answer Simon's (and many others here and there, -performance etc) concerns? HTH, regards, - -- dim -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Darwin) iEYEARECAAYFAkgeEjoACgkQlBXRlnbh1bnlHwCfcHL5uOlCpptekwLBMp+E9kUn 4roAoMfwdITByHtxCi35l9jDCTSFw2Ho =whVn -----END PGP SIGNATURE-----
On Sun, May 04, 2008 at 09:44:58PM +0200, Dimitri Fontaine wrote: > IIRC, I've read here in the past some attempts to begin a proposal on > the topic of data hints, allowing the user to describe his data in a > way ANALYZE can't figure out by itself, as e.g. "this column value is > tied to this other column value in this way". There has been talk about this in the past, things like being able to store cross-table correlations which isn't know at all. The problem is two-fold: - How do you calculate the stats? - How do you use them? Cross-table correlations are easy for the second part, because it's fairly simple to see where it could be used. However, no-one has come up with an algorithm to produce a useful number to use. For others it's harder. > I'm thinking we could have a new set of commands to tell PostgreSQL > some "high-level" facts about the data, e.g. "there's a injective > function such as f(t.colA) = t.colB" or any useful thing to be found > in the firsts proposed step. In general postgres could use many bits of information not currently available. For example: A=B implies lower(A)=lower(B), hence an index on lower(A) could be used to optimise comparisons against A. Certain operations preserve order, which may also be useful. You can always propose something, see where it goes. Have a nice day, -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Am Dienstag, 6. Mai 2008 schrieb Martijn van Oosterhout: > Cross-table correlations are easy for the second part, because it's > fairly simple to see where it could be used. However, no-one has come > up with an algorithm to produce a useful number to use. For others it's > harder. For an algorithm, principal components analysis would be my guess. It is designed to answer the question "this column value is tied to this other column value in this way" [quote Simon], at least for the sort of data that a B-tree would cover. For nonlinear data, it is of course harder. > In general postgres could use many bits of information not currently > available. For example: A=B implies lower(A)=lower(B), hence an index > on lower(A) could be used to optimise comparisons against A. Certain > operations preserve order, which may also be useful. Locale horrors looming ... ;-)